GitHub topics: datapipeline
tenzir/library
The Tenzir Community Library.
Size: 636 KB - Last synced at: 2 days ago - Pushed at: 3 days ago - Stars: 5 - Forks: 2

zhaoyachao/zdh_web
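Big data collection and extraction platform. zdh_web is the visual management platform for the zdh family of services, with modules for data collection, scheduling, permissions, approval workflows, private-domain marketing, and more.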
Language: Java - Size: 141 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 521 - Forks: 182

kkumyk/server-logs-daily-data-pipeline
A data engineering project with dbt, Docker, Kestra, Terraform, GCP and Looker.
Language: HCL - Size: 923 KB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 6 - Forks: 0

FiredKreeper/AdventureWorks-SplitPackage
This repository contains an SSIS package that splits employee data from the AdventureWorksDW2017 database into country-specific tables (United States, United Kingdom, Germany, and others). It demonstrates ETL processes using tools like Merge Join, Conditional Split, and OLE DB Destination for efficient data integration.
Size: 1000 Bytes - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 0 - Forks: 0

bcgov/nr-rfc-dischargeobs
data pipeline code to download / process hydrological observation data
Language: Python - Size: 118 MB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 0 - Forks: 0

Venkatanarayana-batthi/RetailSales_ETL_Fabric
ETL project using Microsoft Fabric with Data Pipelines, Notebooks, Delta Lake, and Lakehouse integration.
Language: Jupyter Notebook - Size: 671 KB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 0 - Forks: 0

jotstolu/Flight-Booking-Analytics-Data-Engineering-Project-using-Databricks-and-DBT
This project demonstrates an end-to-end data pipeline built using Databricks, Delta Live Tables, and DBT (Data Build Tool) to process, transform, and model flight booking data for advanced analytics and business intelligence.
Size: 901 KB - Last synced at: 20 days ago - Pushed at: 20 days ago - Stars: 0 - Forks: 0

jotstolu/Azure-Data-Engineering-End--to-End-Project
An end-to-end Netflix data engineering pipeline built on Microsoft Azure. This project ingests raw Netflix data, applies PySpark transformations, enforces data quality with Delta Live Tables, and orchestrates workflows via Azure Data Factory and Databricks.
Size: 5.49 MB - Last synced at: 20 days ago - Pushed at: 20 days ago - Stars: 0 - Forks: 0

jotstolu/Azure-Data-Engineering-End-to-End-Project---NYC-taxi-dataset
An end‑to‑end data engineering pipeline for NYC Green Taxi trip records, built on Microsoft Azure. This project ingests Jan–Dec 2024 Parquet files from the NYC Taxi API into a Bronze Delta Lake layer, cleans and enriches the data in a Silver layer with PySpark on Azure Databricks, then saves the transformed data to the Gold layer in Delta format.
Size: 1.69 MB - Last synced at: 20 days ago - Pushed at: 20 days ago - Stars: 0 - Forks: 0

YourDataArchitect/French-realestate-data-pipeline
This repository contains a fully automated data pipeline built with Apache Airflow to extract, clean, analyze, and report real estate listings from Seloger. It pushes data to MongoDB, Elasticsearch, and Google Sheets, with real-time Slack alerts for monitoring.
Language: Python - Size: 4.64 MB - Last synced at: 9 days ago - Pushed at: 20 days ago - Stars: 0 - Forks: 0

wri/gfw_forest_loss_geotrellis
Global Tree Cover Loss Analysis using Geotrellis and SPARK
Language: Scala - Size: 2.9 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 10 - Forks: 8

Deshan-Senanayake/Bird-Range-Prediction
This is a lightweight web application that allows users to predict bird presence, location, and the best time to observe birds based on machine learning models trained on real birdwatching data from the Hambantota District.
Language: Jupyter Notebook - Size: 20.5 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

Dashboard-Design/PythonETLPipelineProjects
Build your data engineering skills with Python ETL/ELT projects and warehousing courses.
Language: Python - Size: 11.9 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

ErdemOzgen/Data-Engineering-Roadmap
Roadmap for Data Engineering
Language: Java - Size: 1.98 MB - Last synced at: about 1 month ago - Pushed at: about 1 year ago - Stars: 228 - Forks: 30

vishnupriyan123/wedding-venues-data-pipeline
AI-powered wedding venue data pipeline — scraping, enrichment, and NLP-driven insights from Hitched UK. 🔄 Actively evolving – multi-sprint project in progress.
Language: Python - Size: 60.8 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 1 - Forks: 0

cloudposse/terraform-aws-efs-backup
Terraform module designed to easily back up EFS filesystems to S3 using DataPipeline
Language: HCL - Size: 3.91 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 44 - Forks: 33

WaylonWalker/kedro-action
A GitHub Action to lint, test, build-docs, package, and run your kedro pipelines. Supports any Python version you'll give it (that is also supported by pyenv).
Language: Shell - Size: 261 KB - Last synced at: about 20 hours ago - Pushed at: about 21 hours ago - Stars: 19 - Forks: 3

wri/gfw_forest_loss_geotrellis_arcpy_client
Arcpy client for GFW Forest Loss Analysis
Language: Python - Size: 138 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 2 - Forks: 1

Alireza-Akhavan/tf2-tutorial
Tensorflow 2 Tutorials (use tensorflow and keras in a better way!)
Language: Jupyter Notebook - Size: 14.4 MB - Last synced at: 18 days ago - Pushed at: about 1 year ago - Stars: 56 - Forks: 9

ContextData/VectorETL
Build super simple end-to-end data & ETL pipelines for your vector databases and Generative AI applications
Language: Python - Size: 296 KB - Last synced at: about 2 months ago - Pushed at: 10 months ago - Stars: 97 - Forks: 15

indix/sparkplug
Spark package to "plug" holes in data using SQL based rules ⚡️ 🔌
Language: Scala - Size: 503 KB - Last synced at: about 2 months ago - Pushed at: about 5 years ago - Stars: 29 - Forks: 2

NVIDIA/go-tfdata
Go library that provides easy-to-use interfaces and tools for TensorFlow users, in particular allowing existing TF models to be trained on .tar and .tgz datasets
Language: Go - Size: 3.62 MB - Last synced at: 10 days ago - Pushed at: over 1 year ago - Stars: 16 - Forks: 3

covalenthq/bsp-geth
Ethereum client written in Go, modified for full-hierarchy data exports and block specimen production
Language: Go - Size: 155 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 28 - Forks: 11

josephmachado/api_data_extract
Code for extracting data from API with Python
Language: Jupyter Notebook - Size: 85.9 KB - Last synced at: 7 days ago - Pushed at: 3 months ago - Stars: 3 - Forks: 0

nsiriwardhana/Vegetable-Price-Forecasting-Using-Time-Series-Models
Forecasting vegetable prices using economic indicators and time series models (VAR, VECM, ARDL).
Language: Jupyter Notebook - Size: 2.06 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

Kushalkhadka7/dagster_clickhouse_dbt
DBT and clickhouse test project with dagster
Language: Python - Size: 4.03 MB - Last synced at: 3 months ago - Pushed at: almost 2 years ago - Stars: 7 - Forks: 0

rishavvrajj/Data-Wharehouse
Data Warehouse Project: A structured Data Warehouse using Bronze, Silver, and Gold layers for efficient data ingestion, transformation, and analytics with SQL Server.
Language: TSQL - Size: 189 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 1 - Forks: 0

josephmachado/data_engineering_systems
How to quickly deliver data to business users?
Language: Jupyter Notebook - Size: 569 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 2 - Forks: 0

josephmachado/de_project
Step by step instructions to create a production-ready data pipeline
Language: Jupyter Notebook - Size: 4.31 MB - Last synced at: 3 months ago - Pushed at: 7 months ago - Stars: 44 - Forks: 12

josephmachado/beginner_de_project_stream
Simple stream processing pipeline
Language: Python - Size: 5.81 MB - Last synced at: 3 months ago - Pushed at: about 1 year ago - Stars: 100 - Forks: 31

cc59chong/Cleaning-Data-with-PySpark
Language: Jupyter Notebook - Size: 6.48 MB - Last synced at: about 1 month ago - Pushed at: over 2 years ago - Stars: 3 - Forks: 2

JasonZhangHub/CS5346CaseStudy
2420 CS5346 tableau case study data preparation pipeline
Language: Jupyter Notebook - Size: 58.9 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 1

cjunwon/Youtube-Data-Analysis
End-to-end Youtube data analysis project using Youtube Data API, MySQL, AWS, Flask
Language: HTML - Size: 4.44 MB - Last synced at: 3 days ago - Pushed at: 6 months ago - Stars: 3 - Forks: 0

srimantapal205/DataEngineerWireframeDesigns
Data Engineer Wireframe Designs are essential for planning and visualizing data pipelines, architecture, and workflows before implementation.
Size: 10.7 KB - Last synced at: about 2 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

yuhexiong/kafka-data-pipeline-flink-java
Data pipeline from Kafka to Kafka, Doris, MongoDB and Doris to Kafka using Flink Java.
Language: Java - Size: 70.3 KB - Last synced at: 4 months ago - Pushed at: 8 months ago - Stars: 2 - Forks: 0

KennethanCeyer/awesome-data-pipeline
Awesome list for datapipeline
Size: 200 KB - Last synced at: 25 days ago - Pushed at: over 2 years ago - Stars: 34 - Forks: 4

joaoblasques/Real-Time-Weather-Analysis-Data-Pipeline
A real-time weather analysis pipeline
Language: Python - Size: 14.6 KB - Last synced at: about 1 month ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

maxkrivich/debezium-data-masking-cdc-poc
Streaming data from PostgreSQL to Elasticsearch with masking sensitive data
Language: Dockerfile - Size: 7.81 KB - Last synced at: about 1 month ago - Pushed at: about 4 years ago - Stars: 1 - Forks: 0

myceliumAI/mycelium
A powerful platform designed to simplify the creation and management of Data Contracts, bridging systems for seamless data ingestion
Language: Python - Size: 2.51 MB - Last synced at: 24 days ago - Pushed at: about 1 month ago - Stars: 3 - Forks: 0

Kaushik-Puttaswamy/Car-Rental-Analytics-Pipeline-with-Airflow-Snowflake-and-GCP
This project automates car rental data ingestion using Apache Airflow for orchestration, Google Dataproc for PySpark-based processing, and Snowflake for data warehousing, leveraging GCS for storage. It provides a scalable, efficient pipeline for transforming raw data into analytics-ready insights.
Language: Python - Size: 1.72 MB - Last synced at: 4 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

thanaphongK37/Data-Science-and-Data-Analyst-Project
Portfolio of Data Analysis, Data Science, and Data Engineering projects built using Azure services, SQL, and Python.
Language: Jupyter Notebook - Size: 3.92 MB - Last synced at: 4 months ago - Pushed at: 5 months ago - Stars: 1 - Forks: 0

ryanninodizon/EHR-PoC
End-to-End Concept Project: Azure Data Factory Pipeline, Azure IoT Central, Azure Storage Account, Azure SQL, .NET, Angular
Language: TypeScript - Size: 42.4 MB - Last synced at: 4 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

Safwan2003/RandomForest_Heart_Disease_Prediction
A machine learning project using Random Forest Classifier to predict heart disease. Includes data preprocessing (with binning), feature selection, and model evaluation.
Language: Jupyter Notebook - Size: 4.86 MB - Last synced at: 4 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

akarce/elk-stack-mastery
A comprehensive project focusing on setting up and configuring the Elastic Stack (Elasticsearch, Logstash, and Kibana) for efficient log management and analytics. This project includes Elasticsearch configurations, Logstash pipelines, and Kibana visualizations, with detailed step-by-step documentation.
Size: 456 KB - Last synced at: 5 months ago - Pushed at: 9 months ago - Stars: 2 - Forks: 0

Sabab080/pyspark-etl-customer-sales
PySpark-based ETL pipeline that extracts transaction data from a MySQL database, cleans and transforms it, aggregates monthly sales per customer, and writes the processed data to an S3 bucket in Parquet format.
Language: Python - Size: 6.84 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 1 - Forks: 0

adilkhash/luigi-course-materials
Materials for the course Introduction to Data Engineering: Data Pipelines
Language: Python - Size: 17.6 KB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 11 - Forks: 5

Zeekersky/AdventureWorks-SplitPackage
This repository contains an SSIS package that splits employee data from the AdventureWorksDW2017 database into country-specific tables (United States, United Kingdom, Germany, and others). It demonstrates ETL processes using tools like Merge Join, Conditional Split, and OLE DB Destination for efficient data integration.
Size: 862 KB - Last synced at: 5 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

muhammadelfikry/Ecommerce-Data-Pipeline-PySpark
This project aims to develop a data pipeline using PySpark, designed to perform ETL processes, data transformation, and RFM analysis execution.
Language: Python - Size: 3.91 KB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

epsi10nvn/vn-job-data-crawler
Web scraping project using Scrapy and Selenium to gather job postings in Vietnam (vietnamworks, topcv, linkedin)
Language: Python - Size: 769 KB - Last synced at: 2 months ago - Pushed at: 7 months ago - Stars: 1 - Forks: 0

ibnufajar1994/elt-data-warehouse
Build and Orchestrate an ELT Data Pipeline Using Luigi
Language: Python - Size: 40.4 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

haajarsh/data-analytics-dojo
Collection of continuous learning and growth in the world of data analytics. Lifelong Learner! 🚀
Language: Jupyter Notebook - Size: 3.92 MB - Last synced at: 3 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

CS2219/Recommender_system
Recommendation system for Stock market
Language: Python - Size: 5.92 MB - Last synced at: 4 months ago - Pushed at: 8 months ago - Stars: 1 - Forks: 0

SURESHBEEKHANI/Wine-Quality-Prediction
This project involves the development of a complete ML pipeline with tracking and deployment capabilities.
Language: Jupyter Notebook - Size: 187 KB - Last synced at: 4 months ago - Pushed at: 8 months ago - Stars: 1 - Forks: 0

awillis/sluus
a micro batch processing pipeline
Language: Go - Size: 3.67 MB - Last synced at: 1 day ago - Pushed at: over 6 years ago - Stars: 0 - Forks: 0

divithraju/divith-raju-Immigration-Data-Engineering
A Capstone Project that covers several aspects of Data Engineering (Data Exploration, Cleaning, Modeling, Pipelining, Processing)
Language: Jupyter Notebook - Size: 2.5 MB - Last synced at: 2 months ago - Pushed at: over 2 years ago - Stars: 3 - Forks: 0

DataForgeOpenAIHub/.github
GitHub profile of this organization.
Size: 12.7 KB - Last synced at: 5 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

Master-Project-Hate-Speech/STITCHeD
A Python-based data tool for Integrating Hate Speech datasets with varying schemas.
Language: Jupyter Notebook - Size: 69.3 MB - Last synced at: 29 days ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

datahangar/datahangar
Network data pipeline stack (pmacct, kafka, DB...) in K8s
Language: Python - Size: 136 KB - Last synced at: about 1 month ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

zekeriyyaa/Building-A-Data-Pipeline-For-ROS-Compliant-Robotic-System-Via-Amazon-Web-Services
Language: Python - Size: 959 KB - Last synced at: 4 months ago - Pushed at: about 3 years ago - Stars: 3 - Forks: 2

rishabhpanda/streamliner_beta
Streamliner AI is an advanced AI-powered data cleaning web application, built with the Streamlit framework and integrated with GPT-4 for seamless, intuitive data processing.
Language: Python - Size: 199 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

divithraju/divith-raju-Customer-Sales-ETL-Pipeline
This ETL project was designed to demonstrate the development of a scalable data pipeline for customer sales analysis. It covers all essential steps, from data extraction to transformation and loading into a database, using Apache Airflow.
Language: Python - Size: 7.81 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 0

teckkean/GTFS-Data-Pipeline-TfNSW-Bus
GTFS Data Pipeline for TfNSW Bus Datasets
Language: Jupyter Notebook - Size: 12 MB - Last synced at: 4 months ago - Pushed at: over 2 years ago - Stars: 8 - Forks: 2

Smohanta23/Uber_Data-Engineering_ETL-Project
This project demonstrates a comprehensive data engineering workflow using the Uber information dataset. It covers the full spectrum of data engineering pipelines, from data transformation to deployment on Google Cloud, with a focus on creating a scalable and insightful data model.
Language: Jupyter Notebook - Size: 19.6 MB - Last synced at: 4 months ago - Pushed at: 12 months ago - Stars: 0 - Forks: 0

VisheshData/Automated-AQI-Traffic-Data-Ingestion
Automated-AQI-Traffic-Data Ingestion for free. No cloud service pipeline creation required.
Language: Python - Size: 10.7 KB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 0 - Forks: 0

bilalhameed248/Diagnosis-Effecting-Patients-Recovery-Detection
A DNN Based Diagnosis Impact Detection Model. - Feb 2022 - Jun 2023
Language: Jupyter Notebook - Size: 93.8 KB - Last synced at: 5 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

wri/gfw_pixetl
GFW ETL for raster tiles
Language: Python - Size: 1.6 MB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 5 - Forks: 1

sravanigodavarthi/Automated-ELT-Pipeline-AWS
An Apache Airflow data pipeline is designed to perform ELT operations, utilizing Amazon S3 and Amazon Redshift Serverless.
Language: Python - Size: 48 MB - Last synced at: about 2 months ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

ukokobili/data_aggregator
Automated data pipeline for cryptocurrency exchange analytics. Extracts data from multiple sources, processes it, and visualizes insights via a near real-time dashboard. Built with Python and Docker, featuring modular ETL, comprehensive logging, and automated testing.
Language: Python - Size: 516 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

deshk04/sid
Simple ETL tool for Salesforce
Language: JavaScript - Size: 9.29 MB - Last synced at: about 1 year ago - Pushed at: about 4 years ago - Stars: 0 - Forks: 1

julian-King22/etl_with_mage_ai
An ETL data pipeline that extracts data from source and loads it to destination, automated using mage.ai
Language: Python - Size: 246 KB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 6 - Forks: 5

behnamyazdan/PythonForDataEngineeringCourse
This course is designed to provide learners with the fundamental skills needed for data engineering using Python. The objective is to introduce anyone interested in the topic to Python's data engineering-related features.
Language: Python - Size: 1.41 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 5 - Forks: 0

mihirkudale/Twitter-Data-Pipeline-using-Airflow
This is an end-to-end data engineering project using Airflow and Python. In this project, we extract data using the Twitter API, use Python to transform the data, deploy the code on Airflow/EC2, and save the final result to Amazon S3.
Language: Python - Size: 5.86 KB - Last synced at: 4 months ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

essraahmed/Data-Pipeline-with-Airflow
Data Pipeline with Apache Airflow
Language: Python - Size: 444 KB - Last synced at: about 1 year ago - Pushed at: almost 3 years ago - Stars: 1 - Forks: 1

manoharpalanisamy/Image-Classification
Automatic Extraction of image from WhatsApp Image Folder or Customized Folder
Language: Python - Size: 2.51 MB - Last synced at: about 1 year ago - Pushed at: over 5 years ago - Stars: 1 - Forks: 0

WaylonWalker/kedro-static-viz
kedro cli plugin for generating a static kedro viz site (html, css, js) that can be deployed on many serverless tools.
Language: Python - Size: 13.6 MB - Last synced at: 18 days ago - Pushed at: over 2 years ago - Stars: 27 - Forks: 2

huanwuji/teleporter
Reactive Streams distributed data pipeline for data processing. Now supports Kafka, JDBC, Kudu, Elasticsearch, HDFS, etc.
Language: Scala - Size: 740 KB - Last synced at: over 1 year ago - Pushed at: about 8 years ago - Stars: 8 - Forks: 1

eldhosejohn/aws_datapipeline_auto_load
Auto load datapipeline from console
Language: JavaScript - Size: 7.81 KB - Last synced at: over 1 year ago - Pushed at: over 7 years ago - Stars: 0 - Forks: 0

manyuzhang1996/Data-Pipeline-with-dbt-and-Snowflake
Data Pipeline Project with dbt and snowflake
Size: 6.84 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

amararyal/Co-Tags
Language: Python - Size: 582 KB - Last synced at: over 1 year ago - Pushed at: over 7 years ago - Stars: 2 - Forks: 0

ac223028/AlgoTrading
Trend following algorithm written in Golang.
Language: Go - Size: 23.7 MB - Last synced at: about 1 year ago - Pushed at: over 3 years ago - Stars: 2 - Forks: 0

parth2050/aws-data-pipeline
An End-To-End data pipeline integration from Website Source to analytical dashboard in AWS using Python flask, ML models, DynamoDB and other AWS services.
Language: HTML - Size: 8.79 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

rizkyirw/Pipeline-Project
Resource for ETL & Data Ingestion program using Apache Airflow
Language: Python - Size: 207 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

rizkyirw/Scraping-Pipeline
Scraping Pipeline using Orchestration Tools in Docker Environment
Language: Python - Size: 46.9 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

rashakil-ds/Methods-of-Advanced-Data-Engineering
Methods of Advanced Data Engineering is one of the courses in the department of DATA SCIENCE at the University of Erlangen (FAU)
Size: 4.88 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

madhura711/American-Airlines--Data-Mining-and-Variance-Analysis
Size: 2.23 MB - Last synced at: over 1 year ago - Pushed at: almost 7 years ago - Stars: 1 - Forks: 0

matthewhoung/quantmind
Taiwan's Financial market data pipeline project
Language: Python - Size: 5.86 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

denironyx/snowflake_dbt_lab
Snowflake and dbt labs
Size: 2.93 KB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 0

mundra-ankur/MSW_AI_Pipeline
Municipal solid waste (MSW) characterization: an AI and data pipeline to characterize solid waste in real time into different buckets using YOLO
Language: Python - Size: 41 KB - Last synced at: 4 months ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 0

kartik4949/TensorPipe
High-performance TensorFlow data pipeline with state-of-the-art augmentations and low-level optimizations.
Language: Python - Size: 183 KB - Last synced at: 4 months ago - Pushed at: over 3 years ago - Stars: 86 - Forks: 20

elau1004/ETLite
A lightweight framework to host your ETL data-pipeline
Language: Python - Size: 513 KB - Last synced at: 4 months ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 0

Tanay0510/Data-Pipeline-with-Airflow
Built Data Pipelines with Airflow. Created custom operators to perform tasks such as staging the data, filling the data warehouse, and running checks on the data as the final step
Language: Python - Size: 18.6 KB - Last synced at: over 1 year ago - Pushed at: about 4 years ago - Stars: 1 - Forks: 0

teja-goud-kandula/StockDB
Building a database containing daily NSE stocks data
Language: Python - Size: 1.11 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

vvspearlvvs/MusicChatbot
AWS data pipeline development and a music recommendation chatbot
Language: Python - Size: 1.6 MB - Last synced at: over 1 year ago - Pushed at: almost 4 years ago - Stars: 5 - Forks: 0

Cosmo-Tech/getting-started-with-data-injection
This sample demonstrates how to create Azure Digital Twins instances from your enterprise data to populate your simulation models.
Language: Shell - Size: 526 KB - Last synced at: over 1 year ago - Pushed at: almost 4 years ago - Stars: 0 - Forks: 1

praveendecode/Data_Science_Projects
Executed diverse ML and NLP initiatives, implementing robust data pipelines, thorough data analysis, and seamless Docker projects. Demonstrated versatility in handling complex tasks, ensuring successful project outcomes
Size: 20.5 KB - Last synced at: 4 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

SarahMohammadiNejad/Reddit_Sentiment_Analysis
Language: Python - Size: 7.81 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

asis-tobe/asis-tobe.github.io
AsIs-ToBe Public Website Repo
Language: HTML - Size: 8.73 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

soham7998/Data-Engineering-Youtube-End-to-End-Project
YouTube data engineering end-to-end pipeline for analyzing the data.
Language: Python - Size: 243 KB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

frasermarlow/fake-stars
A single-file Dagster project for evaluating fake GitHub stars.
Language: Python - Size: 6.84 KB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

coder2j/dagster-tutorial
Dagster Tutorial to get you started with Dagster as an absolute Beginner. The tutorial covers various topics like Dagster Installation, Dagster Asset, Dagster Job, Dagster Scheduler, Dagster Ops, and more. It is completely free on YouTube and is beginner-friendly without any prerequisites.
Language: Python - Size: 4.88 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0
