GitHub topics: datapipeline
vishnupriyan123/wedding-venues-data-pipeline
AI-powered wedding venue data pipeline — scraping, enrichment, and NLP-driven insights from Hitched UK. 🔄 Actively evolving – multi-sprint project in progress.
Language: Python - Size: 50.7 MB - Last synced at: about 14 hours ago - Pushed at: about 15 hours ago - Stars: 0 - Forks: 0

FiredKreeper/AdventureWorks-SplitPackage
This repository contains an SSIS package that splits employee data from the AdventureWorksDW2017 database into country-specific tables (United States, United Kingdom, Germany, and others). It demonstrates ETL processes using tools like Merge Join, Conditional Split, and OLE DB Destination for efficient data integration.
Size: 1000 Bytes - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 0 - Forks: 0

Kushalkhadka7/dagster_clickhouse_dbt
dbt and ClickHouse test project with Dagster
Language: Python - Size: 4.03 MB - Last synced at: 3 days ago - Pushed at: over 1 year ago - Stars: 7 - Forks: 0

zhaoyachao/zdh_web
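Big data collection and extraction platform. zdh_web is the visual management platform for the zdh family of services, with modules for data collection, scheduling, permissions, approval workflows, private-domain marketing, and more.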
Language: Java - Size: 141 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 501 - Forks: 176

rishavvrajj/Data-Wharehouse
Data Warehouse Project: A structured Data Warehouse using Bronze, Silver, and Gold layers for efficient data ingestion, transformation, and analytics with SQL Server.
Language: TSQL - Size: 189 KB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 1 - Forks: 0

tenzir/library
The Tenzir Community Library.
Size: 575 KB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 5 - Forks: 2

josephmachado/data_engineering_systems
How to quickly deliver data to business users?
Language: Jupyter Notebook - Size: 569 KB - Last synced at: 18 days ago - Pushed at: 18 days ago - Stars: 2 - Forks: 0

josephmachado/de_project
Step by step instructions to create a production-ready data pipeline
Language: Jupyter Notebook - Size: 4.31 MB - Last synced at: 8 days ago - Pushed at: 4 months ago - Stars: 44 - Forks: 12

ErdemOzgen/Data-Engineering-Roadmap
Roadmap for Data Engineering
Language: Java - Size: 1.98 MB - Last synced at: 8 days ago - Pushed at: 10 months ago - Stars: 225 - Forks: 30

cloudposse/terraform-aws-efs-backup
Terraform module designed to easily backup EFS filesystems to S3 using DataPipeline
Language: HCL - Size: 3.91 MB - Last synced at: 15 days ago - Pushed at: 6 months ago - Stars: 44 - Forks: 33

wri/gfw_forest_loss_geotrellis
Global Tree Cover Loss Analysis using Geotrellis and SPARK
Language: Scala - Size: 2.83 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 10 - Forks: 8

wri/gfw_forest_loss_geotrellis_arcpy_client
Arcpy client for GFW Forest Loss Analysis
Language: Python - Size: 122 KB - Last synced at: 21 days ago - Pushed at: 21 days ago - Stars: 1 - Forks: 1

josephmachado/beginner_de_project_stream
Simple stream processing pipeline
Language: Python - Size: 5.81 MB - Last synced at: 8 days ago - Pushed at: 10 months ago - Stars: 100 - Forks: 31

Alireza-Akhavan/tf2-tutorial
Tensorflow 2 Tutorials (use tensorflow and keras in a better way!)
Language: Jupyter Notebook - Size: 14.4 MB - Last synced at: 16 days ago - Pushed at: 11 months ago - Stars: 54 - Forks: 9

JasonZhangHub/CS5346CaseStudy
2420 CS5346 tableau case study data preparation pipeline
Language: Jupyter Notebook - Size: 58.9 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 1

cjunwon/Youtube-Data-Analysis
End-to-end Youtube data analysis project using Youtube Data API, MySQL, AWS, Flask
Language: HTML - Size: 4.44 MB - Last synced at: 20 days ago - Pushed at: 3 months ago - Stars: 3 - Forks: 0

ContextData/VectorETL
Build super simple end-to-end data & ETL pipelines for your vector databases and Generative AI applications
Language: Python - Size: 296 KB - Last synced at: 14 days ago - Pushed at: 7 months ago - Stars: 91 - Forks: 10

srimantapal205/DataEngineerWireframeDesigns
Data Engineer Wireframe Designs are essential for planning and visualizing data pipelines, architecture, and workflows before implementation.
Size: 10.7 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

yuhexiong/kafka-data-pipeline-flink-java
Data pipeline from Kafka to Kafka, Doris, MongoDB and Doris to Kafka using Flink Java.
Language: Java - Size: 70.3 KB - Last synced at: 29 days ago - Pushed at: 5 months ago - Stars: 2 - Forks: 0

KennethanCeyer/awesome-data-pipeline
Awesome list for datapipeline
Size: 200 KB - Last synced at: 11 days ago - Pushed at: about 2 years ago - Stars: 34 - Forks: 4

tilakapash/Real-Time-Weather-Analysis-Data-Pipeline
Size: 6.84 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

maxkrivich/debezium-data-masking-cdc-poc
Streaming data from PostgreSQL to Elasticsearch with masking sensitive data
Language: Dockerfile - Size: 7.81 KB - Last synced at: about 1 month ago - Pushed at: almost 4 years ago - Stars: 1 - Forks: 0

covalenthq/bsp-geth
Ethereum client written in Go, modified for full-hierarchy data exports and block specimen production
Language: Go - Size: 152 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 28 - Forks: 10

myceliumAI/mycelium
A powerful platform designed to simplify the creation and management of Data Contracts, bridging systems for seamless data ingestion
Language: Python - Size: 2.51 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 3 - Forks: 0

kkumyk/server-logs-daily-data-pipeline
A data engineering project with dbt, Docker, Kestra, Terraform, GCP and Looker.
Language: HCL - Size: 755 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 1 - Forks: 0

Kaushik-Puttaswamy/Car-Rental-Analytics-Pipeline-with-Airflow-Snowflake-and-GCP
This project automates car rental data ingestion using Apache Airflow for orchestration, Google Dataproc for PySpark-based processing, and Snowflake for data warehousing, leveraging GCS for storage. It provides a scalable, efficient pipeline for transforming raw data into analytics-ready insights.
Language: Python - Size: 1.72 MB - Last synced at: 24 days ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

thanaphongK37/Data-Science-and-Data-Analyst-Project
Portfolio of data analysis, data science, and data engineering projects built using Azure services, SQL, and Python.
Language: Jupyter Notebook - Size: 3.92 MB - Last synced at: 25 days ago - Pushed at: 2 months ago - Stars: 1 - Forks: 0

ryanninodizon/EHR-PoC
End-to-End Concept Project: Azure Data Factory Pipeline, Azure IoT Central, Azure Storage Account, Azure SQL, .NET, Angular
Language: TypeScript - Size: 42.4 MB - Last synced at: 10 days ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

Safwan2003/RandomForest_Heart_Disease_Prediction
A machine learning project using Random Forest Classifier to predict heart disease. Includes data preprocessing (with binning), feature selection, and model evaluation.
Language: Jupyter Notebook - Size: 4.86 MB - Last synced at: 14 days ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

akarce/elk-stack-mastery
A comprehensive project focusing on setting up and configuring the Elastic Stack (Elasticsearch, Logstash, and Kibana) for efficient log management and analytics. This project includes Elasticsearch configurations, Logstash pipelines, and Kibana visualizations, with detailed step-by-step documentation.
Size: 456 KB - Last synced at: 2 months ago - Pushed at: 6 months ago - Stars: 2 - Forks: 0

Sabab080/pyspark-etl-customer-sales
PySpark-based ETL pipeline that extracts transaction data from a MySQL database, cleans and transforms it, aggregates monthly sales per customer, and writes the processed data to an S3 bucket in Parquet format.
Language: Python - Size: 6.84 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 1 - Forks: 0

WaylonWalker/kedro-action
A GitHub Action to lint, test, build-docs, package, and run your kedro pipelines. Supports any Python version you'll give it (that is also supported by pyenv).
Language: Shell - Size: 254 KB - Last synced at: 7 days ago - Pushed at: 2 months ago - Stars: 20 - Forks: 3

adilkhash/luigi-course-materials
Materials for the course Introduction to Data Engineering: Data Pipelines
Language: Python - Size: 17.6 KB - Last synced at: 17 days ago - Pushed at: about 1 year ago - Stars: 11 - Forks: 5

Zeekersky/AdventureWorks-SplitPackage
This repository contains an SSIS package that splits employee data from the AdventureWorksDW2017 database into country-specific tables (United States, United Kingdom, Germany, and others). It demonstrates ETL processes using tools like Merge Join, Conditional Split, and OLE DB Destination for efficient data integration.
Size: 862 KB - Last synced at: about 1 month ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

muhammadelfikry/Ecommerce-Data-Pipeline-PySpark
This project aims to develop a data pipeline using PySpark, designed to perform ETL processes, data transformation, and RFM analysis execution.
Language: Python - Size: 3.91 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

epsi10nvn/vn-job-data-crawler
Web scraping project using Scrapy and Selenium to gather job postings in Vietnam (vietnamworks, topcv, linkedin)
Language: Python - Size: 769 KB - Last synced at: 2 months ago - Pushed at: 4 months ago - Stars: 1 - Forks: 0

ibnufajar1994/elt-data-warehouse
Build and Orchestrate an ELT Data Pipeline Using Luigi
Language: Python - Size: 40.4 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

bcgov/nr-rfc-dischargeobs
data pipeline code to download / process hydrological observation data
Language: Python - Size: 118 MB - Last synced at: 4 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

hajarshu/data-analytics-dojo
Collection of continuous learning and growth in the world of data analytics. Lifelong Learner! 🚀
Language: Jupyter Notebook - Size: 3.92 MB - Last synced at: 2 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

CS2219/Recommender_system
Recommendation system for Stock market
Language: Python - Size: 5.92 MB - Last synced at: 21 days ago - Pushed at: 5 months ago - Stars: 1 - Forks: 0

SURESHBEEKHANI/Wine-Quality-Prediction
This project involves the development of a complete ML pipeline with tracking and deployment capabilities.
Language: Jupyter Notebook - Size: 187 KB - Last synced at: 26 days ago - Pushed at: 5 months ago - Stars: 1 - Forks: 0

awillis/sluus
a micro batch processing pipeline
Language: Go - Size: 3.67 MB - Last synced at: 4 days ago - Pushed at: about 6 years ago - Stars: 0 - Forks: 0

divithraju/divith-raju-Immigration-Data-Engineering
A Capstone Project that covers several aspects of Data Engineering (Data Exploration, Cleaning, Modeling, Pipelining, Processing)
Language: Jupyter Notebook - Size: 2.5 MB - Last synced at: 2 months ago - Pushed at: over 2 years ago - Stars: 3 - Forks: 0

DataForgeOpenAIHub/.github
GitHub profile of this organization.
Size: 12.7 KB - Last synced at: 2 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

Master-Project-Hate-Speech/STITCHeD
A Python-based data tool for Integrating Hate Speech datasets with varying schemas.
Language: Jupyter Notebook - Size: 69.3 MB - Last synced at: about 6 hours ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

zekeriyyaa/Building-A-Data-Pipeline-For-ROS-Compliant-Robotic-System-Via-Amazon-Web-Services
Language: Python - Size: 959 KB - Last synced at: 12 days ago - Pushed at: almost 3 years ago - Stars: 3 - Forks: 2

rishabhpanda/streamliner_beta
Streamliner AI is an advanced AI-powered data cleaning web application, built with the Streamlit framework and integrated with GPT-4 for seamless, intuitive data processing.
Language: Python - Size: 199 MB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

divithraju/divith-raju-Customer-Sales-ETL-Pipeline
This ETL project demonstrates the development of a scalable data pipeline for customer sales analysis. It covers all essential steps, from data extraction to transformation and loading into a database, using Apache Airflow.
Language: Python - Size: 7.81 KB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 1 - Forks: 0

teckkean/GTFS-Data-Pipeline-TfNSW-Bus
GTFS Data Pipeline for TfNSW Bus Datasets
Language: Jupyter Notebook - Size: 12 MB - Last synced at: 9 days ago - Pushed at: over 2 years ago - Stars: 8 - Forks: 2

Smohanta23/Uber_Data-Engineering_ETL-Project
This project demonstrates a comprehensive data engineering workflow using the Uber information dataset. It covers the full spectrum of data engineering pipelines, from data transformation to deployment on Google Cloud, with a focus on creating a scalable and insightful data model.
Language: Jupyter Notebook - Size: 19.6 MB - Last synced at: about 1 month ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

VisheshData/Automated-AQI-Traffic-Data-Ingestion
Free automated ingestion of AQI and traffic data. No cloud-service pipeline creation required.
Language: Python - Size: 10.7 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

bilalhameed248/Diagnosis-Effecting-Patients-Recovery-Detection
A DNN Based Diagnosis Impact Detection Model. - Feb 2022 - Jun 2023
Language: Jupyter Notebook - Size: 93.8 KB - Last synced at: about 2 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

wri/gfw_pixetl
GFW ETL for raster tiles
Language: Python - Size: 1.6 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 5 - Forks: 1

sravanigodavarthi/Automated-ELT-Pipeline-AWS
An Apache Airflow data pipeline is designed to perform ELT operations, utilizing Amazon S3 and Amazon Redshift Serverless.
Language: Python - Size: 48 MB - Last synced at: 2 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

ukokobili/data_aggregator
Automated data pipeline for cryptocurrency exchange analytics. Extracts data from multiple sources, processes it, and visualizes insights via a near real-time dashboard. Built with Python and Docker, featuring modular ETL, comprehensive logging, and automated testing.
Language: Python - Size: 516 KB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

deshk04/sid
Simple ETL tool for Salesforce
Language: JavaScript - Size: 9.29 MB - Last synced at: 11 months ago - Pushed at: almost 4 years ago - Stars: 0 - Forks: 1

behnamyazdan/PythonForDataEngineeringCourse
This course is designed to provide learners with the fundamental skills needed for data engineering using Python. The objective is to introduce anyone interested in the topic to Python's data engineering-related features.
Language: Python - Size: 1.41 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 5 - Forks: 0

mihirkudale/Twitter-Data-Pipeline-using-Airflow
This is an end-to-end data engineering project using Airflow and Python. It extracts data using the Twitter API, transforms it with Python, deploys the code on Airflow/EC2, and saves the final result to Amazon S3.
Language: Python - Size: 5.86 KB - Last synced at: about 1 month ago - Pushed at: 12 months ago - Stars: 0 - Forks: 0

essraahmed/Data-Pipeline-with-Airflow
Data Pipeline with Apache Airflow
Language: Python - Size: 444 KB - Last synced at: 12 months ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 1

manoharpalanisamy/Image-Classification
Automatic extraction of images from the WhatsApp image folder or a custom folder
Language: Python - Size: 2.51 MB - Last synced at: 12 months ago - Pushed at: over 5 years ago - Stars: 1 - Forks: 0

WaylonWalker/kedro-static-viz
kedro cli plugin for generating a static kedro viz site (html, css, js) that can be deployed on many serverless tools.
Language: Python - Size: 13.6 MB - Last synced at: 6 days ago - Pushed at: over 2 years ago - Stars: 27 - Forks: 2

huanwuji/teleporter
Reactive Streams distributed data pipeline for data processing. Currently supports Kafka, JDBC, Kudu, Elasticsearch, HDFS, etc.
Language: Scala - Size: 740 KB - Last synced at: about 1 year ago - Pushed at: almost 8 years ago - Stars: 8 - Forks: 1

eldhosejohn/aws_datapipeline_auto_load
Auto load datapipeline from console
Language: JavaScript - Size: 7.81 KB - Last synced at: about 1 year ago - Pushed at: over 7 years ago - Stars: 0 - Forks: 0

manyuzhang1996/Data-Pipeline-with-dbt-and-Snowflake
Data Pipeline Project with dbt and snowflake
Size: 6.84 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

amararyal/Co-Tags
Language: Python - Size: 582 KB - Last synced at: about 1 year ago - Pushed at: about 7 years ago - Stars: 2 - Forks: 0

ac223028/AlgoTrading
Trend following algorithm written in Golang.
Language: Go - Size: 23.7 MB - Last synced at: 10 months ago - Pushed at: about 3 years ago - Stars: 2 - Forks: 0

parth2050/aws-data-pipeline
An End-To-End data pipeline integration from Website Source to analytical dashboard in AWS using Python flask, ML models, DynamoDB and other AWS services.
Language: HTML - Size: 8.79 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

rizkyirw/Pipeline-Project
Resource for ETL & Data Ingestion program using Apache Airflow
Language: Python - Size: 207 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

rizkyirw/Scraping-Pipeline
Scraping Pipeline using Orchestration Tools in Docker Environment
Language: Python - Size: 46.9 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

rashakil-ds/Methods-of-Advanced-Data-Engineering
Methods of Advanced Data Engineering is one of the courses in the department of DATA SCIENCE at the University of Erlangen (FAU)
Size: 4.88 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

madhura711/American-Airlines--Data-Mining-and-Variance-Analysis
Size: 2.23 MB - Last synced at: about 1 year ago - Pushed at: over 6 years ago - Stars: 1 - Forks: 0

matthewhoung/quantmind
Taiwan's Financial market data pipeline project
Language: Python - Size: 5.86 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

denironyx/snowflake_dbt_lab
Snowflake and dbt labs
Size: 2.93 KB - Last synced at: about 1 year ago - Pushed at: about 3 years ago - Stars: 1 - Forks: 0

mundra-ankur/MSW_AI_Pipeline
Municipal solid waste (MSW) characterization: an AI and data pipeline that characterizes solid waste in real time into different buckets using YOLO
Language: Python - Size: 41 KB - Last synced at: 12 days ago - Pushed at: almost 3 years ago - Stars: 1 - Forks: 0

kartik4949/TensorPipe
High Performance Tensorflow Data Pipeline with State of Art Augmentations and low level optimizations.
Language: Python - Size: 183 KB - Last synced at: 21 days ago - Pushed at: about 3 years ago - Stars: 86 - Forks: 20

elau1004/ETLite
A lightweight framework to host your ETL data-pipeline
Language: Python - Size: 513 KB - Last synced at: 14 days ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 0

Tanay0510/Data-Pipeline-with-Airflow
Built Data Pipelines with Airflow. Created custom operators to perform tasks such as staging the data, filling the data warehouse, and running checks on the data as the final step
Language: Python - Size: 18.6 KB - Last synced at: over 1 year ago - Pushed at: almost 4 years ago - Stars: 1 - Forks: 0

teja-goud-kandula/StockDB
Building a database containing daily NSE stocks data
Language: Python - Size: 1.11 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

vvspearlvvs/MusicChatbot
AWS data pipeline development and a music recommendation chatbot
Language: Python - Size: 1.6 MB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 5 - Forks: 0

Cosmo-Tech/getting-started-with-data-injection
This sample demonstrates how to create Azure Digital Twins instances from your enterprise data to populate your simulation models.
Language: Shell - Size: 526 KB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 1

praveendecode/Data_Science_Projects
Executed diverse ML and NLP initiatives, implementing robust data pipelines, thorough data analysis, and seamless Docker projects. Demonstrated versatility in handling complex tasks, ensuring successful project outcomes
Size: 20.5 KB - Last synced at: 19 days ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

SarahMohammadiNejad/Reddit_Sentiment_Analysis
Language: Python - Size: 7.81 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

NVIDIA/go-tfdata
Go library that provides easy-to-use interfaces and tools for TensorFlow users, in particular allowing to train existing TF models on .tar and .tgz datasets
Language: Go - Size: 3.62 MB - Last synced at: 17 days ago - Pushed at: about 1 year ago - Stars: 14 - Forks: 3

asis-tobe/asis-tobe.github.io
AsIs-ToBe Public Website Repo
Language: HTML - Size: 8.73 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

soham7998/Data-Engineering-Youtube-End-to-End-Project
YouTube data engineering end-to-end pipeline for analyzing the data.
Language: Python - Size: 243 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

AviatorIfeanyi/etl_with_mage_ai
An ETL data pipeline that extracts data from source and loads it to destination, automated using mage.ai
Language: Python - Size: 246 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 2

frasermarlow/fake-stars
A single-file Dagster project for evaluating fake GitHub stars.
Language: Python - Size: 6.84 KB - Last synced at: over 1 year ago - Pushed at: about 2 years ago - Stars: 1 - Forks: 0

coder2j/dagster-tutorial
Dagster Tutorial to get you started with Dagster as an absolute Beginner. The tutorial covers various topics like Dagster Installation, Dagster Asset, Dagster Job, Dagster Scheduler, Dagster Ops, and more. It is completely free on YouTube and is beginner-friendly without any prerequisites.
Language: Python - Size: 4.88 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

Aditi-2512/E-commerce_Project
This project builds data models and deploys a database for an e-commerce store
Language: Jupyter Notebook - Size: 6.84 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

danigil/Pizza-Place-Pipeline
A data pipeline processing simulated events generated by a pizza restaurant chain, generating NRT statistics and persisting data.
Language: CSS - Size: 1.82 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

travisyardley/ailieful
A data pipeline project inspired by our Scottish Fold's recent trip to the vet!
Language: Python - Size: 467 KB - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

ankitanshumanmohapatra/Azure-Olympics-Analysis-Data-Engineering-End-to-End-Project
This is an End-to-End Azure Data Engineering Project | Analysis on the entire ETL Pipeline - Azure Data Factory, Azure Data Lake Gen2, Databricks, Azure Synapse Analytics & Dashboards
Language: Jupyter Notebook - Size: 4.92 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

emirhanayhan/KafkaDataPipeline
Python data pipeline boilerplate that takes advantage of a combination of multiprocessing, multithreading, and asynchronous programming
Language: Python - Size: 14.6 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

JESUSC1/Mining-Software-Repositories
Utilized Python, PyDriller, and the GitHub API to mine GitHub repositories, capturing commits, issues, and code sizes. Visualized patterns and computed repository metrics, offering a detailed perspective on software repository evolution and insights for stakeholders.
Language: Jupyter Notebook - Size: 48.2 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

Cosmo-Tech/azure-digital-twin-injector 📦
Data injection pipeline for Azure Digital Twin
Language: JavaScript - Size: 204 KB - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 2 - Forks: 1

VikramBansall/Cyber-Attack
Cyber Attack Prediction Model
Language: Jupyter Notebook - Size: 2.6 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

joaoallmeida/texas-traffic-incidents-etl
Data pipeline for batch processing using Texas traffic incident data.
Language: Python - Size: 1.42 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

tabdulazeez/Data-Warehouse
Building a Data Warehouse for Fudgemart Inc. by integrating data from two subsidiaries to support business intelligence
Language: Shell - Size: 1.09 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

HelloSongi/Spark-Structured-streaming-IoT-Weather-Sensors
A simulation to automatically collect weather data and visualize it on maps. Tech stack: Kafka, Spark Streaming, Cassandra, Tableau
Language: Scala - Size: 80.1 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

judeleonard/e-commerce_activity_tracking
This is an ELT data pipeline set up to track the activities of an e-commerce website based on orders, reviews, deliveries, and shipment dates. The project uses technologies such as Airflow, AWS RDS (Postgres), and Python.
Language: Python - Size: 596 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0
