An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: datapipeline

vishnupriyan123/wedding-venues-data-pipeline

AI-powered wedding venue data pipeline — scraping, enrichment, and NLP-driven insights from Hitched UK. 🔄 Actively evolving – multi-sprint project in progress.

Language: Python - Size: 50.7 MB - Last synced at: about 14 hours ago - Pushed at: about 15 hours ago - Stars: 0 - Forks: 0

FiredKreeper/AdventureWorks-SplitPackage

This repository contains an SSIS package that splits employee data from the AdventureWorksDW2017 database into country-specific tables (United States, United Kingdom, Germany, and others). It demonstrates ETL processes using tools like Merge Join, Conditional Split, and OLE DB Destination for efficient data integration.

Size: 1000 Bytes - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 0 - Forks: 0

Kushalkhadka7/dagster_clickhouse_dbt

DBT and clickhouse test project with dagster

Language: Python - Size: 4.03 MB - Last synced at: 3 days ago - Pushed at: over 1 year ago - Stars: 7 - Forks: 0

zhaoyachao/zdh_web

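Big data collection and extraction platform. zdh_web is the visual management platform for the zdh family of services, with modules for data collection, scheduling, permissions, approval workflows, private-domain marketing, and more.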

Language: Java - Size: 141 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 501 - Forks: 176

rishavvrajj/Data-Wharehouse

Data Warehouse Project: A structured Data Warehouse using Bronze, Silver, and Gold layers for efficient data ingestion, transformation, and analytics with SQL Server.

Language: TSQL - Size: 189 KB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 1 - Forks: 0

tenzir/library

The Tenzir Community Library.

Size: 575 KB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 5 - Forks: 2

josephmachado/data_engineering_systems

How to quickly deliver data to business users?

Language: Jupyter Notebook - Size: 569 KB - Last synced at: 18 days ago - Pushed at: 18 days ago - Stars: 2 - Forks: 0

josephmachado/de_project

Step by step instructions to create a production-ready data pipeline

Language: Jupyter Notebook - Size: 4.31 MB - Last synced at: 8 days ago - Pushed at: 4 months ago - Stars: 44 - Forks: 12

ErdemOzgen/Data-Engineering-Roadmap

Roadmap for Data Engineering

Language: Java - Size: 1.98 MB - Last synced at: 8 days ago - Pushed at: 10 months ago - Stars: 225 - Forks: 30

cloudposse/terraform-aws-efs-backup

Terraform module designed to easily back up EFS filesystems to S3 using DataPipeline

Language: HCL - Size: 3.91 MB - Last synced at: 15 days ago - Pushed at: 6 months ago - Stars: 44 - Forks: 33

wri/gfw_forest_loss_geotrellis

Global Tree Cover Loss Analysis using Geotrellis and SPARK

Language: Scala - Size: 2.83 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 10 - Forks: 8

wri/gfw_forest_loss_geotrellis_arcpy_client

Arcpy client for GFW Forest Loss Analysis

Language: Python - Size: 122 KB - Last synced at: 21 days ago - Pushed at: 21 days ago - Stars: 1 - Forks: 1

josephmachado/beginner_de_project_stream

Simple stream processing pipeline

Language: Python - Size: 5.81 MB - Last synced at: 8 days ago - Pushed at: 10 months ago - Stars: 100 - Forks: 31

Alireza-Akhavan/tf2-tutorial

Tensorflow 2 Tutorials (use tensorflow and keras in a better way!)

Language: Jupyter Notebook - Size: 14.4 MB - Last synced at: 16 days ago - Pushed at: 11 months ago - Stars: 54 - Forks: 9

JasonZhangHub/CS5346CaseStudy

2420 CS5346 tableau case study data preparation pipeline

Language: Jupyter Notebook - Size: 58.9 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 1

cjunwon/Youtube-Data-Analysis

End-to-end Youtube data analysis project using Youtube Data API, MySQL, AWS, Flask

Language: HTML - Size: 4.44 MB - Last synced at: 20 days ago - Pushed at: 3 months ago - Stars: 3 - Forks: 0

ContextData/VectorETL

Build super simple end-to-end data & ETL pipelines for your vector databases and Generative AI applications

Language: Python - Size: 296 KB - Last synced at: 14 days ago - Pushed at: 7 months ago - Stars: 91 - Forks: 10

srimantapal205/DataEngineerWireframeDesigns

Data Engineer Wireframe Designs are essential for planning and visualizing data pipelines, architecture, and workflows before implementation.

Size: 10.7 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

yuhexiong/kafka-data-pipeline-flink-java

Data pipeline from Kafka to Kafka, Doris, MongoDB and Doris to Kafka using Flink Java.

Language: Java - Size: 70.3 KB - Last synced at: 29 days ago - Pushed at: 5 months ago - Stars: 2 - Forks: 0

KennethanCeyer/awesome-data-pipeline

Awesome list for datapipeline

Size: 200 KB - Last synced at: 11 days ago - Pushed at: about 2 years ago - Stars: 34 - Forks: 4

tilakapash/Real-Time-Weather-Analysis-Data-Pipeline

Size: 6.84 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

maxkrivich/debezium-data-masking-cdc-poc

Streaming data from PostgreSQL to Elasticsearch with masking sensitive data

Language: Dockerfile - Size: 7.81 KB - Last synced at: about 1 month ago - Pushed at: almost 4 years ago - Stars: 1 - Forks: 0

covalenthq/bsp-geth

Ethereum client written in Go, modified for full-hierarchy data exports and block specimen production

Language: Go - Size: 152 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 28 - Forks: 10

myceliumAI/mycelium

A powerful platform designed to simplify the creation and management of Data Contracts, bridging systems for seamless data ingestion

Language: Python - Size: 2.51 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 3 - Forks: 0

kkumyk/server-logs-daily-data-pipeline

A data engineering project with dbt, Docker, Kestra, Terraform, GCP and Looker.

Language: HCL - Size: 755 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 1 - Forks: 0

Kaushik-Puttaswamy/Car-Rental-Analytics-Pipeline-with-Airflow-Snowflake-and-GCP

This project automates car rental data ingestion using Apache Airflow for orchestration, Google Dataproc for PySpark-based processing, and Snowflake for data warehousing, leveraging GCS for storage. It provides a scalable, efficient pipeline for transforming raw data into analytics-ready insights.

Language: Python - Size: 1.72 MB - Last synced at: 24 days ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

thanaphongK37/Data-Science-and-Data-Analyst-Project

Portfolio of data analysis, data science, and data engineering projects built using Azure services, SQL, and Python.

Language: Jupyter Notebook - Size: 3.92 MB - Last synced at: 25 days ago - Pushed at: 2 months ago - Stars: 1 - Forks: 0

ryanninodizon/EHR-PoC

End-to-End Concept Project: Azure Data Factory Pipeline, Azure IoT Central, Azure Storage Account, Azure SQL, .NET, Angular

Language: TypeScript - Size: 42.4 MB - Last synced at: 10 days ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

Safwan2003/RandomForest_Heart_Disease_Prediction

A machine learning project using Random Forest Classifier to predict heart disease. Includes data preprocessing (with binning), feature selection, and model evaluation.

Language: Jupyter Notebook - Size: 4.86 MB - Last synced at: 14 days ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

akarce/elk-stack-mastery

A comprehensive project focusing on setting up and configuring the Elastic Stack (Elasticsearch, Logstash, and Kibana) for efficient log management and analytics. This project includes Elasticsearch configurations, Logstash pipelines, and Kibana visualizations, with detailed step-by-step documentation.

Size: 456 KB - Last synced at: 2 months ago - Pushed at: 6 months ago - Stars: 2 - Forks: 0

Sabab080/pyspark-etl-customer-sales

PySpark-based ETL pipeline that extracts transaction data from a MySQL database, cleans and transforms it, aggregates monthly sales per customer, and writes the processed data to an S3 bucket in Parquet format.

Language: Python - Size: 6.84 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 1 - Forks: 0

WaylonWalker/kedro-action

A GitHub Action to lint, test, build-docs, package, and run your kedro pipelines. Supports any Python version you'll give it (that is also supported by pyenv).

Language: Shell - Size: 254 KB - Last synced at: 7 days ago - Pushed at: 2 months ago - Stars: 20 - Forks: 3

adilkhash/luigi-course-materials

Materials for the course Introduction to Data Engineering: Data Pipelines

Language: Python - Size: 17.6 KB - Last synced at: 17 days ago - Pushed at: about 1 year ago - Stars: 11 - Forks: 5

Zeekersky/AdventureWorks-SplitPackage

This repository contains an SSIS package that splits employee data from the AdventureWorksDW2017 database into country-specific tables (United States, United Kingdom, Germany, and others). It demonstrates ETL processes using tools like Merge Join, Conditional Split, and OLE DB Destination for efficient data integration.

Size: 862 KB - Last synced at: about 1 month ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

muhammadelfikry/Ecommerce-Data-Pipeline-PySpark

This project aims to develop a data pipeline using PySpark, designed to perform ETL processes, data transformation, and RFM analysis execution.

Language: Python - Size: 3.91 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

epsi10nvn/vn-job-data-crawler

Web scraping project using Scrapy and Selenium to gather job postings in Vietnam (vietnamworks, topcv, linkedin)

Language: Python - Size: 769 KB - Last synced at: 2 months ago - Pushed at: 4 months ago - Stars: 1 - Forks: 0

ibnufajar1994/elt-data-warehouse

Build and Orchestrate an ELT Data Pipeline Using Luigi

Language: Python - Size: 40.4 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

bcgov/nr-rfc-dischargeobs

data pipeline code to download / process hydrological observation data

Language: Python - Size: 118 MB - Last synced at: 4 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

hajarshu/data-analytics-dojo

Collection of continuous learning and growth in the world of data analytics. Lifelong Learner! 🚀

Language: Jupyter Notebook - Size: 3.92 MB - Last synced at: 2 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

CS2219/Recommender_system

Recommendation system for Stock market

Language: Python - Size: 5.92 MB - Last synced at: 21 days ago - Pushed at: 5 months ago - Stars: 1 - Forks: 0

SURESHBEEKHANI/Wine-Quality-Prediction

This project involves the development of a complete ML pipeline with tracking and deployment capabilities.

Language: Jupyter Notebook - Size: 187 KB - Last synced at: 26 days ago - Pushed at: 5 months ago - Stars: 1 - Forks: 0

awillis/sluus

a micro batch processing pipeline

Language: Go - Size: 3.67 MB - Last synced at: 4 days ago - Pushed at: about 6 years ago - Stars: 0 - Forks: 0

divithraju/divith-raju-Immigration-Data-Engineering

A Capstone Project that covers several aspects of Data Engineering (Data Exploration, Cleaning, Modeling, Pipelining, Processing)

Language: Jupyter Notebook - Size: 2.5 MB - Last synced at: 2 months ago - Pushed at: over 2 years ago - Stars: 3 - Forks: 0

DataForgeOpenAIHub/.github

GitHub profile of this organization.

Size: 12.7 KB - Last synced at: 2 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

Master-Project-Hate-Speech/STITCHeD

A Python-based data tool for Integrating Hate Speech datasets with varying schemas.

Language: Jupyter Notebook - Size: 69.3 MB - Last synced at: about 6 hours ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

zekeriyyaa/Building-A-Data-Pipeline-For-ROS-Compliant-Robotic-System-Via-Amazon-Web-Services

Language: Python - Size: 959 KB - Last synced at: 12 days ago - Pushed at: almost 3 years ago - Stars: 3 - Forks: 2

rishabhpanda/streamliner_beta

Streamliner AI is an advanced AI-powered data cleaning web application, built with the Streamlit framework and integrated with GPT-4 for seamless, intuitive data processing.

Language: Python - Size: 199 MB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

divithraju/divith-raju-Customer-Sales-ETL-Pipeline

This ETL project was designed to demonstrate the development of a scalable data pipeline for customer sales analysis. It covers all essential steps, from data extraction to transformation and loading into a database, with Apache Airflow used.

Language: Python - Size: 7.81 KB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 1 - Forks: 0

teckkean/GTFS-Data-Pipeline-TfNSW-Bus

GTFS Data Pipeline for TfNSW Bus Datasets

Language: Jupyter Notebook - Size: 12 MB - Last synced at: 9 days ago - Pushed at: over 2 years ago - Stars: 8 - Forks: 2

Smohanta23/Uber_Data-Engineering_ETL-Project

This project demonstrates a comprehensive data engineering workflow using the Uber information dataset. It covers the full spectrum of data engineering pipelines, from data transformation to deployment on Google Cloud, with a focus on creating a scalable and insightful data model.

Language: Jupyter Notebook - Size: 19.6 MB - Last synced at: about 1 month ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

VisheshData/Automated-AQI-Traffic-Data-Ingestion

Automated-AQI-Traffic-Data Ingestion for free. No cloud service pipeline creation required.

Language: Python - Size: 10.7 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

bilalhameed248/Diagnosis-Effecting-Patients-Recovery-Detection

A DNN Based Diagnosis Impact Detection Model. - Feb 2022 - Jun 2023

Language: Jupyter Notebook - Size: 93.8 KB - Last synced at: about 2 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

wri/gfw_pixetl

GFW ETL for raster tiles

Language: Python - Size: 1.6 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 5 - Forks: 1

sravanigodavarthi/Automated-ELT-Pipeline-AWS

An Apache Airflow data pipeline is designed to perform ELT operations, utilizing Amazon S3 and Amazon Redshift Serverless.

Language: Python - Size: 48 MB - Last synced at: 2 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

ukokobili/data_aggregator

Automated data pipeline for cryptocurrency exchange analytics. Extracts data from multiple sources, processes it, and visualizes insights via a near real-time dashboard. Built with Python and Docker, featuring modular ETL, comprehensive logging, and automated testing.

Language: Python - Size: 516 KB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

deshk04/sid

Simple ETL tool for Salesforce

Language: JavaScript - Size: 9.29 MB - Last synced at: 11 months ago - Pushed at: almost 4 years ago - Stars: 0 - Forks: 1

behnamyazdan/PythonForDataEngineeringCourse

This course is designed to provide learners with the fundamental skills needed for data engineering using Python. The objective is to introduce anyone interested in the topic to Python's data engineering-related features.

Language: Python - Size: 1.41 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 5 - Forks: 0

mihirkudale/Twitter-Data-Pipeline-using-Airflow

This is an End-To-End Data Engineering Project using Airflow and Python. In this project, we will extract data using the Twitter API, use Python to transform the data, deploy the code on Airflow/EC2, and save the final result on Amazon S3.

Language: Python - Size: 5.86 KB - Last synced at: about 1 month ago - Pushed at: 12 months ago - Stars: 0 - Forks: 0

essraahmed/Data-Pipeline-with-Airflow

Data Pipeline with Apache Airflow

Language: Python - Size: 444 KB - Last synced at: 12 months ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 1

manoharpalanisamy/Image-Classification

Automatic extraction of images from the WhatsApp image folder or a custom folder

Language: Python - Size: 2.51 MB - Last synced at: 12 months ago - Pushed at: over 5 years ago - Stars: 1 - Forks: 0

WaylonWalker/kedro-static-viz

kedro cli plugin for generating a static kedro viz site (html, css, js) that can be deployed on many serverless tools.

Language: Python - Size: 13.6 MB - Last synced at: 6 days ago - Pushed at: over 2 years ago - Stars: 27 - Forks: 2

huanwuji/teleporter

Reactive Streams distributed data pipeline for data processing. Now supports Kafka, JDBC, Kudu, Elasticsearch, HDFS, etc.

Language: Scala - Size: 740 KB - Last synced at: about 1 year ago - Pushed at: almost 8 years ago - Stars: 8 - Forks: 1

eldhosejohn/aws_datapipeline_auto_load

Auto load datapipeline from console

Language: JavaScript - Size: 7.81 KB - Last synced at: about 1 year ago - Pushed at: over 7 years ago - Stars: 0 - Forks: 0

manyuzhang1996/Data-Pipeline-with-dbt-and-Snowflake

Data Pipeline Project with dbt and snowflake

Size: 6.84 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

amararyal/Co-Tags

Language: Python - Size: 582 KB - Last synced at: about 1 year ago - Pushed at: about 7 years ago - Stars: 2 - Forks: 0

ac223028/AlgoTrading

Trend following algorithm written in Golang.

Language: Go - Size: 23.7 MB - Last synced at: 10 months ago - Pushed at: about 3 years ago - Stars: 2 - Forks: 0

parth2050/aws-data-pipeline

An End-To-End data pipeline integration from Website Source to analytical dashboard in AWS using Python flask, ML models, DynamoDB and other AWS services.

Language: HTML - Size: 8.79 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

rizkyirw/Pipeline-Project

Resource for ETL & Data Ingestion program using Apache Airflow

Language: Python - Size: 207 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

rizkyirw/Scraping-Pipeline

Scraping Pipeline using Orchestration Tools in Docker Environment

Language: Python - Size: 46.9 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

rashakil-ds/Methods-of-Advanced-Data-Engineering

Methods of Advanced Data Engineering is one of the courses in the department of DATA SCIENCE at the University of Erlangen (FAU)

Size: 4.88 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

madhura711/American-Airlines--Data-Mining-and-Variance-Analysis

Size: 2.23 MB - Last synced at: about 1 year ago - Pushed at: over 6 years ago - Stars: 1 - Forks: 0

matthewhoung/quantmind

Taiwan's Financial market data pipeline project

Language: Python - Size: 5.86 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

denironyx/snowflake_dbt_lab

Snowflake and dbt labs

Size: 2.93 KB - Last synced at: about 1 year ago - Pushed at: about 3 years ago - Stars: 1 - Forks: 0

mundra-ankur/MSW_AI_Pipeline

Municipal solid waste (MSW) characterization: an AI and data pipeline to characterize solid waste in real time into different buckets using YOLO

Language: Python - Size: 41 KB - Last synced at: 12 days ago - Pushed at: almost 3 years ago - Stars: 1 - Forks: 0

kartik4949/TensorPipe

High Performance Tensorflow Data Pipeline with State of Art Augmentations and low level optimizations.

Language: Python - Size: 183 KB - Last synced at: 21 days ago - Pushed at: about 3 years ago - Stars: 86 - Forks: 20

elau1004/ETLite

A lightweight framework to host your ETL data-pipeline

Language: Python - Size: 513 KB - Last synced at: 14 days ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 0

Tanay0510/Data-Pipeline-with-Airflow

Built Data Pipelines with Airflow. Created custom operators to perform tasks such as staging the data, filling the data warehouse, and running checks on the data as the final step

Language: Python - Size: 18.6 KB - Last synced at: over 1 year ago - Pushed at: almost 4 years ago - Stars: 1 - Forks: 0

teja-goud-kandula/StockDB

Building a database containing daily NSE stocks data

Language: Python - Size: 1.11 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

vvspearlvvs/MusicChatbot

AWS data pipeline development and a music recommendation chatbot

Language: Python - Size: 1.6 MB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 5 - Forks: 0

Cosmo-Tech/getting-started-with-data-injection

This sample demonstrates how to create Azure Digital Twins instances from your enterprise data to populate your simulation models.

Language: Shell - Size: 526 KB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 1

praveendecode/Data_Science_Projects

Executed diverse ML and NLP initiatives, implementing robust data pipelines, thorough data analysis, and seamless Docker projects. Demonstrated versatility in handling complex tasks, ensuring successful project outcomes

Size: 20.5 KB - Last synced at: 19 days ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

SarahMohammadiNejad/Reddit_Sentiment_Analysis

Language: Python - Size: 7.81 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

NVIDIA/go-tfdata

Go library that provides easy-to-use interfaces and tools for TensorFlow users, in particular allowing existing TF models to be trained on .tar and .tgz datasets

Language: Go - Size: 3.62 MB - Last synced at: 17 days ago - Pushed at: about 1 year ago - Stars: 14 - Forks: 3

asis-tobe/asis-tobe.github.io

AsIs-ToBe Public Website Repo

Language: HTML - Size: 8.73 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

soham7998/Data-Engineering-Youtube-End-to-End-Project

Youtube Data Engineering End to End pipeline for analyzing the data.

Language: Python - Size: 243 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

AviatorIfeanyi/etl_with_mage_ai

An ETL data pipeline that extracts data from source and loads it to destination, automated using mage.ai

Language: Python - Size: 246 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 2

frasermarlow/fake-stars

A single-file Dagster project for evaluating fake GitHub stars.

Language: Python - Size: 6.84 KB - Last synced at: over 1 year ago - Pushed at: about 2 years ago - Stars: 1 - Forks: 0

coder2j/dagster-tutorial

Dagster Tutorial to get you started with Dagster as an absolute Beginner. The tutorial covers various topics like Dagster Installation, Dagster Asset, Dagster Job, Dagster Scheduler, Dagster Ops, and more. It is completely free on YouTube and is beginner-friendly without any prerequisites.

Language: Python - Size: 4.88 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

Aditi-2512/E-commerce_Project

This project builds data models and deploys a database for an e-commerce store

Language: Jupyter Notebook - Size: 6.84 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

danigil/Pizza-Place-Pipeline

A data pipeline processing simulated events generated by a pizza restaurant chain, generating NRT statistics and persisting data.

Language: CSS - Size: 1.82 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

travisyardley/ailieful

A data pipeline project inspired by our Scottish Fold's recent trip to the vet!

Language: Python - Size: 467 KB - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

ankitanshumanmohapatra/Azure-Olympics-Analysis-Data-Engineering-End-to-End-Project

This is an End-to-End Azure Data Engineering Project | Analysis on the entire ETL Pipeline - Azure Factory, Azure Lake Gen 2, Databricks, Azure Synapse Analytics & Dashboards

Language: Jupyter Notebook - Size: 4.92 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

emirhanayhan/KafkaDataPipeline

Python data pipeline boilerplate that takes advantage of combining multiprocessing, multithreading, and asynchronous programming

Language: Python - Size: 14.6 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

JESUSC1/Mining-Software-Repositories

Utilized Python, PyDriller, and the GitHub API to mine GitHub repositories, capturing commits, issues, and code sizes. Visualized patterns and computed repository metrics, offering a detailed perspective on software repository evolution and insights for stakeholders.

Language: Jupyter Notebook - Size: 48.2 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

Cosmo-Tech/azure-digital-twin-injector 📦

Data injection pipeline for Azure Digital Twin

Language: JavaScript - Size: 204 KB - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 2 - Forks: 1

VikramBansall/Cyber-Attack

Cyber Attack Prediction Model

Language: Jupyter Notebook - Size: 2.6 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

joaoallmeida/texas-traffic-incidents-etl

Data pipeline for batch processing using Texas traffic incident data.

Language: Python - Size: 1.42 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

tabdulazeez/Data-Warehouse

Building a Data Warehouse for Fudgemart Inc. by Integrating Data from two Subsidiaries to support Business Intelligence

Language: Shell - Size: 1.09 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

HelloSongi/Spark-Structured-streaming-IoT-Weather-Sensors

A simulation to automatically collect weather data and visualize it on maps. Tech stack: Kafka, Spark Streaming, Cassandra, Tableau

Language: Scala - Size: 80.1 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

judeleonard/e-commerce_activity_tracking

This is an ELT data pipeline setup to track the activities of an e-commerce website based on orders, reviews, deliveries and shipment date. This project utilized technologies like Airflow, AWS RDS-Postgres, Python etc.

Language: Python - Size: 596 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0