GitHub topics: datapipeline | Ecosyste.ms: Repos

tenzir/library

The Tenzir Community Library.

Size: 636 KB - Last synced at: 2 days ago - Pushed at: 3 days ago - Stars: 5 - Forks: 2

zhaoyachao/zdh_web

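Big data collection and extraction platform. zdh_web is the visual management platform for the zdh family of services, with modules for data collection, scheduling, permissions, approval workflows, and private-domain marketing.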

Language: Java - Size: 141 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 521 - Forks: 182

kkumyk/server-logs-daily-data-pipeline

A data engineering project with dbt, Docker, Kestra, Terraform, GCP and Looker.

Language: HCL - Size: 923 KB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 6 - Forks: 0

FiredKreeper/AdventureWorks-SplitPackage

This repository contains an SSIS package that splits employee data from the AdventureWorksDW2017 database into country-specific tables (United States, United Kingdom, Germany, and others). It demonstrates ETL processes using tools like Merge Join, Conditional Split, and OLE DB Destination for efficient data integration.

Size: 1000 Bytes - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 0 - Forks: 0

bcgov/nr-rfc-dischargeobs

data pipeline code to download / process hydrological observation data

Language: Python - Size: 118 MB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 0 - Forks: 0

Venkatanarayana-batthi/RetailSales_ETL_Fabric

ETL project using Microsoft Fabric with Data Pipelines, Notebooks, Delta Lake, and Lakehouse integration.

Language: Jupyter Notebook - Size: 671 KB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 0 - Forks: 0

jotstolu/Flight-Booking-Analytics-Data-Engineering-Project-using-Databricks-and-DBT

This project demonstrates an end-to-end data pipeline built using Databricks, Delta Live Tables, and DBT (Data Build Tool) to process, transform, and model flight booking data for advanced analytics and business intelligence.

Size: 901 KB - Last synced at: 20 days ago - Pushed at: 20 days ago - Stars: 0 - Forks: 0

jotstolu/Azure-Data-Engineering-End--to-End-Project

An end-to-end Netflix data engineering pipeline built on Microsoft Azure. This project ingests raw Netflix data, applies PySpark transformations, enforces data quality with Delta Live Tables, and orchestrates workflows via Azure Data Factory and Databricks.

Size: 5.49 MB - Last synced at: 20 days ago - Pushed at: 20 days ago - Stars: 0 - Forks: 0

jotstolu/Azure-Data-Engineering-End-to-End-Project---NYC-taxi-dataset

An end-to-end data engineering pipeline for NYC Green Taxi trip records, built on Microsoft Azure. This project ingests Jan–Dec 2024 Parquet files from the NYC Taxi API into a Bronze Delta Lake layer, cleans and enriches the data in a Silver layer with PySpark on Azure Databricks, then saves the transformed data to the Gold layer in Delta format.

Size: 1.69 MB - Last synced at: 20 days ago - Pushed at: 20 days ago - Stars: 0 - Forks: 0

YourDataArchitect/French-realestate-data-pipeline

This repository contains a fully automated data pipeline built with Apache Airflow to extract, clean, analyze, and report real estate listings from Seloger. It pushes data to MongoDB, Elasticsearch, and Google Sheets, with real-time Slack alerts for monitoring.

Language: Python - Size: 4.64 MB - Last synced at: 9 days ago - Pushed at: 20 days ago - Stars: 0 - Forks: 0

wri/gfw_forest_loss_geotrellis

Global Tree Cover Loss Analysis using GeoTrellis and Spark

Language: Scala - Size: 2.9 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 10 - Forks: 8

Deshan-Senanayake/Bird-Range-Prediction

This is a lightweight web application that allows users to predict bird presence, location, and the best time to observe birds based on machine learning models trained on real birdwatching data from the Hambantota District.

Language: Jupyter Notebook - Size: 20.5 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

Dashboard-Design/PythonETLPipelineProjects

Build your data engineering skills with Python ETL/ELT projects and warehousing courses.

Language: Python - Size: 11.9 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

ErdemOzgen/Data-Engineering-Roadmap

Roadmap for Data Engineering

Language: Java - Size: 1.98 MB - Last synced at: about 1 month ago - Pushed at: about 1 year ago - Stars: 228 - Forks: 30

vishnupriyan123/wedding-venues-data-pipeline

AI-powered wedding venue data pipeline — scraping, enrichment, and NLP-driven insights from Hitched UK. 🔄 Actively evolving – multi-sprint project in progress.

Language: Python - Size: 60.8 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 1 - Forks: 0

cloudposse/terraform-aws-efs-backup

Terraform module designed to easily back up EFS filesystems to S3 using DataPipeline

Language: HCL - Size: 3.91 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 44 - Forks: 33

WaylonWalker/kedro-action

A GitHub Action to lint, test, build-docs, package, and run your kedro pipelines. Supports any Python version you'll give it (that is also supported by pyenv).

Language: Shell - Size: 261 KB - Last synced at: about 20 hours ago - Pushed at: about 21 hours ago - Stars: 19 - Forks: 3

wri/gfw_forest_loss_geotrellis_arcpy_client

Arcpy client for GFW Forest Loss Analysis

Language: Python - Size: 138 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 2 - Forks: 1

Alireza-Akhavan/tf2-tutorial

Tensorflow 2 Tutorials (use tensorflow and keras in a better way!)

Language: Jupyter Notebook - Size: 14.4 MB - Last synced at: 18 days ago - Pushed at: about 1 year ago - Stars: 56 - Forks: 9

ContextData/VectorETL

Build super simple end-to-end data & ETL pipelines for your vector databases and Generative AI applications

Language: Python - Size: 296 KB - Last synced at: about 2 months ago - Pushed at: 10 months ago - Stars: 97 - Forks: 15

indix/sparkplug

Spark package to "plug" holes in data using SQL based rules ⚡️ 🔌

Language: Scala - Size: 503 KB - Last synced at: about 2 months ago - Pushed at: about 5 years ago - Stars: 29 - Forks: 2

NVIDIA/go-tfdata

Go library that provides easy-to-use interfaces and tools for TensorFlow users, in particular allowing existing TF models to be trained on .tar and .tgz datasets

Language: Go - Size: 3.62 MB - Last synced at: 10 days ago - Pushed at: over 1 year ago - Stars: 16 - Forks: 3

covalenthq/bsp-geth

Ethereum client written in Go, modified for full-hierarchy data exports and block specimen production

Language: Go - Size: 155 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 28 - Forks: 11

josephmachado/api_data_extract

Code for extracting data from API with Python

Language: Jupyter Notebook - Size: 85.9 KB - Last synced at: 7 days ago - Pushed at: 3 months ago - Stars: 3 - Forks: 0

nsiriwardhana/Vegetable-Price-Forecasting-Using-Time-Series-Models

Forecasting vegetable prices using economic indicators and time series models (VAR, VECM, ARDL).

Language: Jupyter Notebook - Size: 2.06 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

Kushalkhadka7/dagster_clickhouse_dbt

DBT and clickhouse test project with dagster

Language: Python - Size: 4.03 MB - Last synced at: 3 months ago - Pushed at: almost 2 years ago - Stars: 7 - Forks: 0

rishavvrajj/Data-Wharehouse

Data Warehouse Project: A structured Data Warehouse using Bronze, Silver, and Gold layers for efficient data ingestion, transformation, and analytics with SQL Server.

Language: TSQL - Size: 189 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 1 - Forks: 0

josephmachado/data_engineering_systems

How to quickly deliver data to business users?

Language: Jupyter Notebook - Size: 569 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 2 - Forks: 0

josephmachado/de_project

Step by step instructions to create a production-ready data pipeline

Language: Jupyter Notebook - Size: 4.31 MB - Last synced at: 3 months ago - Pushed at: 7 months ago - Stars: 44 - Forks: 12

josephmachado/beginner_de_project_stream

Simple stream processing pipeline

Language: Python - Size: 5.81 MB - Last synced at: 3 months ago - Pushed at: about 1 year ago - Stars: 100 - Forks: 31

cc59chong/Cleaning-Data-with-PySpark

Language: Jupyter Notebook - Size: 6.48 MB - Last synced at: about 1 month ago - Pushed at: over 2 years ago - Stars: 3 - Forks: 2

JasonZhangHub/CS5346CaseStudy

2420 CS5346 tableau case study data preparation pipeline

Language: Jupyter Notebook - Size: 58.9 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 1

cjunwon/Youtube-Data-Analysis

End-to-end YouTube data analysis project using the YouTube Data API, MySQL, AWS, and Flask

Language: HTML - Size: 4.44 MB - Last synced at: 3 days ago - Pushed at: 6 months ago - Stars: 3 - Forks: 0

srimantapal205/DataEngineerWireframeDesigns

Data Engineer Wireframe Designs are essential for planning and visualizing data pipelines, architecture, and workflows before implementation.

Size: 10.7 KB - Last synced at: about 2 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

yuhexiong/kafka-data-pipeline-flink-java

Data pipeline from Kafka to Kafka, Doris, MongoDB and Doris to Kafka using Flink Java.

Language: Java - Size: 70.3 KB - Last synced at: 4 months ago - Pushed at: 8 months ago - Stars: 2 - Forks: 0

KennethanCeyer/awesome-data-pipeline

Awesome list for datapipeline

Size: 200 KB - Last synced at: 25 days ago - Pushed at: over 2 years ago - Stars: 34 - Forks: 4

joaoblasques/Real-Time-Weather-Analysis-Data-Pipeline

A real-time weather analysis pipeline

Language: Python - Size: 14.6 KB - Last synced at: about 1 month ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

maxkrivich/debezium-data-masking-cdc-poc

Streaming data from PostgreSQL to Elasticsearch with masking sensitive data

Language: Dockerfile - Size: 7.81 KB - Last synced at: about 1 month ago - Pushed at: about 4 years ago - Stars: 1 - Forks: 0

myceliumAI/mycelium

A powerful platform designed to simplify the creation and management of Data Contracts, bridging systems for seamless data ingestion

Language: Python - Size: 2.51 MB - Last synced at: 24 days ago - Pushed at: about 1 month ago - Stars: 3 - Forks: 0

Kaushik-Puttaswamy/Car-Rental-Analytics-Pipeline-with-Airflow-Snowflake-and-GCP

This project automates car rental data ingestion using Apache Airflow for orchestration, Google Dataproc for PySpark-based processing, and Snowflake for data warehousing, leveraging GCS for storage. It provides a scalable, efficient pipeline for transforming raw data into analytics-ready insights.

Language: Python - Size: 1.72 MB - Last synced at: 4 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

thanaphongK37/Data-Science-and-Data-Analyst-Project

Portfolio of data analysis, data science, and data engineering projects built using Azure services, SQL, and Python.

Language: Jupyter Notebook - Size: 3.92 MB - Last synced at: 4 months ago - Pushed at: 5 months ago - Stars: 1 - Forks: 0

ryanninodizon/EHR-PoC

End-to-End Concept Project: Azure Data Factory Pipeline, Azure IoT Central, Azure Storage Account, Azure SQL, .NET, Angular

Language: TypeScript - Size: 42.4 MB - Last synced at: 4 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

Safwan2003/RandomForest_Heart_Disease_Prediction

A machine learning project using Random Forest Classifier to predict heart disease. Includes data preprocessing (with binning), feature selection, and model evaluation.

Language: Jupyter Notebook - Size: 4.86 MB - Last synced at: 4 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

akarce/elk-stack-mastery

A comprehensive project focusing on setting up and configuring the Elastic Stack (Elasticsearch, Logstash, and Kibana) for efficient log management and analytics. This project includes Elasticsearch configurations, Logstash pipelines, and Kibana visualizations, with detailed step-by-step documentation.

Size: 456 KB - Last synced at: 5 months ago - Pushed at: 9 months ago - Stars: 2 - Forks: 0

Sabab080/pyspark-etl-customer-sales

PySpark-based ETL pipeline that extracts transaction data from a MySQL database, cleans and transforms it, aggregates monthly sales per customer, and writes the processed data to an S3 bucket in Parquet format.

Language: Python - Size: 6.84 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 1 - Forks: 0

adilkhash/luigi-course-materials

Materials for the course "Introduction to Data Engineering: Data Pipelines"

Language: Python - Size: 17.6 KB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 11 - Forks: 5

Zeekersky/AdventureWorks-SplitPackage

This repository contains an SSIS package that splits employee data from the AdventureWorksDW2017 database into country-specific tables (United States, United Kingdom, Germany, and others). It demonstrates ETL processes using tools like Merge Join, Conditional Split, and OLE DB Destination for efficient data integration.

Size: 862 KB - Last synced at: 5 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

muhammadelfikry/Ecommerce-Data-Pipeline-PySpark

This project aims to develop a data pipeline using PySpark, designed to perform ETL processes, data transformation, and RFM analysis execution.

Language: Python - Size: 3.91 KB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

epsi10nvn/vn-job-data-crawler

Web scraping project using Scrapy and Selenium to gather job postings in Vietnam (VietnamWorks, TopCV, LinkedIn)

Language: Python - Size: 769 KB - Last synced at: 2 months ago - Pushed at: 7 months ago - Stars: 1 - Forks: 0

ibnufajar1994/elt-data-warehouse

Build and Orchestrate an ELT Data Pipeline Using Luigi

Language: Python - Size: 40.4 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

haajarsh/data-analytics-dojo

Collection of continuous learning and growth in the world of data analytics. Lifelong Learner! 🚀

Language: Jupyter Notebook - Size: 3.92 MB - Last synced at: 3 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

CS2219/Recommender_system

Recommendation system for Stock market

Language: Python - Size: 5.92 MB - Last synced at: 4 months ago - Pushed at: 8 months ago - Stars: 1 - Forks: 0

SURESHBEEKHANI/Wine-Quality-Prediction

This project involves the development of a complete ML pipeline with tracking and deployment capabilities.

Language: Jupyter Notebook - Size: 187 KB - Last synced at: 4 months ago - Pushed at: 8 months ago - Stars: 1 - Forks: 0

awillis/sluus

a micro batch processing pipeline

Language: Go - Size: 3.67 MB - Last synced at: 1 day ago - Pushed at: over 6 years ago - Stars: 0 - Forks: 0

divithraju/divith-raju-Immigration-Data-Engineering

A Capstone Project that covers several aspects of Data Engineering (Data Exploration, Cleaning, Modeling, Pipelining, Processing)

Language: Jupyter Notebook - Size: 2.5 MB - Last synced at: 2 months ago - Pushed at: over 2 years ago - Stars: 3 - Forks: 0

DataForgeOpenAIHub/.github

GitHub profile of this organization.

Size: 12.7 KB - Last synced at: 5 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

Master-Project-Hate-Speech/STITCHeD

A Python-based data tool for Integrating Hate Speech datasets with varying schemas.

Language: Jupyter Notebook - Size: 69.3 MB - Last synced at: 29 days ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

datahangar/datahangar

Network data pipeline stack (pmacct, kafka, DB...) in K8s

Language: Python - Size: 136 KB - Last synced at: about 1 month ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

zekeriyyaa/Building-A-Data-Pipeline-For-ROS-Compliant-Robotic-System-Via-Amazon-Web-Services

Language: Python - Size: 959 KB - Last synced at: 4 months ago - Pushed at: about 3 years ago - Stars: 3 - Forks: 2

rishabhpanda/streamliner_beta

Streamliner AI is an advanced AI-powered data cleaning web application, built with the Streamlit framework and integrated with GPT-4 for seamless, intuitive data processing.

Language: Python - Size: 199 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

divithraju/divith-raju-Customer-Sales-ETL-Pipeline

This ETL project demonstrates the development of a scalable data pipeline for customer sales analysis. It covers all essential steps, from data extraction to transformation and loading into a database, using Apache Airflow.

Language: Python - Size: 7.81 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 0

teckkean/GTFS-Data-Pipeline-TfNSW-Bus

GTFS Data Pipeline for TfNSW Bus Datasets

Language: Jupyter Notebook - Size: 12 MB - Last synced at: 4 months ago - Pushed at: over 2 years ago - Stars: 8 - Forks: 2

Smohanta23/Uber_Data-Engineering_ETL-Project

This project demonstrates a comprehensive data engineering workflow using the Uber information dataset. It covers the full spectrum of data engineering pipelines, from data transformation to deployment on Google Cloud, with a focus on creating a scalable and insightful data model.

Language: Jupyter Notebook - Size: 19.6 MB - Last synced at: 4 months ago - Pushed at: 12 months ago - Stars: 0 - Forks: 0

VisheshData/Automated-AQI-Traffic-Data-Ingestion

Free automated ingestion of AQI and traffic data. No cloud-service pipeline creation required.

Language: Python - Size: 10.7 KB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 0 - Forks: 0

bilalhameed248/Diagnosis-Effecting-Patients-Recovery-Detection

A DNN-based diagnosis impact detection model (Feb 2022 - Jun 2023).

Language: Jupyter Notebook - Size: 93.8 KB - Last synced at: 5 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

wri/gfw_pixetl

GFW ETL for raster tiles

Language: Python - Size: 1.6 MB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 5 - Forks: 1

sravanigodavarthi/Automated-ELT-Pipeline-AWS

An Apache Airflow data pipeline is designed to perform ELT operations, utilizing Amazon S3 and Amazon Redshift Serverless.

Language: Python - Size: 48 MB - Last synced at: about 2 months ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

ukokobili/data_aggregator

Automated data pipeline for cryptocurrency exchange analytics. Extracts data from multiple sources, processes it, and visualizes insights via a near real-time dashboard. Built with Python and Docker, featuring modular ETL, comprehensive logging, and automated testing.

Language: Python - Size: 516 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

deshk04/sid

Simple ETL tool for Salesforce

Language: JavaScript - Size: 9.29 MB - Last synced at: about 1 year ago - Pushed at: about 4 years ago - Stars: 0 - Forks: 1

julian-King22/etl_with_mage_ai

An ETL data pipeline that extracts data from source and loads it to destination, automated using mage.ai

Language: Python - Size: 246 KB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 6 - Forks: 5

behnamyazdan/PythonForDataEngineeringCourse

This course is designed to provide learners with the fundamental skills needed for data engineering using Python. The objective is to introduce anyone interested in the topic to Python's data engineering-related features.

Language: Python - Size: 1.41 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 5 - Forks: 0

mihirkudale/Twitter-Data-Pipeline-using-Airflow

This is an end-to-end data engineering project using Airflow and Python. In this project, we extract data using the Twitter API, use Python to transform the data, deploy the code on Airflow/EC2, and save the final result to Amazon S3.

Language: Python - Size: 5.86 KB - Last synced at: 4 months ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

essraahmed/Data-Pipeline-with-Airflow

Data Pipeline with Apache Airflow

Language: Python - Size: 444 KB - Last synced at: about 1 year ago - Pushed at: almost 3 years ago - Stars: 1 - Forks: 1

manoharpalanisamy/Image-Classification

Automatic extraction of images from the WhatsApp image folder or a custom folder

Language: Python - Size: 2.51 MB - Last synced at: about 1 year ago - Pushed at: over 5 years ago - Stars: 1 - Forks: 0

WaylonWalker/kedro-static-viz

kedro cli plugin for generating a static kedro viz site (html, css, js) that can be deployed on many serverless tools.

Language: Python - Size: 13.6 MB - Last synced at: 18 days ago - Pushed at: over 2 years ago - Stars: 27 - Forks: 2

huanwuji/teleporter

Distributed Reactive Streams data pipeline for data processing. Currently supports Kafka, JDBC, Kudu, Elasticsearch, HDFS, etc.

Language: Scala - Size: 740 KB - Last synced at: over 1 year ago - Pushed at: about 8 years ago - Stars: 8 - Forks: 1

eldhosejohn/aws_datapipeline_auto_load

Auto load datapipeline from console

Language: JavaScript - Size: 7.81 KB - Last synced at: over 1 year ago - Pushed at: over 7 years ago - Stars: 0 - Forks: 0

manyuzhang1996/Data-Pipeline-with-dbt-and-Snowflake

Data Pipeline Project with dbt and snowflake

Size: 6.84 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

amararyal/Co-Tags

Language: Python - Size: 582 KB - Last synced at: over 1 year ago - Pushed at: over 7 years ago - Stars: 2 - Forks: 0

ac223028/AlgoTrading

Trend following algorithm written in Golang.

Language: Go - Size: 23.7 MB - Last synced at: about 1 year ago - Pushed at: over 3 years ago - Stars: 2 - Forks: 0

parth2050/aws-data-pipeline

An end-to-end data pipeline integration from a website source to an analytical dashboard in AWS using Python Flask, ML models, DynamoDB, and other AWS services.

Language: HTML - Size: 8.79 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

rizkyirw/Pipeline-Project

Resource for ETL & Data Ingestion program using Apache Airflow

Language: Python - Size: 207 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

rizkyirw/Scraping-Pipeline

Scraping Pipeline using Orchestration Tools in Docker Environment

Language: Python - Size: 46.9 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

rashakil-ds/Methods-of-Advanced-Data-Engineering

Methods of Advanced Data Engineering is one of the courses in the Department of Data Science at the University of Erlangen (FAU)

Size: 4.88 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

madhura711/American-Airlines--Data-Mining-and-Variance-Analysis

Size: 2.23 MB - Last synced at: over 1 year ago - Pushed at: almost 7 years ago - Stars: 1 - Forks: 0

matthewhoung/quantmind

Taiwan's Financial market data pipeline project

Language: Python - Size: 5.86 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

denironyx/snowflake_dbt_lab

Snowflake and dbt labs

Size: 2.93 KB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 0

mundra-ankur/MSW_AI_Pipeline

Municipal solid waste (MSW) characterization: an AI and data pipeline to characterize solid waste in real time into different buckets using YOLO

Language: Python - Size: 41 KB - Last synced at: 4 months ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 0

kartik4949/TensorPipe

High-performance TensorFlow data pipeline with state-of-the-art augmentations and low-level optimizations.

Language: Python - Size: 183 KB - Last synced at: 4 months ago - Pushed at: over 3 years ago - Stars: 86 - Forks: 20

elau1004/ETLite

A lightweight framework to host your ETL data-pipeline

Language: Python - Size: 513 KB - Last synced at: 4 months ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 0

Tanay0510/Data-Pipeline-with-Airflow

Built Data Pipelines with Airflow. Created custom operators to perform tasks such as staging the data, filling the data warehouse, and running checks on the data as the final step

Language: Python - Size: 18.6 KB - Last synced at: over 1 year ago - Pushed at: about 4 years ago - Stars: 1 - Forks: 0

teja-goud-kandula/StockDB

Building a database containing daily NSE stocks data

Language: Python - Size: 1.11 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

vvspearlvvs/MusicChatbot

AWS data pipeline development and a music recommendation chatbot

Language: Python - Size: 1.6 MB - Last synced at: over 1 year ago - Pushed at: almost 4 years ago - Stars: 5 - Forks: 0

Cosmo-Tech/getting-started-with-data-injection

This sample demonstrates how to create Azure Digital Twins instances from your enterprise data to populate your simulation models.

Language: Shell - Size: 526 KB - Last synced at: over 1 year ago - Pushed at: almost 4 years ago - Stars: 0 - Forks: 1

praveendecode/Data_Science_Projects

Executed diverse ML and NLP initiatives, implementing robust data pipelines, thorough data analysis, and seamless Docker projects. Demonstrated versatility in handling complex tasks, ensuring successful project outcomes

Size: 20.5 KB - Last synced at: 4 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

SarahMohammadiNejad/Reddit_Sentiment_Analysis

Language: Python - Size: 7.81 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

asis-tobe/asis-tobe.github.io

AsIs-ToBe Public Website Repo

Language: HTML - Size: 8.73 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

soham7998/Data-Engineering-Youtube-End-to-End-Project

YouTube data engineering end-to-end pipeline for analyzing the data.

Language: Python - Size: 243 KB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

frasermarlow/fake-stars

A single-file Dagster project for evaluating fake GitHub stars.

Language: Python - Size: 6.84 KB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

coder2j/dagster-tutorial

Dagster Tutorial to get you started with Dagster as an absolute Beginner. The tutorial covers various topics like Dagster Installation, Dagster Asset, Dagster Job, Dagster Scheduler, Dagster Ops, and more. It is completely free on YouTube and is beginner-friendly without any prerequisites.

Language: Python - Size: 4.88 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0