An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: airflow-dags

vcovelli/sc-dash

End-to-end supply chain data pipeline using Airflow, MongoDB, and PostgreSQL with Django REST API for analytics.

Language: Python - Size: 249 KB - Last synced at: about 4 hours ago - Pushed at: about 5 hours ago - Stars: 1 - Forks: 0

marlonmoreira1/Skill-em-Dados

The goal of this project is to support the training of beginners who want to enter the data field by providing a data-driven view of the skills and knowledge most in demand in the market. By collecting and analyzing job and internship postings, the project aims to answer the question: "How do you become a data professional?"

Language: Python - Size: 87.6 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 1 - Forks: 0

Hippaho/Sparkify

A music streaming company, Sparkify, has decided that it is time to introduce more automation and monitoring to its data warehouse ETL pipelines and has concluded that the best tool to achieve this is Apache Airflow.

Language: Python - Size: 17.6 KB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 0 - Forks: 0

airflow-laminar/airflow-common-operators

Common Airflow Operators / Tasks

Language: Python - Size: 310 KB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 4 - Forks: 0

iobruno/data-engineering-examples

Data Engineering examples for Airflow, Prefect; dbt for BigQuery, Redshift, ClickHouse, Postgres, DuckDB; PySpark for Batch processing; Kafka for Stream processing

Language: Python - Size: 5.02 MB - Last synced at: 5 days ago - Pushed at: 3 months ago - Stars: 64 - Forks: 2

95xin/Premier-League-Data-Engineering-Project

Data Engineering Project - Premier League Datasets

Language: Python - Size: 0 Bytes - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 1 - Forks: 0

lukez42/pinterest-data-pipeline542

This project builds a scalable data pipeline for processing Pinterest-like data, integrating AWS services like Amazon Kinesis for streaming, Apache Kafka for batch processing, Apache Airflow for orchestration, and Databricks for data transformation.

Language: Jupyter Notebook - Size: 3.87 MB - Last synced at: 4 days ago - Pushed at: 11 days ago - Stars: 0 - Forks: 0

tulibraries/cob_datapipeline

Airflow Data Processing Pipeline for TUL Catalog on Blacklight Data

Language: Python - Size: 3.46 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 6 - Forks: 0

tulibraries/funcake_dags

Airflow DAGs for PA Digital aggregation processes

Language: Python - Size: 3.06 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 4 - Forks: 1

status-im/airflow-dags

Status BI python DAGs for Airflow

Language: Python - Size: 204 KB - Last synced at: about 4 hours ago - Pushed at: 13 days ago - Stars: 0 - Forks: 2

msaakaash/csv-datawarehouse-etl-airflow

An end-to-end ETL pipeline to extract sales data from CSV, transform and load into PostgreSQL warehouse. Automated and scheduled using Apache Airflow.

Language: Python - Size: 3.91 KB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 0 - Forks: 0

IwasakiYuuki/data-analysis-platform-airflow-dag

A collection of Airflow DAGs for automating data collection into our on-premises data analysis platform.

Language: Python - Size: 88.9 KB - Last synced at: 2 days ago - Pushed at: 14 days ago - Stars: 0 - Forks: 0

DunnBC22/Data_Engineering_Projects

This repository includes data engineering projects using Apache Airflow. I hope to add more projects using different technologies soon!

Language: TSQL - Size: 60.5 MB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 4 - Forks: 0

gestaogovbr/Ro-dou

DAG generator for Apache Airflow that produces clippings from the Diário Oficial da União (Brazil's official federal gazette).

Language: Python - Size: 3.96 MB - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 108 - Forks: 31

yuhexiong/airflow-dag-kafka-flink-doris-python

Language: Python - Size: 43 KB - Last synced at: 18 days ago - Pushed at: 18 days ago - Stars: 1 - Forks: 0

tashi-2004/Apache-Airflow-Kafka-Spark-DeltaLake-Real-Time-Stream-Pipeline

This project implements a real-time data pipeline using Apache Airflow, Kafka, Apache Spark, and Delta Lake. It supports both batch (Coldpath) and real-time (Hotpath) data ingestion, processing, and storage. Airflow is used for orchestrating the data workflows.

Language: Python - Size: 12.5 MB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 0 - Forks: 0

matsudan/airflow-dag-examples

Apache Airflow DAG examples

Language: Python - Size: 212 KB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 2 - Forks: 1

tulibraries/tulflow

TU Libraries Python Library for functions used in indexing ETL, particularly for Airflow

Language: Python - Size: 2.52 MB - Last synced at: 27 days ago - Pushed at: 27 days ago - Stars: 3 - Forks: 1

mohiteamit/MLops-lead-scoring-system

Automated machine learning pipeline with Airflow and model experiments using MLflow

Language: Jupyter Notebook - Size: 25 MB - Last synced at: 29 days ago - Pushed at: 29 days ago - Stars: 0 - Forks: 0

wcharczuk/temporalflow

A workflow to process graphs of activities in parallel.

Language: Go - Size: 44.9 KB - Last synced at: 29 days ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

krishna-aditi/nlp-sentiment-analysis-on-stock-news-and-price-monitoring

Web app that brings together text summarization and sentiment analysis of stock-related news to better understand stock price trends.

Language: Jupyter Notebook - Size: 3.07 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 15 - Forks: 6

krishna-aditi/Sevir-Lambda-APIs

SEVIR Lambda functions for Summarization and Named-Entity Recognition. Deployed on AWS ECR.

Language: Python - Size: 2.11 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

astronomer/astro-provider-databricks 📦

Orchestrate your Databricks notebooks in Airflow and execute them as Databricks Workflows

Language: Python - Size: 11.1 MB - Last synced at: about 1 month ago - Pushed at: 9 months ago - Stars: 23 - Forks: 12

nathadriele/dock-financial-data-pipelines

Automated pipeline for generating and processing Dock balance reports using Apache Airflow, SFTP, AWS S3, and Lambda.

Language: Python - Size: 11.7 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 7 - Forks: 0

micheldpd24/cust_review_analysis

End-to-End Review Analysis Pipeline: Scrape, process, and analyze customer reviews from Trustpilot using Apache Airflow and Python | Interactive Dash Dashboard: Visualize sentiment, trends, and textual insights with a dynamic, Dockerized Dash app.

Language: Jupyter Notebook - Size: 315 KB - Last synced at: 23 days ago - Pushed at: 23 days ago - Stars: 0 - Forks: 0

enricogoerlitz/explore-airflow

Apache Airflow project for orchestrating ETL workflows in a data warehouse environment. Implements a medallion architecture (Bronze, Silver, Gold layers) with Postgres integration for scalable and modular data processing pipelines.

Language: Python - Size: 1.33 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

micheldpd24/mlops_air_msr

MLOps Pipeline for Music Recommendation - Spotify playlist continuation

Language: Python - Size: 25.5 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

sangwanamit621/learning-and-experiments

Welcome to the Learning and Experiments Hub—a dynamic repository capturing my journey of exploration and experimentation in the vast world of technology. This space serves as a digital canvas where I document my learning process, experiments, and discoveries.

Language: Jupyter Notebook - Size: 53.7 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

sergio11/lyric_wave_architecture

🎵 LyricWave - Your AI Music Composer 🎶 Compose Unique MP4 Songs Effortlessly! LyricWave uses AI to create personalized music by harmonizing lyrics with captivating melodies and synthetic vocals. Unleash your musical creativity today! 🚀🎶

Language: Python - Size: 29 MB - Last synced at: about 1 month ago - Pushed at: about 2 months ago - Stars: 29 - Forks: 6

rahulsamant37/End-to-End-Flask-App

An end-to-end data science workflow that includes a Flask-based machine learning application, an automated ETL pipeline with Airflow for data extraction and transformation, and a comprehensive data analysis module using Pandas for statistical analysis, manipulation, and visualization.

Language: Jupyter Notebook - Size: 928 KB - Last synced at: about 1 month ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

lema-ufpb/xcom-cleaner

This project implements a pipeline to clean the history of XCOM variables stored by Airflow.

Language: Python - Size: 6.84 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

WilsonH918/Data_Pipeline_ETL_Pipeline_Web3_Token

Data Pipeline Project

Language: Python - Size: 17.6 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 1 - Forks: 0

Dina-Hosny/ETL-Data-Pipeline-using-AirFlow

An ETL data pipeline project that uses Airflow DAGs to extract employee data from PostgreSQL schemas, load it into an AWS data lake, transform it with a Python script, and finally load it into a Snowflake data warehouse using SCD Type 2.

Language: Python - Size: 89.8 KB - Last synced at: about 1 month ago - Pushed at: almost 2 years ago - Stars: 6 - Forks: 1

RayVilaca/mini-projeto-ETL-video-game-sales

This project is an ETL pipeline that uses Apache Airflow to orchestrate the extraction, transformation, and loading (ETL) of video game sales data. The data is extracted from Kaggle, transformed to clean and filter it, validated using Soda SQL, and then loaded into an S3 bucket.

Language: Python - Size: 8.79 KB - Last synced at: 2 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

spuerta10/MLOpsCreditRisk

Implementation of MLOps techniques for putting a credit risk model into production. Contains pipelines for training, validating, deploying, and monitoring the model, along with documentation on the infrastructure used and strategies for automating the model lifecycle.

Language: Jupyter Notebook - Size: 1.18 MB - Last synced at: about 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

AlvaroCavalcante/airflow-custom-deferrable-dataflow-operator

Start your Dataflow jobs execution directly from the Triggerer without going to the Worker!

Language: Python - Size: 36.1 KB - Last synced at: 8 days ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

potreic/ETL-Fashion-Trend-Analysis

✨ Automate fashion trend analysis with Apache Airflow! Extract data from X & Pinterest, transform into insights, and load into PostgreSQL. Predict seasonal styles & visualize trends. 💃📊

Language: Python - Size: 168 KB - Last synced at: about 1 month ago - Pushed at: 6 months ago - Stars: 2 - Forks: 0

Pirate-Emperor/BigData-Pipeline

BigData Pipeline is a local testing environment for experimenting with various storage solutions (RDB, HDFS), query engines (Trino), schedulers (Airflow), and ETL/ELT tools (DBT). It supports MySQL, Hadoop, Hive, Kudu, and more.

Language: Dockerfile - Size: 7.95 MB - Last synced at: about 1 month ago - Pushed at: 6 months ago - Stars: 3 - Forks: 0

Kaushik-Puttaswamy/Flight-Booking-Airflow-CI-CD

This project automates flight booking data processing using Apache Airflow, PySpark, and GCP with CI/CD integration for efficient deployment across development and production environments. It orchestrates data transformation, storage in BigQuery, and deployment via GitHub Actions.

Language: Python - Size: 439 KB - Last synced at: about 1 month ago - Pushed at: 3 months ago - Stars: 0 - Forks: 1

AhmetFurkanDEMIR/airflow-spark-kafka-example

Airflow, Spark and Kafka example

Language: Dockerfile - Size: 532 KB - Last synced at: 26 days ago - Pushed at: over 1 year ago - Stars: 6 - Forks: 0

Kaushik-Puttaswamy/Walmart-Sales-Data-Ingestion-and-Transformation-in-BigQuery-using-Airflow

An ETL pipeline that ingests Walmart sales data from Google Cloud Storage into BigQuery, automates table creation, and performs data transformation using SQL MERGE with Apache Airflow.

Language: Python - Size: 2.85 MB - Last synced at: about 1 month ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

mikeroyal/Apache-Airflow-Guide

Apache Airflow Guide

Language: Python - Size: 279 KB - Last synced at: 5 days ago - Pushed at: about 1 year ago - Stars: 28 - Forks: 12

kaushal07wick/FinSight

AI-powered analysis and vector search capabilities for JP Morgan Chase & Co. Earnings Call Transcripts.

Language: Python - Size: 6.36 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

kathisnehith/NYC311-requests-ETL-pipeline

An end-to-end ETL pipeline that processes NYC 311 service requests retrieved through an API for analysis.

Language: Jupyter Notebook - Size: 2.64 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

choval/devairflow

Simple local development airflow image

Size: 111 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0

An4PDM/SportStats-Pipeline

SportStats Pipeline is an ETL project that collects, processes, and stores daily sporting event data using the TheSportsDB API. The pipeline is orchestrated with Apache Airflow and stores the data in a MySQL database.

Language: Python - Size: 1.95 KB - Last synced at: 2 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

eea/eea-crawler

EEA Crawler contains the tasks (DAGs) used by Apache Airflow to index content from various EEA-Eionet websites into a central Elasticsearch (aka content hub).

Language: Python - Size: 488 KB - Last synced at: about 2 months ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0

abrahamkoloboe27/Random-User-Streaming-Pipeline

Data Engineering Project - Data Pipeline

Language: Jupyter Notebook - Size: 128 KB - Last synced at: about 2 months ago - Pushed at: 5 months ago - Stars: 2 - Forks: 0

dfds-data/dagcellent

Airflow DAG collection and utilities

Language: Python - Size: 1.06 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 1 - Forks: 0

arunvelsriram/dag-schedule-graph

Airflow plugin for visualising DAG schedules within a 24-hour window of a day.

Language: Python - Size: 1.02 MB - Last synced at: 13 days ago - Pushed at: over 4 years ago - Stars: 3 - Forks: 1

An4PDM/Aprendendo-a-criar-DAGs

The purpose of this repository is to organize all the DAGs I am creating for later reference.

Language: Python - Size: 10.7 KB - Last synced at: 2 months ago - Pushed at: 4 months ago - Stars: 1 - Forks: 0

Sahejpalarneja/news-data-pipeline

An ETL pipeline that scrapes web articles from various outlets. Orchestrated using Airflow on an Azure-based VM. Grafana is used for monitoring.

Language: Python - Size: 1.42 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

EssienCodeCraft/ETL_Pipeline_Airflow

ETL Pipeline Using Python3 & Airflow

Language: Python - Size: 1.95 KB - Last synced at: about 1 month ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

ryantanzr/Orchestrated-ETL

An orchestrated ETL pipeline with airflow and pandas

Language: Python - Size: 13.7 KB - Last synced at: about 1 month ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

mrigankomi/End-to-End-GCP-DE-project-spotify-analytics

End-to-End GCP DE project: spotify analytics

Language: Python - Size: 26.4 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

sajonaro/airflow-docker

a quick airflow + dbt + astronomer-cosmos in docker demo

Language: Python - Size: 115 KB - Last synced at: 7 days ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

HakeemSalaudeen/travel-agency-project

Extracting the data from the API | Writing the extracted data to the Data Lake | Extracting the final required attributes to Data Warehouse.

Language: Python - Size: 908 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

RareCompute/airflow-pipelines

Bioinformatics pipelines for Apache Airflow

Language: Python - Size: 3.91 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

angelagonzalezp/airflow-example-dags

Example DAGs

Language: Python - Size: 4.88 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 1 - Forks: 0

fajri-yanti/Data-Pipeline-SuperStore

An automated data pipeline that extracts retail data from MySQL and loads it into PostgreSQL.

Language: Python - Size: 3.91 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

AM-Ankitgit/DimondPricePrediction_With-Docker-MLFLOW-and-Dagshub

Language: Jupyter Notebook - Size: 83 MB - Last synced at: about 1 month ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

kishlayjeet/Twitter-Data-Pipeline-using-Airflow-and-AWS-S3

An end-to-end Twitter Data Pipeline that extracts data from Twitter and loads it into AWS S3.

Language: Python - Size: 20.5 KB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 13 - Forks: 6

RayanGAtech/HR-Roster-Change-Detection-Pipeline-with-Apache-Airflow-and-PostgreSQL

Automated data pipeline tailored for HR systems to process and analyze roster data. The pipeline leverages Apache Airflow to ingest and transform CSV files containing roster information, and PostgreSQL to store, deduplicate, and extract unique data points.

Language: Python - Size: 6.84 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

HakeemSalaudeen/CoreSentiments-project

This project is designed to reinforce my understanding of data pipeline orchestration with Apache Airflow. The focus is on implementing a data pipeline that addresses data ingestion, processing, storage, and analysis. The project challenges me to apply my knowledge of Apache Airflow to solve a practical, scenario-based problem.

Language: Python - Size: 7.81 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

IsmaheneLarbi/data_engineering_beginner

Creating a data pipeline to extract data from Spotify and save the songs listened to every day into a local SQLite database.

Language: Python - Size: 8.79 KB - Last synced at: 2 months ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

prneidhardt/Apache-Data-Pipeline

Sparkify project

Language: Jupyter Notebook - Size: 264 KB - Last synced at: 3 months ago - Pushed at: 6 months ago - Stars: 1 - Forks: 0

UDC-GAC/picsa

This repository gathers a set of projects to increase productivity through automation of serverless services in the cloud.

Language: HCL - Size: 1.02 MB - Last synced at: 6 months ago - Pushed at: 7 months ago - Stars: 3 - Forks: 0

Spock-Analytics/spock-airflow

A scalable Airflow-powered ETL pipeline designed for efficient extraction, transformation, and loading of data from Ethereum, Optimism, Arbitrum, Fantom, and Polygon blockchains.

Language: Python - Size: 215 KB - Last synced at: 4 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 1

raphaelmansuy/mwaa_cli

A simple Airflow MWAA CLI utility. It can be used to pause all the DAGs in an MWAA environment.

Language: Shell - Size: 123 KB - Last synced at: about 1 month ago - Pushed at: over 2 years ago - Stars: 6 - Forks: 3

yosrak5/Data-Streaming

This project involves the development of a robust data engineering pipeline that orchestrates the seamless ingestion, processing, and storage of data.

Language: Python - Size: 7.81 KB - Last synced at: 28 days ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

yosrak5/ETL-Twitter-Pipeline

Built an ETL data pipeline to extract data from Twitter, preprocess it, and load it into an AWS S3 bucket.

Size: 0 Bytes - Last synced at: about 1 month ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

chandulal/airflow-testing

Airflow Unit Tests and Integration Tests

Language: Python - Size: 465 KB - Last synced at: 6 months ago - Pushed at: over 2 years ago - Stars: 256 - Forks: 45

cristian-rincon/action-composer-sync

This is a simple action that helps you to fetch your Apache Airflow DAGs to Google Cloud Composer

Language: Shell - Size: 3.91 KB - Last synced at: 15 days ago - Pushed at: about 3 years ago - Stars: 5 - Forks: 0

AnthonyByansi/Airflow-Data-Pipeline-Automation

Automate your data pipelines using Apache Airflow with this ready-to-use DAG for data integration, ETL and workflow automation.

Size: 60 MB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 8 - Forks: 0

data-burst/airflow-git-sync

Sync DAG changes from Git to Airflow

Size: 122 KB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 43 - Forks: 7

kandarpagalas/internet-speed-monitoring-airflow

Airflow DAG to test internet speed and send alerts

Language: Python - Size: 7.81 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

AlenaLes/ETL

ETL task. Daily data export plus building and sending a new table to tabix.

Language: Python - Size: 12.7 KB - Last synced at: about 2 months ago - Pushed at: about 1 year ago - Stars: 2 - Forks: 0

k0rsakov/scd_dag_factory

DAG factory driven by an SCD table of configurations

Language: Python - Size: 25.4 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 1

k0rsakov/dag_factory

DAG factory

Language: Python - Size: 17.6 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 2 - Forks: 2

Sanjay-dev-ds/churn-data-analytics-data-pipeline

This project implements a scalable ETL pipeline using Amazon RDS for data storage, Amazon S3 for intermediate staging, and AWS Glue Crawler for metadata management. Data is efficiently queried through Amazon Redshift. Apache Airflow orchestrates the workflow, automating data extraction, loading, and transformation.

Language: Jupyter Notebook - Size: 1.38 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

VinkeArtunduaga/dataJobs_project

This ETL project makes it possible to analyze job postings in the data field in order to identify patterns and trends that reveal the skills most in demand in today's job market.

Language: HTML - Size: 1.35 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

igorlangoni/online_retail_data_pipeline

An end-to-end pipeline that ingests raw data from CSV files through Airflow DAGs into BigQuery. From there, it uses dbt to normalize and clean the data, then to apply transformations and produce relevant reports.

Language: Python - Size: 15.4 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

ohadmata/simple-dag-editor

Zero-configuration Airflow plugin that lets you manage your DAG files.

Language: Python - Size: 394 KB - Last synced at: 4 months ago - Pushed at: over 3 years ago - Stars: 38 - Forks: 3

ManoharVit/ECommerce-Dive-Deep-Sales-Analysis

In this project, we developed an ETL pipeline using Apache Airflow to process delivery data and track delayed shipments. The pipeline downloads data from an AWS S3 bucket, cleans it using Spark/Spark SQL to identify missing delivery deadlines, and uploads the cleaned dataset back to S3. This ensures efficient delivery performance tracking.

Language: Jupyter Notebook - Size: 134 MB - Last synced at: about 1 month ago - Pushed at: 10 months ago - Stars: 0 - Forks: 2

john-thuo1/popular_movies_etl

Airflow ETL with AWS, Docker and Postgres consuming TMDb API

Language: Python - Size: 11.7 KB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 1 - Forks: 0

nbdevs/RDWA

Real-time Data Warehousing with Airflow: An events based microservices pipeline.

Size: 14.6 KB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

KSwaviman/ETL_with_Airbyte

This project showcases an ELT pipeline that extracts JSON data, loads it into a PostgreSQL database, applies transformations using Python scripts, saves the transformed data in a CSV file, and shares it through a FastAPI endpoint.

Language: Python - Size: 10.7 KB - Last synced at: 10 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

DistilledCode/mmrl

Multi-Modal Representational Learning for Social Media Popularity Prediction

Language: Python - Size: 27.3 KB - Last synced at: 10 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

tulibraries/manifold_airflow_dags 📦

Airflow DAGs for the Manifold (TUL Website) application

Language: Python - Size: 1.21 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 2 - Forks: 0

jacksonpf1/spotify-user-analysis

ETL process and EDA of user top artists & tracks data in Spotify using Spotipy, Pandas, Airflow and Seaborn

Language: Jupyter Notebook - Size: 466 KB - Last synced at: 2 months ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0

AyorindeTayo/MLOPS-Airflow-MLflow-Docker1

Automation of MLflow experiment logging and prediction for Iris flower classes

Language: Jupyter Notebook - Size: 124 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

dzrekenathan/dbt-data-pipeline

A dbt data pipeline capstone project.

Language: Python - Size: 116 KB - Last synced at: about 1 month ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

Turnipdo/Real-Time-BTC-USD-Airflow-DAG-Extract-In-Excel

Using yfinance, we grab minute-by-minute BTC-USD data, dump it into PostgreSQL, and link Excel via ODBC for quick analysis!

Language: Python - Size: 56.6 KB - Last synced at: 30 days ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

bhavanachitragar/zillow-data-analytics

A Python script extracts data from Zillow and stores it in an initial S3 bucket. Then, Lambda functions handle the flow: copying the data to a processing bucket and transforming it from JSON to CSV format. The final CSV data resides in another S3 bucket, ready to be loaded into Amazon Redshift for in-depth analysis. QuickSight is used for visualizations.

Language: Python - Size: 66.4 KB - Last synced at: 2 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

kalyani33/twitter-etl-with-airflow

Orchestration of an ETL process with Apache Airflow

Language: Python - Size: 509 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

vitorjpc10/ETL-Pipeline--dbt--Snowflake--Airflow-

This project demonstrates how to build an ELT pipeline using dbt, Snowflake, and Airflow. Follow the steps below to set up your environment, configure dbt, create models, macros, tests, and deploy on Airflow.

Language: Python - Size: 86.9 KB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 0 - Forks: 0

Chinaskidev/ETL-Clima-ElSalvador

MLOps: building a simple ETL using Docker, Airflow, and Google Cloud

Language: Python - Size: 50.8 KB - Last synced at: 29 days ago - Pushed at: about 1 year ago - Stars: 3 - Forks: 1

adrianmarino/thesis-paper

Collaborative and hybrid recommendation systems

Language: Jupyter Notebook - Size: 353 MB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 0 - Forks: 1

xennen/DataEngineerYP

Data Engineer projects

Language: Python - Size: 513 KB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 0 - Forks: 0

rohithlanka/weatherdatapipeline

A project that uses Python scripts to fetch weather data from an API, transforms it using dbt in Snowflake, and orchestrates the workflow with Apache Airflow for seamless data integration into a reporting tool, ensuring streamlined data-driven insights. (Reporting tool: work in progress.)

Language: Python - Size: 11.7 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0