GitHub topics: airflow-dags
vcovelli/sc-dash
End-to-end supply chain data pipeline using Airflow, MongoDB, and PostgreSQL with Django REST API for analytics.
Language: Python - Size: 249 KB - Last synced at: about 4 hours ago - Pushed at: about 5 hours ago - Stars: 1 - Forks: 0

marlonmoreira1/Skill-em-Dados
The goal of this project is to support beginners who want to enter the data field by providing a data-driven view of the skills and knowledge most in demand in the market. By collecting and analyzing job and internship postings, the project aims to answer the question: "How do you become a data professional?"
Language: Python - Size: 87.6 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 1 - Forks: 0

Hippaho/Sparkify
A music streaming company, Sparkify, has decided that it is time to introduce more automation and monitoring to its data warehouse ETL pipelines and has concluded that the best tool to achieve this is Apache Airflow.
Language: Python - Size: 17.6 KB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 0 - Forks: 0

airflow-laminar/airflow-common-operators
Common Airflow Operators / Tasks
Language: Python - Size: 310 KB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 4 - Forks: 0

iobruno/data-engineering-examples
Data Engineering examples for Airflow, Prefect; dbt for BigQuery, Redshift, ClickHouse, Postgres, DuckDB; PySpark for Batch processing; Kafka for Stream processing
Language: Python - Size: 5.02 MB - Last synced at: 5 days ago - Pushed at: 3 months ago - Stars: 64 - Forks: 2

95xin/Premier-League-Data-Engineering-Project
Data Engineering Project - Premier League Datasets
Language: Python - Size: 0 Bytes - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 1 - Forks: 0

lukez42/pinterest-data-pipeline542
This project builds a scalable data pipeline for processing Pinterest-like data, integrating AWS services like Amazon Kinesis for streaming, Apache Kafka for batch processing, Apache Airflow for orchestration, and Databricks for data transformation.
Language: Jupyter Notebook - Size: 3.87 MB - Last synced at: 4 days ago - Pushed at: 11 days ago - Stars: 0 - Forks: 0

tulibraries/cob_datapipeline
Airflow Data Processing Pipeline for TUL Catalog on Blacklight Data
Language: Python - Size: 3.46 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 6 - Forks: 0

tulibraries/funcake_dags
Airflow DAGs for PA Digital aggregation processes
Language: Python - Size: 3.06 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 4 - Forks: 1

status-im/airflow-dags
Status BI python DAGs for Airflow
Language: Python - Size: 204 KB - Last synced at: about 4 hours ago - Pushed at: 13 days ago - Stars: 0 - Forks: 2

msaakaash/csv-datawarehouse-etl-airflow
An end-to-end ETL pipeline to extract sales data from CSV, transform and load into PostgreSQL warehouse. Automated and scheduled using Apache Airflow.
Language: Python - Size: 3.91 KB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 0 - Forks: 0

IwasakiYuuki/data-analysis-platform-airflow-dag
A collection of Airflow DAGs for automating data collection into our on-premises data analysis platform.
Language: Python - Size: 88.9 KB - Last synced at: 2 days ago - Pushed at: 14 days ago - Stars: 0 - Forks: 0

DunnBC22/Data_Engineering_Projects
This repository includes data engineering projects using Apache Airflow. I hope to add more projects using different technologies soon!
Language: TSQL - Size: 60.5 MB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 4 - Forks: 0

gestaogovbr/Ro-dou
DAG generator for Apache Airflow that produces clippings from the Diário Oficial da União.
Language: Python - Size: 3.96 MB - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 108 - Forks: 31

yuhexiong/airflow-dag-kafka-flink-doris-python
Language: Python - Size: 43 KB - Last synced at: 18 days ago - Pushed at: 18 days ago - Stars: 1 - Forks: 0

tashi-2004/Apache-Airflow-Kafka-Spark-DeltaLake-Real-Time-Stream-Pipeline
This project implements a real-time data pipeline using Apache Airflow, Kafka, Apache Spark, and Delta Lake. It supports both batch (Coldpath) and real-time (Hotpath) data ingestion, processing, and storage. Airflow is used for orchestrating the data workflows.
Language: Python - Size: 12.5 MB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 0 - Forks: 0

matsudan/airflow-dag-examples
Apache Airflow DAG examples
Language: Python - Size: 212 KB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 2 - Forks: 1

tulibraries/tulflow
TU Libraries Python Library for functions used in indexing ETL, particularly for Airflow
Language: Python - Size: 2.52 MB - Last synced at: 27 days ago - Pushed at: 27 days ago - Stars: 3 - Forks: 1

mohiteamit/MLops-lead-scoring-system
Automated machine learning pipeline with Airflow and model experiments using MLflow
Language: Jupyter Notebook - Size: 25 MB - Last synced at: 29 days ago - Pushed at: 29 days ago - Stars: 0 - Forks: 0

wcharczuk/temporalflow
A workflow to process graphs of activities in parallel.
Language: Go - Size: 44.9 KB - Last synced at: 29 days ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

krishna-aditi/nlp-sentiment-analysis-on-stock-news-and-price-monitoring
WebApp to bring together Text Summarization and Sentiment Analysis of the stock related news to better understand the stock price trends.
Language: Jupyter Notebook - Size: 3.07 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 15 - Forks: 6

krishna-aditi/Sevir-Lambda-APIs
SEVIR Lambda functions for Summarization and Named-Entity Recognition. Deployed on AWS ECR.
Language: Python - Size: 2.11 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

astronomer/astro-provider-databricks 📦
Orchestrate your Databricks notebooks in Airflow and execute them as Databricks Workflows
Language: Python - Size: 11.1 MB - Last synced at: about 1 month ago - Pushed at: 9 months ago - Stars: 23 - Forks: 12

nathadriele/dock-financial-data-pipelines
Automated pipeline for generating and processing Dock balance reports using Apache Airflow, SFTP, AWS S3, and Lambda.
Language: Python - Size: 11.7 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 7 - Forks: 0

micheldpd24/cust_review_analysis
End-to-End Review Analysis Pipeline : Scrape, process, and analyze customer reviews from Trustpilot using Apache Airflow and Python | Interactive Dash Dashboard : Visualize sentiment, trends, and textual insights with a dynamic, Dockerized Dash app.
Language: Jupyter Notebook - Size: 315 KB - Last synced at: 23 days ago - Pushed at: 23 days ago - Stars: 0 - Forks: 0

enricogoerlitz/explore-airflow
Apache Airflow project for orchestrating ETL workflows in a data warehouse environment. Implements a medallion architecture (Bronze, Silver, Gold layers) with Postgres integration for scalable and modular data processing pipelines.
Language: Python - Size: 1.33 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

micheldpd24/mlops_air_msr
MLOps Pipeline for Music Recommendation - Spotify playlist continuation
Language: Python - Size: 25.5 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

sangwanamit621/learning-and-experiments
Welcome to the Learning and Experiments Hub—a dynamic repository capturing my journey of exploration and experimentation in the vast world of technology. This space serves as a digital canvas where I document my learning process, experiments, and discoveries.
Language: Jupyter Notebook - Size: 53.7 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

sergio11/lyric_wave_architecture
🎵 LyricWave - Your AI Music Composer 🎶 Compose Unique MP4 Songs Effortlessly! LyricWave uses AI to create personalized music by harmonizing lyrics with captivating melodies and synthetic vocals. Unleash your musical creativity today! 🚀🎶
Language: Python - Size: 29 MB - Last synced at: about 1 month ago - Pushed at: about 2 months ago - Stars: 29 - Forks: 6

rahulsamant37/End-to-End-Flask-App
An end-to-end data science workflow that includes a Flask-based machine learning application, an automated ETL pipeline with Airflow for data extraction and transformation, and a comprehensive data analysis module using Pandas for statistical analysis, manipulation, and visualization.
Language: Jupyter Notebook - Size: 928 KB - Last synced at: about 1 month ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

lema-ufpb/xcom-cleaner
This project implements a pipeline to clean the history of XCOM variables stored by Airflow.
Language: Python - Size: 6.84 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

WilsonH918/Data_Pipeline_ETL_Pipeline_Web3_Token
Data Pipeline Project
Language: Python - Size: 17.6 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 1 - Forks: 0

Dina-Hosny/ETL-Data-Pipeline-using-AirFlow
An ETL data pipeline project that uses Airflow DAGs to extract employees' data from PostgreSQL schemas, load it into an AWS data lake, transform it with a Python script, and finally load it into a Snowflake data warehouse using SCD type 2.
Language: Python - Size: 89.8 KB - Last synced at: about 1 month ago - Pushed at: almost 2 years ago - Stars: 6 - Forks: 1

RayVilaca/mini-projeto-ETL-video-game-sales
This project is an ETL pipeline that uses Apache Airflow to orchestrate the extraction, transformation, and loading (ETL) of video game sales data. The data is extracted from Kaggle, transformed to clean and filter it, validated using Soda SQL, and then loaded into an S3 bucket.
Language: Python - Size: 8.79 KB - Last synced at: 2 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

spuerta10/MLOpsCreditRisk
Implementation of MLOps techniques for putting a credit risk model into production. Contains pipelines for training, validating, deploying, and monitoring the model, as well as documentation on the infrastructure used and strategies for automating the model lifecycle.
Language: Jupyter Notebook - Size: 1.18 MB - Last synced at: about 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

AlvaroCavalcante/airflow-custom-deferrable-dataflow-operator
Start your Dataflow jobs execution directly from the Triggerer without going to the Worker!
Language: Python - Size: 36.1 KB - Last synced at: 8 days ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

potreic/ETL-Fashion-Trend-Analysis
✨ Automate fashion trend analysis with Apache Airflow! Extract data from X & Pinterest, transform into insights, and load into PostgreSQL. Predict seasonal styles & visualize trends. 💃📊
Language: Python - Size: 168 KB - Last synced at: about 1 month ago - Pushed at: 6 months ago - Stars: 2 - Forks: 0

Pirate-Emperor/BigData-Pipeline
BigData Pipeline is a local testing environment for experimenting with various storage solutions (RDB, HDFS), query engines (Trino), schedulers (Airflow), and ETL/ELT tools (DBT). It supports MySQL, Hadoop, Hive, Kudu, and more.
Language: Dockerfile - Size: 7.95 MB - Last synced at: about 1 month ago - Pushed at: 6 months ago - Stars: 3 - Forks: 0

Kaushik-Puttaswamy/Flight-Booking-Airflow-CI-CD
This project automates flight booking data processing using Apache Airflow, PySpark, and GCP with CI/CD integration for efficient deployment across development and production environments. It orchestrates data transformation, storage in BigQuery, and deployment via GitHub Actions.
Language: Python - Size: 439 KB - Last synced at: about 1 month ago - Pushed at: 3 months ago - Stars: 0 - Forks: 1

AhmetFurkanDEMIR/airflow-spark-kafka-example
Airflow, Spark and Kafka example
Language: Dockerfile - Size: 532 KB - Last synced at: 26 days ago - Pushed at: over 1 year ago - Stars: 6 - Forks: 0

Kaushik-Puttaswamy/Walmart-Sales-Data-Ingestion-and-Transformation-in-BigQuery-using-Airflow
An ETL pipeline that ingests Walmart sales data from Google Cloud Storage into BigQuery, automates table creation, and performs data transformation using SQL MERGE with Apache Airflow.
Language: Python - Size: 2.85 MB - Last synced at: about 1 month ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

mikeroyal/Apache-Airflow-Guide
Apache Airflow Guide
Language: Python - Size: 279 KB - Last synced at: 5 days ago - Pushed at: about 1 year ago - Stars: 28 - Forks: 12

kaushal07wick/FinSight
AI-powered analysis and vector search capabilities for JP Morgan Chase & Co. Earnings Call Transcripts.
Language: Python - Size: 6.36 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

kathisnehith/NYC311-requests-ETL-pipeline
An end-to-end ETL pipeline that processes NYC 311 service requests through an API for analysis.
Language: Jupyter Notebook - Size: 2.64 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

choval/devairflow
Simple local development airflow image
Size: 111 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0

An4PDM/SportStats-Pipeline
SportStats Pipeline is an ETL project that collects, processes, and stores daily sports event data using the TheSportsDB API. The pipeline is orchestrated with Apache Airflow and stores the data in a MySQL database.
Language: Python - Size: 1.95 KB - Last synced at: 2 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

eea/eea-crawler
EEA Crawler contains the tasks (DAGs) used by Apache Airflow to index content from various EEA-Eionet websites into a central Elasticsearch (aka content hub).
Language: Python - Size: 488 KB - Last synced at: about 2 months ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0

abrahamkoloboe27/Random-User-Streaming-Pipeline
Data Engineering Project - Data Pipeline
Language: Jupyter Notebook - Size: 128 KB - Last synced at: about 2 months ago - Pushed at: 5 months ago - Stars: 2 - Forks: 0

dfds-data/dagcellent
Airflow DAG collection and utilities
Language: Python - Size: 1.06 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 1 - Forks: 0

arunvelsriram/dag-schedule-graph
Airflow plugin for visualising DAG schedules within the 24-hour window of a day.
Language: Python - Size: 1.02 MB - Last synced at: 13 days ago - Pushed at: over 4 years ago - Stars: 3 - Forks: 1

An4PDM/Aprendendo-a-criar-DAGs
This repository aims to organize all the DAGs I am creating, for later reference.
Language: Python - Size: 10.7 KB - Last synced at: 2 months ago - Pushed at: 4 months ago - Stars: 1 - Forks: 0

Sahejpalarneja/news-data-pipeline
An ETL pipeline that scrapes web articles from various outlets. Orchestrated using Airflow on an Azure-based VM. Grafana is used for monitoring.
Language: Python - Size: 1.42 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

EssienCodeCraft/ETL_Pipeline_Airflow
ETL Pipeline Using Python3 & Airflow
Language: Python - Size: 1.95 KB - Last synced at: about 1 month ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

ryantanzr/Orchestrated-ETL
An orchestrated ETL pipeline with airflow and pandas
Language: Python - Size: 13.7 KB - Last synced at: about 1 month ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

mrigankomi/End-to-End-GCP-DE-project-spotify-analytics
End-to-End GCP DE project: spotify analytics
Language: Python - Size: 26.4 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

sajonaro/airflow-docker
a quick airflow + dbt + astronomer-cosmos in docker demo
Language: Python - Size: 115 KB - Last synced at: 7 days ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

HakeemSalaudeen/travel-agency-project
Extracting the data from the API | Writing the extracted data to the Data Lake | Extracting the final required attributes to Data Warehouse.
Language: Python - Size: 908 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

RareCompute/airflow-pipelines
Bioinformatics pipelines for Apache Airflow
Language: Python - Size: 3.91 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

angelagonzalezp/airflow-example-dags
Example DAGs
Language: Python - Size: 4.88 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 1 - Forks: 0

fajri-yanti/Data-Pipeline-SuperStore
An automated data pipeline that extracts retail data from MySQL and loads it into PostgreSQL.
Language: Python - Size: 3.91 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

AM-Ankitgit/DimondPricePrediction_With-Docker-MLFLOW-and-Dagshub
Language: Jupyter Notebook - Size: 83 MB - Last synced at: about 1 month ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

kishlayjeet/Twitter-Data-Pipeline-using-Airflow-and-AWS-S3
An end-to-end Twitter Data Pipeline that extracts data from Twitter and loads it into AWS S3.
Language: Python - Size: 20.5 KB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 13 - Forks: 6

RayanGAtech/HR-Roster-Change-Detection-Pipeline-with-Apache-Airflow-and-PostgreSQL
Automated data pipeline tailored for HR systems to process and analyze roster data. The pipeline leverages Apache Airflow to ingest and transform CSV files containing roster information, and PostgreSQL to store, deduplicate, and extract unique data points.
Language: Python - Size: 6.84 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

HakeemSalaudeen/CoreSentiments-project
This project is designed to reinforce my understanding of data pipeline orchestration with Apache Airflow. The focus is on implementing a data pipeline that addresses data ingestion, processing, storage, and analysis. The project challenges me to apply my knowledge of Apache Airflow to solve a practical, scenario-based problem.
Language: Python - Size: 7.81 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

IsmaheneLarbi/data_engineering_beginner
Creating a data pipeline to extract data from Spotify and save the songs listened to every day into a local SQLite database.
Language: Python - Size: 8.79 KB - Last synced at: 2 months ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

prneidhardt/Apache-Data-Pipeline
Sparkify project
Language: Jupyter Notebook - Size: 264 KB - Last synced at: 3 months ago - Pushed at: 6 months ago - Stars: 1 - Forks: 0

UDC-GAC/picsa
This repository gathers a set of projects to increase productivity through automation of serverless services in the cloud.
Language: HCL - Size: 1.02 MB - Last synced at: 6 months ago - Pushed at: 7 months ago - Stars: 3 - Forks: 0

Spock-Analytics/spock-airflow
A scalable Airflow-powered ETL pipeline designed for efficient extraction, transformation, and loading of data from Ethereum, Optimism, Arbitrum, Fantom, and Polygon blockchains.
Language: Python - Size: 215 KB - Last synced at: 4 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 1

raphaelmansuy/mwaa_cli
A simple Airflow MWAA CLI utility. It can be used to pause all the DAGs for an MWAA environment.
Language: Shell - Size: 123 KB - Last synced at: about 1 month ago - Pushed at: over 2 years ago - Stars: 6 - Forks: 3

yosrak5/Data-Streaming
This project involves the development of a robust data engineering pipeline that orchestrates the seamless ingestion, processing, and storage of data.
Language: Python - Size: 7.81 KB - Last synced at: 28 days ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

yosrak5/ETL-Twitter-Pipeline
Built an ETL data pipeline to extract data from Twitter, preprocess it, and load it into an AWS S3 bucket.
Size: 0 Bytes - Last synced at: about 1 month ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

chandulal/airflow-testing
Airflow Unit Tests and Integration Tests
Language: Python - Size: 465 KB - Last synced at: 6 months ago - Pushed at: over 2 years ago - Stars: 256 - Forks: 45

cristian-rincon/action-composer-sync
This is a simple action that helps you to fetch your Apache Airflow DAGs to Google Cloud Composer
Language: Shell - Size: 3.91 KB - Last synced at: 15 days ago - Pushed at: about 3 years ago - Stars: 5 - Forks: 0

AnthonyByansi/Airflow-Data-Pipeline-Automation
Automate your data pipelines using Apache Airflow with this ready-to-use DAG for data integration, ETL and workflow automation.
Size: 60 MB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 8 - Forks: 0

data-burst/airflow-git-sync
Sync DAG changes from Git to Airflow
Size: 122 KB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 43 - Forks: 7

kandarpagalas/internet-speed-monitoring-airflow
Airflow DAG to test internet speed and send alerts
Language: Python - Size: 7.81 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

AlenaLes/ETL
ETL task. Daily data export and creation/delivery of a new table to tabix.
Language: Python - Size: 12.7 KB - Last synced at: about 2 months ago - Pushed at: about 1 year ago - Stars: 2 - Forks: 0

k0rsakov/scd_dag_factory
DAG factory driven by an SCD table of configurations
Language: Python - Size: 25.4 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 1

k0rsakov/dag_factory
DAG factory
Language: Python - Size: 17.6 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 2 - Forks: 2

Sanjay-dev-ds/churn-data-analytics-data-pipeline
This project implements a scalable ETL pipeline using Amazon RDS for data storage, Amazon S3 for intermediate staging, and AWS Glue Crawler for metadata management. Data is efficiently queried through Amazon Redshift. Apache Airflow orchestrates the workflow, automating data extraction, loading, and transformation.
Language: Jupyter Notebook - Size: 1.38 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

VinkeArtunduaga/dataJobs_project
This ETL project makes it possible to analyze job postings in the data field in order to identify patterns and trends that reveal the skills most in demand in today's job market.
Language: HTML - Size: 1.35 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

igorlangoni/online_retail_data_pipeline
An end-to-end pipeline that ingests raw data from CSV files through Airflow DAGs into BigQuery. From there, it uses dbt to normalize and clean the data, and then to apply the transformations and produce relevant reports.
Language: Python - Size: 15.4 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

ohadmata/simple-dag-editor
Zero-configuration Airflow plugin that lets you manage your DAG files.
Language: Python - Size: 394 KB - Last synced at: 4 months ago - Pushed at: over 3 years ago - Stars: 38 - Forks: 3

ManoharVit/ECommerce-Dive-Deep-Sales-Analysis
In this project, we developed an ETL pipeline using Apache Airflow to process delivery data and track delayed shipments. The pipeline downloads data from an AWS S3 bucket, cleans it using Spark/Spark SQL to identify missing delivery deadlines, and uploads the cleaned dataset back to S3. This ensures efficient delivery performance tracking.
Language: Jupyter Notebook - Size: 134 MB - Last synced at: about 1 month ago - Pushed at: 10 months ago - Stars: 0 - Forks: 2

john-thuo1/popular_movies_etl
Airflow ETL with AWS, Docker and Postgres consuming TMDb API
Language: Python - Size: 11.7 KB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 1 - Forks: 0

nbdevs/RDWA
Real-time Data Warehousing with Airflow: An events based microservices pipeline.
Size: 14.6 KB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

KSwaviman/ETL_with_Airbyte
This project showcases an ELT pipeline that extracts JSON data, loads it into a PostgreSQL database, applies transformations using Python scripts, saves the transformed data in a CSV file, and shares it through a FastAPI endpoint.
Language: Python - Size: 10.7 KB - Last synced at: 10 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

DistilledCode/mmrl
Multi-Modal Representational Learning for Social Media Popularity Prediction
Language: Python - Size: 27.3 KB - Last synced at: 10 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

tulibraries/manifold_airflow_dags 📦
Airflow DAGs for the Manifold (TUL Website) application
Language: Python - Size: 1.21 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 2 - Forks: 0

jacksonpf1/spotify-user-analysis
ETL process and EDA of user top artists & tracks data in Spotify using Spotipy, Pandas, Airflow and Seaborn
Language: Jupyter Notebook - Size: 466 KB - Last synced at: 2 months ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0

AyorindeTayo/MLOPS-Airflow-MLflow-Docker1
Automation of MLflow experiment logging and prediction for Iris flower classes
Language: Jupyter Notebook - Size: 124 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

dzrekenathan/dbt-data-pipeline
A dbt data pipeline capstone project.
Language: Python - Size: 116 KB - Last synced at: about 1 month ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

Turnipdo/Real-Time-BTC-USD-Airflow-DAG-Extract-In-Excel
Using yfinance, we grab minute-by-minute BTC-USD data, dump it into PostgreSQL, and link Excel via ODBC for quick analysis!
Language: Python - Size: 56.6 KB - Last synced at: 30 days ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

bhavanachitragar/zillow-data-analytics
A Python script extracts data from Zillow and stores it in an initial S3 bucket. Then, Lambda functions handle the flow: copying the data to a processing bucket and transforming it from JSON to CSV format. The final CSV data resides in another S3 bucket, ready to be loaded into Amazon Redshift for in-depth analysis. QuickSight is used for visualizations.
Language: Python - Size: 66.4 KB - Last synced at: 2 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

kalyani33/twitter-etl-with-airflow
Orchestration of an ETL process with Apache Airflow
Language: Python - Size: 509 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

vitorjpc10/ETL-Pipeline--dbt--Snowflake--Airflow-
This project demonstrates how to build an ELT pipeline using dbt, Snowflake, and Airflow. Follow the steps below to set up your environment, configure dbt, create models, macros, tests, and deploy on Airflow.
Language: Python - Size: 86.9 KB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 0 - Forks: 0

Chinaskidev/ETL-Clima-ElSalvador
MLOps: building a simple ETL using Docker, Airflow, and Google Cloud
Language: Python - Size: 50.8 KB - Last synced at: 29 days ago - Pushed at: about 1 year ago - Stars: 3 - Forks: 1

adrianmarino/thesis-paper
Collaborative and hybrid recommendation systems
Language: Jupyter Notebook - Size: 353 MB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 0 - Forks: 1

xennen/DataEngineerYP
Data Engineer projects
Language: Python - Size: 513 KB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 0 - Forks: 0

rohithlanka/weatherdatapipeline
A project that uses Python scripts to fetch weather data from an API, transforms it using dbt in Snowflake, and orchestrates the workflow with Apache Airflow for seamless data integration into a reporting tool, ensuring streamlined data-driven insights. (Reporting tool: work in progress.)
Language: Python - Size: 11.7 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0
