An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: airflow-dags

aBandicootCalledSmashes/airflow-logs-cleanup

Clean up old Airflow log files with a script or Airflow DAG. Frees disk space by deleting rotated logs, removing old files, and cleaning up empty directories.

Language: Python - Size: 4.88 KB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 0 - Forks: 0

krik8235/ml-data-pipeline

Data pipeline on Delta Lake Lakehouse architecture using Spark and Airflow DAGs

Language: Python - Size: 72.3 KB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 0 - Forks: 0

Hippaho/Sparkify

A music streaming company, Sparkify, has decided that it is time to introduce more automation and monitoring to its data warehouse ETL pipelines and has concluded that the best tool to achieve this is Apache Airflow.

Language: Python - Size: 17.6 KB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 0 - Forks: 0

gestaogovbr/Ro-dou

DAG generator for Apache Airflow that clips the Diário Oficial da União.

Language: Python - Size: 4.03 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 155 - Forks: 53

tulibraries/funcake_dags

Airflow DAGs for PA Digital aggregation processes

Language: Python - Size: 3.09 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 4 - Forks: 1

neorock07/Shopee-ETL-Airflow

Personal ETL project with Airflow + Databricks

Language: Jupyter Notebook - Size: 27.5 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 0 - Forks: 0

status-im/airflow-dags

Status BI python DAGs for Airflow

Language: Python - Size: 259 KB - Last synced at: 4 days ago - Pushed at: 5 days ago - Stars: 0 - Forks: 2

eea/eea-crawler

EEA Crawler contains the tasks (DAGs) used by Apache Airflow to index content from various EEA-Eionet websites into a central Elasticsearch (aka content hub).

Language: Python - Size: 475 KB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 1 - Forks: 0

timothynn/airflow-pipelines

Learning Airflow pipelines

Language: Python - Size: 15.6 KB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 0 - Forks: 0

astronomer/astro-provider-databricks

Orchestrate your Databricks notebooks in Airflow and execute them as Databricks Workflows

Language: Python - Size: 11.1 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 23 - Forks: 12

iobruno/data-engineering-examples

Data Engineering examples for Airflow, Prefect; dbt for BigQuery, Redshift, ClickHouse, Postgres, DuckDB; PySpark for Batch processing; Kafka for Stream processing

Language: Python - Size: 5.07 MB - Last synced at: 7 days ago - Pushed at: 3 months ago - Stars: 66 - Forks: 3

gclembo/us-electricity-data-dashboard

Uses the EIA API for US electricity data, forecasts using SARIMAX model, plots dashboard using Matplotlib and Seaborn, workflow managed with Airflow

Language: Python - Size: 362 KB - Last synced at: 11 days ago - Pushed at: 12 days ago - Stars: 0 - Forks: 0

camilocbarrera/dst

DAG Sketch Tool (dst) helps you create and design Apache Airflow DAGs visually and programmatically.

Language: TypeScript - Size: 291 KB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 6 - Forks: 0

tulibraries/tulflow

TU Libraries Python Library for functions used in indexing ETL, particularly for Airflow

Language: Python - Size: 2.67 MB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 3 - Forks: 1

airflow-laminar/airflow-common

Common Airflow Operators / Tasks

Language: Python - Size: 607 KB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 4 - Forks: 0

Idrissa08/Yandex-Practicum

Yandex-Practicum repo with projects, exercises, and solutions from Yandex Practicum in Python, data analysis, ML, and web development for portfolio. 🐙

Size: 3.91 KB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 0 - Forks: 0

datatweets/airflow-pyspark-k8s

Run Apache Airflow with KubernetesExecutor and PySpark on Kubernetes using Helm charts and Kind for local development

Language: Python - Size: 270 KB - Last synced at: 14 days ago - Pushed at: 15 days ago - Stars: 2 - Forks: 1

tulibraries/cob_datapipeline

Airflow Data Processing Pipeline for TUL Catalog on Blacklight Data

Language: Python - Size: 3.13 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 6 - Forks: 0

royungar/ETL_Toll_Data_Pipeline_Project

Final project for IBM’s Data Engineering certificate (Course 8). ETL pipeline built with Apache Airflow and Bash to extract, transform, and consolidate toll data from CSV, TSV, and fixed-width files.

Language: Python - Size: 932 KB - Last synced at: 17 days ago - Pushed at: 17 days ago - Stars: 0 - Forks: 0

ritesh-1227/elt-pipeline-gcp-airflow

An ELT (Extract, Load, Transform) data pipeline to process 1 million+ records using Google Cloud Platform (GCP) and Apache Airflow

Language: Python - Size: 279 KB - Last synced at: 20 days ago - Pushed at: 20 days ago - Stars: 0 - Forks: 0

grilled-swampert/dag_practice

Language: Python - Size: 12.7 KB - Last synced at: 20 days ago - Pushed at: 20 days ago - Stars: 0 - Forks: 0

choval/devairflow

Simple local development airflow image

Size: 182 KB - Last synced at: 20 days ago - Pushed at: 20 days ago - Stars: 1 - Forks: 0

Kiran8053/Airflow-Orchestrated-ETL-with-Snowflake-Tableau-PowerBI

An end-to-end data analytics project that consumes data from AWS S3, runs an ETL process, and creates a data mart on Snowflake, followed by a Power BI report built on that data mart.

Language: Python - Size: 1.52 MB - Last synced at: 21 days ago - Pushed at: 21 days ago - Stars: 0 - Forks: 0

data-burst/airflow-git-sync

Sync DAG changes from Git to Airflow

Size: 122 KB - Last synced at: 23 days ago - Pushed at: 23 days ago - Stars: 64 - Forks: 12

AlexYe-MapleLeafs/CEBA_Process

Canada Emergency Business Account (CEBA) Process Automation in GCP

Language: Python - Size: 26.4 KB - Last synced at: 23 days ago - Pushed at: 23 days ago - Stars: 1 - Forks: 0

AlexYe-MapleLeafs/Automate-Dataproc-Process-in-GCP

This repo demonstrates a general process for automating jobs in GCP Dataproc to leverage its processing power.

Language: Python - Size: 118 MB - Last synced at: 26 days ago - Pushed at: 26 days ago - Stars: 0 - Forks: 0

Shirlyngit/ELT-Pipeline-with-gcp-airflow-looker-studio

Scalable ELT pipeline on GCP using Airflow and BigQuery to ingest, validate, and transform 1M+ anonymized medical records, visualized in Looker Studio.

Size: 42.7 MB - Last synced at: 27 days ago - Pushed at: 27 days ago - Stars: 0 - Forks: 0

MarkPhamm/skytrax-reviews

A comprehensive ETL pipeline for analyzing passenger satisfaction data. Features a modern data architecture with Apache Airflow for extraction, dbt/Snowflake for transformation, Python/Pandas for cleaning, and interactive dashboards for visualization with NextJS.

Size: 154 MB - Last synced at: 24 days ago - Pushed at: about 1 month ago - Stars: 11 - Forks: 4

saket1893/Airflow-ETL

Apache Airflow ETL

Language: Python - Size: 5.86 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

abs-hasan/automated-weather-data-pipeline

🌦️ A Dockerized data pipeline using Apache Airflow to fetch daily weather data from the OpenWeather API and store it in AWS S3.

Language: Python - Size: 25.4 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

CamilaJaviera91/gcp-new

This project defines a modern data pipeline architecture using Airflow, DBT, and PostgreSQL. Below you'll find instructions on how to get started and how the repository is structured.

Language: Python - Size: 165 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 1 - Forks: 0

lukez42/pinterest-data-pipeline542

This project builds a scalable data pipeline for processing Pinterest-like data, integrating AWS services like Amazon Kinesis for streaming, Apache Kafka for batch processing, Apache Airflow for orchestration, and Databricks for data transformation.

Language: Jupyter Notebook - Size: 3.88 MB - Last synced at: 10 days ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

sergio11/lyric_wave_architecture

🎵 LyricWave – AI Music Composer (Proof of Concept) 🎶 A personal project exploring automatic generation of unique MP4 songs. LyricWave blends lyrics with AI-generated melodies and synthetic vocals to experiment with new forms of musical expression. A creative testbed to push your ideas into sound. 🚀🎧

Language: Python - Size: 29 MB - Last synced at: 26 days ago - Pushed at: 3 months ago - Stars: 35 - Forks: 6

hadiuzzaman524/python-clean-architecture

A scalable COVID-19 ETL pipeline built with Python, Airflow, and BigQuery, following Clean Architecture and Domain-Driven Design principles. Designed for modularity, testability, and production-ready data workflows in a Dockerized environment.

Language: Python - Size: 1.91 MB - Last synced at: 30 days ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

karthikgarimella37/EPL_Fotmob

EPL Fotmob Data Visualizations

Language: Python - Size: 1.52 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 4 - Forks: 0

vcovelli/sc-dash

Supply chain management platform designed for real-time analytics and automation.

Language: Python - Size: 1.29 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 1 - Forks: 0

godefroylmb/Billboard

Fetches Billboard chart data and uploads it to Kaggle.

Language: Python - Size: 8.79 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

kuldeep27396/airflow-projects-deployed

This project contains a comprehensive collection of Apache Airflow DAGs designed for learning Airflow concepts from basics to advanced levels. The project includes 25 different DAGs covering various operators, patterns, and production scenarios, all deployed and tested using Astronomer Cloud.

Language: Python - Size: 4.17 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 1 - Forks: 0

wcharczuk/temporalflow

A workflow to process graphs of activities in parallel.

Language: Go - Size: 77.1 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 2 - Forks: 1

shiningflash/apache-airflow-workflow-manager

Simple Apache Airflow setup with both standalone and Docker-Compose workflows for quick orchestration experiments.

Language: Python - Size: 10.7 KB - Last synced at: about 1 month ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

Kenmaaa05/ev-adoption-analysis

EV adoption trends and policy insights in Washington using Python, PostgreSQL, and Power BI.

Language: Jupyter Notebook - Size: 35.6 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

karo23361/etl_pandas_airflow

ETL Using Python and Airflow

Language: Jupyter Notebook - Size: 561 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

chandulal/airflow-testing

Airflow Unit Tests and Integration Tests

Language: Python - Size: 465 KB - Last synced at: about 2 months ago - Pushed at: almost 3 years ago - Stars: 260 - Forks: 44

Data-Bishop/Team5-BuildItAll-Data-Platform

This repository contains the codebase for the BuildItAll Big Data Processing Platform, a case study project designed to manage large daily data for a hypothetical Belgian client.

Language: HCL - Size: 180 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 5 - Forks: 0

AbhijeetDasBakshi/ecommerce-insights

A Dockerized end-to-end project that combines unsupervised machine learning for customer segmentation with scalable data pipelines. It uses MongoDB for data ingestion, Scikit-learn for clustering, Airflow for orchestration, and Streamlit for interactive visualization — enabling actionable insights into e-commerce

Language: Jupyter Notebook - Size: 657 KB - Last synced at: 2 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

yuhexiong/airflow-dag-kafka-flink-doris-python

Language: Python - Size: 43 KB - Last synced at: about 2 months ago - Pushed at: 5 months ago - Stars: 2 - Forks: 0

IwasakiYuuki/data-analysis-platform-etl

A collection of Airflow DAGs for automating data collection into our on-premises data analysis platform.

Language: Python - Size: 206 KB - Last synced at: 7 days ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

sarabesh/reddit-recsys

This project is a real-time multimodal recommendation system built on top of Reddit data. It processes image-caption pairs using CLIP to create joint embeddings, stores them in Qdrant, and supports semantic retrieval based on text or image input.

Language: Python - Size: 13.4 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

DunnBC22/Data_Engineering_Projects

This repository includes data engineering projects using Apache Airflow. I hope to add more projects using different technologies soon!

Language: TSQL - Size: 62.7 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 4 - Forks: 0

msaakaash/csv-datawarehouse-etl-airflow

An end-to-end ETL pipeline to extract sales data from CSV, transform and load into PostgreSQL warehouse. Automated and scheduled using Apache Airflow.

Language: Python - Size: 9.77 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

An4PDM/SportStats-Pipeline

SportStats Pipeline is an ETL project that collects, processes, and stores data on daily sporting events using the TheSportsDB API. The pipeline is orchestrated with Apache Airflow and stores the data in a MySQL database.

Language: Python - Size: 1.95 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

balaji1233/Data_Engineering_Projects

Data Engineering Learning

Language: Python - Size: 6.84 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

marlonmoreira1/Skill-em-Dados

The goal of this project is to support beginners who want to enter the data field by providing a data-driven view of the skills and knowledge most in demand in the job market. By collecting and analyzing job and internship postings, the project aims to answer the question: "How do you become a data professional?"

Language: Python - Size: 87.6 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 1 - Forks: 0

AishwaryaGade02/End_to_End_ETL-Stock-and-Financial-News-Analysis

A project that brings together stock data and news articles to understand how news impacts market trends, helping make better investment decisions through clear and timely insights

Language: Python - Size: 12.7 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

95xin/Premier-League-Data-Engineering-Project

Data Engineering Project - Premier League Datasets

Language: Python - Size: 0 Bytes - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 1 - Forks: 0

micheldpd24/cust_review_analysis

End-to-End Customer Review Analysis Pipeline : Scrape, process, and analyze customer reviews from Trustpilot using Apache Airflow and Python | Interactive Dash Dashboard : Visualize sentiment, trends, and textual insights with a dynamic, Dockerized Dash app.

Language: Jupyter Notebook - Size: 2.78 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0

micheldpd24/mlops_air_msr

MLOps Pipeline for Music Recommendation - Spotify playlist continuation

Language: Python - Size: 25.5 MB - Last synced at: about 2 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 1

tashi-2004/Apache-Airflow-Kafka-Spark-DeltaLake-Real-Time-Stream-Pipeline

This project implements a real-time data pipeline using Apache Airflow, Kafka, Apache Spark, and Delta Lake. It supports both batch (Coldpath) and real-time (Hotpath) data ingestion, processing, and storage. Airflow is used for orchestrating the data workflows.

Language: Python - Size: 12.5 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

matsudan/airflow-dag-examples

Apache Airflow DAG examples

Language: Python - Size: 212 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 2 - Forks: 1

mohiteamit/MLops-lead-scoring-system

Automated machine learning pipeline with Airflow and model experiments using MLflow

Language: Jupyter Notebook - Size: 25 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

AlokTheDataGuy/etl-pipeline-airflow

How to build an ETL pipeline using Airflow? This README describes the contents of the project, as well as how to run Apache Airflow on your local machine.

Language: Python - Size: 9.77 KB - Last synced at: about 1 month ago - Pushed at: 6 months ago - Stars: 1 - Forks: 0

krishna-aditi/nlp-sentiment-analysis-on-stock-news-and-price-monitoring

Web app that brings together text summarization and sentiment analysis of stock-related news to better understand stock price trends.

Language: Jupyter Notebook - Size: 3.07 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 15 - Forks: 6

krishna-aditi/Sevir-Lambda-APIs

SEVIR Lambda functions for Summarization and Named-Entity Recognition. Deployed on AWS ECR.

Language: Python - Size: 2.11 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

nathadriele/dock-financial-data-pipelines

Automated pipeline for generating and processing Dock balance reports using Apache Airflow, SFTP, AWS S3, and Lambda.

Language: Python - Size: 11.7 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 7 - Forks: 0

enricogoerlitz/explore-airflow

Apache Airflow project for orchestrating ETL workflows in a data warehouse environment. Implements a medallion architecture (Bronze, Silver, Gold layers) with Postgres integration for scalable and modular data processing pipelines.

Language: Python - Size: 1.33 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

sangwanamit621/learning-and-experiments

Welcome to the Learning and Experiments Hub—a dynamic repository capturing my journey of exploration and experimentation in the vast world of technology. This space serves as a digital canvas where I document my learning process, experiments, and discoveries.

Language: Jupyter Notebook - Size: 53.7 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

rahulsamant37/End-to-End-Flask-App

An end-to-end data science workflow that includes a Flask-based machine learning application, an automated ETL pipeline with Airflow for data extraction and transformation, and a comprehensive data analysis module using Pandas for statistical analysis, manipulation, and visualization.

Language: Jupyter Notebook - Size: 928 KB - Last synced at: 5 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

lema-ufpb/xcom-cleaner

This project implements a pipeline to clean the history of XCom variables stored by Airflow.

Language: Python - Size: 6.84 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

WilsonH918/Data_Pipeline_ETL_Pipeline_Web3_Token

Data Pipeline Project

Language: Python - Size: 17.6 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 1 - Forks: 0

Dina-Hosny/ETL-Data-Pipeline-using-AirFlow

An ETL data pipeline project that uses Airflow DAGs to extract employee data from PostgreSQL schemas, load it into an AWS data lake, transform it with a Python script, and finally load it into a Snowflake data warehouse using SCD type 2.

Language: Python - Size: 89.8 KB - Last synced at: 3 months ago - Pushed at: over 2 years ago - Stars: 6 - Forks: 2

RayVilaca/mini-projeto-ETL-video-game-sales

This project is an ETL pipeline that uses Apache Airflow to orchestrate the extraction, transformation, and loading of video game sales data. The data is extracted from Kaggle, transformed to clean and filter it, validated using Soda SQL, and then loaded into an S3 bucket.

Language: Python - Size: 8.79 KB - Last synced at: 6 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

spuerta10/MLOpsCreditRisk

Implementation of MLOps techniques for putting a credit risk model into production. Contains pipelines for training, validating, deploying, and monitoring the model, as well as documentation on the infrastructure used and strategies for automating the model lifecycle.

Language: Jupyter Notebook - Size: 1.18 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

AlvaroCavalcante/airflow-custom-deferrable-dataflow-operator

Start your Dataflow jobs execution directly from the Triggerer without going to the Worker!

Language: Python - Size: 36.1 KB - Last synced at: 2 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

potreic/ETL-Fashion-Trend-Analysis

✨ Automate fashion trend analysis with Apache Airflow! Extract data from X & Pinterest, transform into insights, and load into PostgreSQL. Predict seasonal styles & visualize trends. 💃📊

Language: Python - Size: 168 KB - Last synced at: 5 months ago - Pushed at: 10 months ago - Stars: 2 - Forks: 0

Pirate-Emperor/BigData-Pipeline

BigData Pipeline is a local testing environment for experimenting with various storage solutions (RDB, HDFS), query engines (Trino), schedulers (Airflow), and ETL/ELT tools (DBT). It supports MySQL, Hadoop, Hive, Kudu, and more.

Language: Dockerfile - Size: 7.95 MB - Last synced at: about 1 month ago - Pushed at: 10 months ago - Stars: 3 - Forks: 0

Kaushik-Puttaswamy/Flight-Booking-Airflow-CI-CD

This project automates flight booking data processing using Apache Airflow, PySpark, and GCP with CI/CD integration for efficient deployment across development and production environments. It orchestrates data transformation, storage in BigQuery, and deployment via GitHub Actions.

Language: Python - Size: 439 KB - Last synced at: 21 days ago - Pushed at: 7 months ago - Stars: 0 - Forks: 2

AhmetFurkanDEMIR/airflow-spark-kafka-example

Airflow, Spark and Kafka example

Language: Dockerfile - Size: 532 KB - Last synced at: 5 months ago - Pushed at: almost 2 years ago - Stars: 6 - Forks: 0

Kaushik-Puttaswamy/Walmart-Sales-Data-Ingestion-and-Transformation-in-BigQuery-using-Airflow

An ETL pipeline that ingests Walmart sales data from Google Cloud Storage into BigQuery, automates table creation, and performs data transformation using SQL MERGE with Apache Airflow.

Language: Python - Size: 2.85 MB - Last synced at: 5 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

mikeroyal/Apache-Airflow-Guide

Apache Airflow Guide

Language: Python - Size: 279 KB - Last synced at: 9 days ago - Pushed at: over 1 year ago - Stars: 28 - Forks: 14

kaushal07wick/FinSight

AI-powered analysis and vector search capabilities for JP Morgan Chase & Co. Earnings Call Transcripts.

Language: Python - Size: 6.36 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

kathisnehith/NYC311-requests-ETL-pipeline

An end-to-end ETL pipeline that processes NYC 311 service requests via API for analysis.

Language: Jupyter Notebook - Size: 2.64 MB - Last synced at: 3 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

akhilk2802/BigDataSystems

MLOps and data pipelines

Language: Python - Size: 10.1 MB - Last synced at: 3 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

ankitgmishra/AirflowDAG

Weather ETL using Airflow

Language: Python - Size: 7.81 KB - Last synced at: 2 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

abrahamkoloboe27/Random-User-Streaming-Pipeline

Data Engineering Project - Data Pipeline

Language: Jupyter Notebook - Size: 128 KB - Last synced at: 6 months ago - Pushed at: 9 months ago - Stars: 2 - Forks: 0

dfds-data/dagcellent

Airflow DAG collection and utilities

Language: Python - Size: 1.06 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 1 - Forks: 0

arunvelsriram/dag-schedule-graph

Airflow plugin for visualising DAG schedules within a 24-hour window.

Language: Python - Size: 1.02 MB - Last synced at: about 20 hours ago - Pushed at: over 4 years ago - Stars: 3 - Forks: 1

An4PDM/Aprendendo-a-criar-DAGs

This repository aims to organize all the DAGs I am creating, for future reference.

Language: Python - Size: 10.7 KB - Last synced at: 6 months ago - Pushed at: 8 months ago - Stars: 1 - Forks: 0

Sahejpalarneja/news-data-pipeline

An ETL pipeline that scrapes web articles from various outlets, orchestrated with Airflow on an Azure-based VM. Grafana is used for monitoring.

Language: Python - Size: 1.42 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

ryantanzr/Orchestrated-ETL

An orchestrated ETL pipeline with Airflow and pandas

Language: Python - Size: 13.7 KB - Last synced at: about 2 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

mrigankomi/End-to-End-GCP-DE-project-spotify-analytics

End-to-End GCP DE project: spotify analytics

Language: Python - Size: 26.4 KB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

sajonaro/airflow-docker

A quick Airflow + dbt + astronomer-cosmos demo in Docker

Language: Python - Size: 115 KB - Last synced at: 4 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

HakeemSalaudeen/travel-agency-project

Extracting the data from the API | Writing the extracted data to the Data Lake | Extracting the final required attributes to Data Warehouse.

Language: Python - Size: 908 KB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

RareCompute/airflow-pipelines

Bioinformatics pipelines for Apache Airflow

Language: Python - Size: 3.91 KB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

angelagonzalezp/airflow-example-dags

Example DAGs

Language: Python - Size: 4.88 KB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 1 - Forks: 0

fajri-yanti/Data-Pipeline-SuperStore

An automated data pipeline that extracts retail data from MySQL and loads it into PostgreSQL.

Language: Python - Size: 3.91 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

AM-Ankitgit/DimondPricePrediction_With-Docker-MLFLOW-and-Dagshub

Language: Jupyter Notebook - Size: 83 MB - Last synced at: 5 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

kishlayjeet/Twitter-Data-Pipeline-using-Airflow-and-AWS-S3

An end-to-end Twitter Data Pipeline that extracts data from Twitter and loads it into AWS S3.

Language: Python - Size: 20.5 KB - Last synced at: 5 months ago - Pushed at: about 2 years ago - Stars: 13 - Forks: 6

RayanGAtech/HR-Roster-Change-Detection-Pipeline-with-Apache-Airflow-and-PostgreSQL

Automated data pipeline tailored for HR systems to process and analyze roster data. The pipeline leverages Apache Airflow to ingest and transform CSV files containing roster information, and PostgreSQL to store, deduplicate, and extract unique data points.

Language: Python - Size: 6.84 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

HakeemSalaudeen/CoreSentiments-project

This project is designed to reinforce my understanding of data pipeline orchestration with Apache Airflow. The focus is on implementing a data pipeline that addresses data ingestion, processing, storage, and analysis. The project challenges me to apply my knowledge of Apache Airflow to solve a practical, scenario-based problem.

Language: Python - Size: 7.81 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

IsmaheneLarbi/data_engineering_beginner

A data pipeline that extracts data from Spotify and saves the songs listened to each day into a local SQLite database.

Language: Python - Size: 8.79 KB - Last synced at: 6 months ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0