An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: apache-airflow

cucumbershaped-flaw475/IRIS-FLOWER-CLASSIFICATION-PROJECT

🌸 Compare multiple classification models on the Iris dataset to evaluate accuracy, precision, recall, and F1-score with clear visualizations.

Language: Python - Size: 1.41 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 0 - Forks: 0

astronomer/dag-factory

Construct Apache Airflow DAGs Declaratively via YAML configuration files

Language: Python - Size: 11.3 MB - Last synced at: 3 days ago - Pushed at: 10 days ago - Stars: 1,395 - Forks: 219

kcenon/screener_system

Advanced stock screening platform for Korean markets (KOSPI/KOSDAQ) with 200+ financial indicators

Language: PLpgSQL - Size: 4.84 MB - Last synced at: 2 days ago - Pushed at: 3 days ago - Stars: 1 - Forks: 1

moha123wad/airflow-etl-ml-monitoring-pipeline

🚀 Streamline ETL workflows and monitor ML models with a scalable MLOps pipeline for efficient data handling and observability.

Language: Jupyter Notebook - Size: 1.32 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 0 - Forks: 0

nic081/Retail-Sales-Analytics-Pipeline

📊 Build a robust retail sales analytics pipeline to transform data into insights, enhancing decision-making and driving business growth.

Language: Jupyter Notebook - Size: 1.3 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 0 - Forks: 0

astronomer/astronomer

Helm Charts for the Astronomer Platform, Apache Airflow as a Service on Kubernetes

Language: Python - Size: 12.3 MB - Last synced at: 1 day ago - Pushed at: 3 days ago - Stars: 487 - Forks: 95

ragztigadi/Real-Time-YieldCurve-Data-Pipeline-Monitoring-System-AWS-KAFKA-SNOWFLAKE-

Real-time data pipeline for stock market and YieldCurve data analytics using Kafka, Airflow, AWS S3, Glue, Snowflake with automated Slack alerts

Language: Python - Size: 8.85 MB - Last synced at: 4 days ago - Pushed at: 5 days ago - Stars: 0 - Forks: 0

HowardZeng123/realtime-cdc-pipeline-docker

An End-to-End Real-time Data Pipeline using Debezium (CDC) to stream changes from PostgreSQL to Kafka, processed by Apache Spark (Structured Streaming), and sunk into ClickHouse for analytics. Orchestrated by Airflow and fully containerized with Docker Compose.

Language: Python - Size: 17.6 KB - Last synced at: 3 days ago - Pushed at: 5 days ago - Stars: 1 - Forks: 0

ninja-van/airflow-boilerplate

A complete development environment setup for working with Airflow

Language: Python - Size: 985 KB - Last synced at: 5 days ago - Pushed at: almost 3 years ago - Stars: 129 - Forks: 55

Pawlo77/maritime

A Comprehensive Approach to Real-Time Vessel Tracking, Historical Analysis, and Environmental Monitoring.

Language: Dockerfile - Size: 58.6 KB - Last synced at: 5 days ago - Pushed at: 7 days ago - Stars: 0 - Forks: 0

doitintl/doit-composer-airflow-training 📦

Getting started with Apache Airflow on Cloud Composer

Language: Python - Size: 4.24 MB - Last synced at: 4 days ago - Pushed at: over 3 years ago - Stars: 30 - Forks: 5

gestaogovbr/Ro-dou

DAG generator for Apache Airflow that clips the Diário Oficial da União.

Language: Python - Size: 4.19 MB - Last synced at: 7 days ago - Pushed at: 8 days ago - Stars: 167 - Forks: 58

apache/airflow

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows

Language: Python - Size: 480 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 43,351 - Forks: 16,010

astronomer/astronomer-cosmos

Run your dbt Core or dbt Fusion projects as Apache Airflow DAGs and Task Groups with a few lines of code

Language: Python - Size: 19.6 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 1,091 - Forks: 251

astronomer/astro-cli

CLI that makes it easy to create, test and deploy Airflow DAGs to Astronomer

Language: Go - Size: 17.2 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 426 - Forks: 102

euiyounghwang/Prometheus-monitoring-exporter

Prometheus-monitoring-exporter

Language: Python - Size: 258 MB - Last synced at: 7 days ago - Pushed at: 10 days ago - Stars: 1 - Forks: 0

airflow-laminar/airflow-balancer

Utilities for tracking hosts and ports and load balancing DAGs

Language: Python - Size: 1.47 MB - Last synced at: 7 days ago - Pushed at: 11 days ago - Stars: 4 - Forks: 0

airflow-laminar/airflow-common

Common Airflow Operators / Tasks

Language: Python - Size: 676 KB - Last synced at: 5 days ago - Pushed at: 11 days ago - Stars: 5 - Forks: 1

sibyabin/blogs

Technology blogging website from Siby Abin. Covers data engineering, AWS, Spark, Python, Airflow, and more.

Language: SCSS - Size: 6.37 MB - Last synced at: 11 days ago - Pushed at: 12 days ago - Stars: 0 - Forks: 0

jghoman/awesome-apache-airflow

Curated list of resources about Apache Airflow

Language: Shell - Size: 550 KB - Last synced at: 12 days ago - Pushed at: over 1 year ago - Stars: 3,856 - Forks: 499

airflow-laminar/airflow-config

A Configuration System for Airflow

Language: Python - Size: 2.31 MB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 13 - Forks: 3

airflow-laminar/airflow-priority

Priority Tags for Airflow Dags

Language: Python - Size: 2.32 MB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 7 - Forks: 1

kalluripradeep/ecommerce-data-pipeline

Production-ready data pipeline demonstrating scalable architecture patterns for batch and streaming data processing

Language: Python - Size: 19.5 KB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 0 - Forks: 0

airflow-laminar/airflow-supervisor

Airflow utilities for running long-running or always-on jobs with supervisord

Language: Python - Size: 2.02 MB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 13 - Forks: 3

airflow-laminar/airflow-ha

High Availability (HA) DAG Utility

Language: Python - Size: 2.29 MB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 9 - Forks: 1

cassimiro3/ottawa-data-pipeline-airflow

🚀 Build an end-to-end data pipeline using Ottawa Building Permits, leveraging Apache Airflow and Docker for reliable data processing and analytics.

Language: Python - Size: 4.55 MB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 0 - Forks: 0

awaisajaz1/airflow_savant

A comprehensive data engineering pipeline has been established to coordinate the ingestion, processing, and storage of data. This pipeline utilizes Apache Airflow, Python, Apache Kafka, Apache Zookeeper, Apache Spark, and FastDBs. All these components have been containerized with Docker to facilitate straightforward deployment and scalability

Language: Python - Size: 535 KB - Last synced at: 15 days ago - Pushed at: 15 days ago - Stars: 5 - Forks: 3

Subrat1920/Titanic-Survival-MLOps

This is a practice project for getting started with model monitoring metrics...

Language: Python - Size: 633 KB - Last synced at: 15 days ago - Pushed at: 15 days ago - Stars: 0 - Forks: 0

MohamedSamiHdj/realtime-data-pipeline

📊 Build a reliable real-time data pipeline for Windows using PySpark, ensuring quality data flow from raw ingestion to curated datasets.

Language: Python - Size: 1.3 MB - Last synced at: 15 days ago - Pushed at: 15 days ago - Stars: 0 - Forks: 0

nguyennam05/AI.TUTOR

AI Tutor is a chatbot-based web app that answers syllabus-specific queries using Google Gemini API. It integrates Google Drive for eBook storage, MongoDB for chat history, and Clerk for user authentication, ensuring accurate, secure, and curriculum-aligned responses to students.

Language: JavaScript - Size: 7.06 MB - Last synced at: 15 days ago - Pushed at: 15 days ago - Stars: 2 - Forks: 2

FabsHof/airflow_open_weather_api

An Apache Airflow project to fetch and process weather data from OpenWeather API.

Language: Python - Size: 231 KB - Last synced at: 15 days ago - Pushed at: 15 days ago - Stars: 0 - Forks: 0

elyra-ai/elyra

Elyra extends JupyterLab with an AI centric approach.

Language: Python - Size: 115 MB - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 1,971 - Forks: 361

apache/airflow-client-python

Apache Airflow - OpenApi Client for Python

Language: Python - Size: 1.89 MB - Last synced at: 17 days ago - Pushed at: 17 days ago - Stars: 436 - Forks: 63

astronomer/airflow-provider-fivetran-async

A new Airflow Provider for Fivetran, maintained by Astronomer and Fivetran

Language: Python - Size: 241 KB - Last synced at: 17 days ago - Pushed at: 17 days ago - Stars: 22 - Forks: 10

AnantaXe/ClickML

ClickML - build MLOps workflow (just click, save and use)

Language: TypeScript - Size: 14.3 MB - Last synced at: 18 days ago - Pushed at: 18 days ago - Stars: 1 - Forks: 2

dmp-labs/dmp-af

Distributed run of dbt models using Airflow

Language: Python - Size: 8.01 MB - Last synced at: 20 days ago - Pushed at: 20 days ago - Stars: 3 - Forks: 2

astronomer/astro-sdk 📦

Astro SDK allows rapid and clean development of {Extract, Load, Transform} workflows using Python and SQL, powered by Apache Airflow.

Language: Python - Size: 7.54 MB - Last synced at: 7 days ago - Pushed at: 7 months ago - Stars: 375 - Forks: 49

rjain52208/airflow-data-platform-starter (fork of apache/airflow)

End-to-end data orchestration platform built using Apache Airflow — automates ETL, scheduling, and monitoring with Docker + CI/CD.

Language: Python - Size: 417 MB - Last synced at: 21 days ago - Pushed at: 21 days ago - Stars: 0 - Forks: 0

santoshsaranyan/global-bnb-data-platform

[Work in Progress] This project demonstrates a cloud-native ELT data platform built with Airflow, dbt, Snowflake, and GCS to automate ingestion, transformation, and warehousing of global Airbnb data.

Language: Python - Size: 2.9 MB - Last synced at: 24 days ago - Pushed at: 24 days ago - Stars: 0 - Forks: 0

astronomer/airflow-chart

A Helm chart to install Apache Airflow on Kubernetes

Language: Python - Size: 4.91 MB - Last synced at: 5 days ago - Pushed at: 8 days ago - Stars: 290 - Forks: 95

subhamay-bhattacharyya/astronomer-airflow-template

📄🎯 GitHub Repository Template for Apache Airflow to be hosted and executed in Astronomer Cloud

Language: Python - Size: 69.3 KB - Last synced at: 26 days ago - Pushed at: 26 days ago - Stars: 0 - Forks: 0

airflow-laminar/airflow-pydantic

Pydantic models for Apache Airflow

Language: Python - Size: 3.38 MB - Last synced at: 26 days ago - Pushed at: 26 days ago - Stars: 8 - Forks: 1

airflow-laminar/pydantic-airflow

Pydantic models for Apache Airflow

Language: Python - Size: 54.7 KB - Last synced at: 26 days ago - Pushed at: 26 days ago - Stars: 2 - Forks: 0

guidok91/spark-movies-etl

Spark data pipeline that processes movie ratings data.

Language: Python - Size: 3.85 MB - Last synced at: 28 days ago - Pushed at: 28 days ago - Stars: 30 - Forks: 12

23210287thongtruong/customer360-risk

A hands-on data engineering project for building a Customer 360 view and risk scoring system using Apache Spark, Airflow, and Metabase.

Language: Python - Size: 186 KB - Last synced at: 28 days ago - Pushed at: 28 days ago - Stars: 0 - Forks: 0

andreax79/airflow-code-editor

A plugin for Apache Airflow that allows you to edit DAGs in browser

Language: Vue - Size: 15.5 MB - Last synced at: 29 days ago - Pushed at: 29 days ago - Stars: 453 - Forks: 55

witchyburn/vk_friends_analyzer

ETL pipeline for monitoring VKontakte friends, with a notification system and analytics. The project automatically tracks changes in the friends list and provides a convenient dashboard for analyzing social connections.

Language: Python - Size: 694 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

shivp436/realtime-log-processing

Apache Airflow DAGs built with an Apache Kafka producer and consumer to log website events into Elasticsearch

Language: Python - Size: 133 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

Dhanush-Raj1/Ecommerce-Chatbot-Project

A GenAI-powered chatbot for an ecommerce clothing store that answers user queries, provides recommendations, tracks orders and more.

Language: Python - Size: 26.9 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 3 - Forks: 0

imchandanmohan/airflow-etl-ml-monitoring-pipeline

End-to-end MLOps pipeline with Airflow ETL orchestration, Redis feature store, and real-time ML monitoring using Prometheus & Grafana with automated data drift detection

Language: Jupyter Notebook - Size: 23.4 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

ragztigadi/RealTime-Reddit-BigData-ETL-Pipeline-on-AWS-Glue-Redshift

A cloud-native data engineering pipeline that ingests live Reddit data, orchestrates ETL with Apache Airflow, transforms with AWS Glue, stores in Amazon S3, and queries with Redshift & Athena. Includes schema automation with Glue Crawler and dashboard-ready datasets for BI tools.

Language: Python - Size: 18.9 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

aymane-maghouti/Big-Data-Project

This project aims to predict smartphone prices using a combination of batch and stream processing techniques in a Big Data environment. The architecture follows the Lambda Architecture pattern, providing both real-time and batch processing capabilities to users.

Language: Python - Size: 960 KB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 18 - Forks: 2

JoseVF5/Data-Mart---TechStyle-Commerce

This repository contains the development of a complete, automated data pipeline that simulates a corporate environment for the fictional company "TechStyle Commerce". The project was created as a practical case study to demonstrate data engineering skills, from ingesting raw sources to delivering dashboards.

Language: Python - Size: 20.7 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 3 - Forks: 1

Dpbm/qcop

An AI model to predict the output of a quantum circuit

Language: Python - Size: 830 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

Daniel-jcVv/Daniel-jcVv

👨‍💻 Data Engineer | 3+ years enterprise experience with Telcel & Citi Banamex. Develop ETL pipelines, data governance, and cloud solutions. Building scalable data architectures and automated workflows for Fortune 500 clients. Tech Stack: Python, SQL Server, Oracle, Apache Airflow, PySpark

Size: 35.2 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

fuadonates/gcp-data-lake-platform

GCP data lake platform integrating 4 source systems with BigQuery, Airflow, and Dataflow - Bronze-Silver architecture

Language: Python - Size: 0 Bytes - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

fuadonates/data-engineering

A collection of data engineering projects, proofs-of-concept (POCs), and proofs-of-knowledge (POKs) using technologies like Python, Spark, SQL, and cloud platforms.

Size: 5.86 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

GregoryKogan/crypto-trading-data-pipeline

Real-time crypto trading data pipeline using Apache Spark, Kafka, and Airflow. Containerized microservices architecture for streaming analytics.

Language: Python - Size: 21.5 KB - Last synced at: about 1 month ago - Pushed at: about 2 months ago - Stars: 2 - Forks: 0

HugoBSantos/airflow-intro

Learning to orchestrate data pipelines using Apache Airflow.

Language: Python - Size: 22.5 KB - Last synced at: about 1 month ago - Pushed at: about 2 months ago - Stars: 2 - Forks: 0

voidrot/dag-sync

Sync Airflow DAGs from S3 to local filesystem

Language: Go - Size: 7.81 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

WHEELYDOS/nasa_etl

ETL pipeline built with Apache Airflow, deployed on Astro, and integrated with AWS RDS (PostgreSQL) for scalable data orchestration and storage

Language: Python - Size: 901 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

eliakimpires/flight-analytics-pipeline

End-to-end ELT data pipeline for US flight performance analysis. Orchestrated with Apache Airflow, transformed with dbt, and containerized with Docker.

Language: Python - Size: 37.1 KB - Last synced at: about 1 month ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

HasibulHasanKhan/Retail-Sales-Analytics-Pipeline

This Retail Sales Analytics Pipeline is a fully modular, end-to-end data analytics project designed for retail businesses to analyze sales performance, customer behavior, and marketing ROI, while generating actionable insights through dashboards and reports.

Language: Jupyter Notebook - Size: 4.88 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

astronomer/ebook-etl-elt

Companion repository to the ETL & ELT Pipelines with Apache Airflow® eBook

Language: Python - Size: 607 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 31 - Forks: 15

Adelllllllll/ottawa-data-pipeline-airflow

A complete data engineering project simulating a modern data lake architecture (Raw → Staging → Curated → Index) using Apache Airflow, LocalStack S3, MySQL, MongoDB, and Elasticsearch.

Language: Python - Size: 3.25 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

Hsooni491/cafe_sales_etl

☕ Production-grade ETL pipeline for café sales analytics using Apache Airflow, Python, and PostgreSQL. Automates data extraction, transformation, quality validation, and BI reporting with visual analytics.

Language: Python - Size: 174 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

abhishekbhakat/airflow-mcp-server

MCP Server for Apache Airflow

Language: Python - Size: 25.6 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 24 - Forks: 4

johnryanmcnally/tile_tracker_pipeline

A data pipeline for a RAG LLM powered by enriched Tile Tracker location data, built with Langchain, Apache Airflow and PostgreSQL.

Language: Jupyter Notebook - Size: 98.7 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 1 - Forks: 0

aws-ia/terraform-aws-mwaa

Terraform module for Amazon MWAA (Apache Airflow)

Language: HCL - Size: 3.23 MB - Last synced at: about 2 months ago - Pushed at: 4 months ago - Stars: 54 - Forks: 64

DHANA5982/Big_Data_Engineering_Azure_GCP_AWS

Comprehensive Big Data Engineering learning repository featuring hands-on projects with Hadoop, Spark, Kafka, Docker, Airflow, and Azure Cloud. Includes end-to-end data pipelines, real-time streaming, and distributed processing implementations.

Language: Jupyter Notebook - Size: 44.4 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 1 - Forks: 0

Toloka/dbt-af

Distributed run of dbt models using Airflow

Language: Python - Size: 3.51 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 166 - Forks: 13

call518/MCP-Airflow-API

🔍Model Context Protocol (MCP) server for Apache Airflow API integration. Provides comprehensive tools for managing Airflow clusters including service operations, configuration management, status monitoring, and request tracking.

Language: Python - Size: 1.18 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 41 - Forks: 9

WordPress/openverse-catalog 📦

Identifies and collects data on CC-licensed content across web crawl data and public APIs.

Language: Python - Size: 92.6 MB - Last synced at: 13 days ago - Pushed at: over 2 years ago - Stars: 61 - Forks: 52

arturLMoretti/BEES-Data-Engineering---Breweries-Case

Language: Python - Size: 98.6 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

MahajanPreksha/Birthday-Reminder-Pipeline

A data pipeline built using Python and Apache Airflow that checks MySQL database every morning and sends personalized birthday reminders to a Discord channel, complete with the person's age and celebratory formatting.

Language: Python - Size: 17.6 KB - Last synced at: about 2 months ago - Pushed at: 2 months ago - Stars: 1 - Forks: 0

blackbass64/data-orchestration-with-apache-airflow

Class materials and setup guide for Data Orchestration with Apache Airflow

Size: 5.86 KB - Last synced at: about 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 40

RaphCodec/airflow-azure-starter

A starting point for production Airflow Deployment on an Azure VM running the LocalExecutor. Small team setup.

Language: Shell - Size: 1.09 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

kanhaiya-gupta/DataEngineering-metrify-smart-metering

Real-time smart meter data pipeline: Kafka + Snowflake + Airflow + dbt for scalable energy data processing with clean architecture and enterprise monitoring

Language: Python - Size: 800 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

Zerohertz/airflow-dags

🍃 [Apache Airflow] DAGs 🍃

Language: Python - Size: 144 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 1 - Forks: 0

George-Njuguna/Spotify-ETL-Pipeline

This is an ETL pipeline that uses the Spotify API, Docker, and Airflow

Language: Jupyter Notebook - Size: 2.05 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

rejsafranko/Continuous-Learning-Infrastructure

AWS infrastructure for deep learning model re-training.

Language: Python - Size: 307 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

apache/airflow-client-go

Apache Airflow - OpenApi Client for Go

Language: Go - Size: 543 KB - Last synced at: about 1 month ago - Pushed at: 5 months ago - Stars: 214 - Forks: 26

AlvaroCavalcante/airflow-parse-bench

Stop creating bad DAGs! Use this tool to measure and compare the parse time of your DAGs, identify bottlenecks, and optimize your Airflow environment for better performance.

Language: Python - Size: 192 KB - Last synced at: 2 months ago - Pushed at: 10 months ago - Stars: 20 - Forks: 0

oxylabs/building-scraping-pipeline-apache-airflow

Using Apache Airflow to Build a Pipeline for Scraped Data

Language: Python - Size: 128 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 1 - Forks: 1

PRagan/netflix-subscription-data-pipeline-orchestration

A big data pipeline/analysis project. Orchestrated using Apache Airflow, the project also utilizes Kaggle, AWS S3, Glue, RedShift and Zoho Analytics to perform data scrubbing, ETL and visualization of Netflix subscription and title data.

Size: 1.95 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

Narius2030/Setup-Big-Data-Services

Documentation of a basic setup for Big Data services with Docker, implemented on premises (no cloud platform)

Language: Jupyter Notebook - Size: 28.9 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 2 - Forks: 0

Phadate/Wikipedia-football-data-engineering-pipeline

End-to-end data engineering pipeline that extracts Wikipedia data, processes it with Apache Airflow, stores in Azure Data Lake, and analyzes with Azure Synapse & Power BI

Language: Jupyter Notebook - Size: 203 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

rajibsalui/ETL-Airflow

It is an ETL pipeline with scheduled and orchestrated workflows using Apache Airflow. The raw data is ingested from a weather API, then processed and loaded into the PostgreSQL DB.

Language: Python - Size: 199 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0

astronomer/apache-airflow-providers-transfers

Language: Python - Size: 933 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 11 - Forks: 3

astronomer/astro-provider-databricks 📦

Orchestrate your Databricks notebooks in Airflow and execute them as Databricks Workflows

Language: Python - Size: 11.1 MB - Last synced at: 2 months ago - Pushed at: 3 months ago - Stars: 23 - Forks: 12

chiefnarx/Bakerite_Foods

Bakerite Foods is a data engineering project designed to orchestrate and automate data workflows for a food distribution company, using Apache Airflow and Azure Cloud Storage.

Language: Jupyter Notebook - Size: 217 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

1AyaNabil1/E-Commerce_ML_Data_Engineering_Pipeline_Snowflake

Built a full-stack data engineering pipeline on Snowflake to process and transform 10M+ daily e-commerce records for ML model training.

Language: Python - Size: 4.78 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 1

mdzaheerjk/Advanced_MLOPS_Project9_Iris-Flower-Classification2

🌸 Iris Flower Classification using End-to-End MLOps 🤖 Automated ML pipeline for predicting Iris species from measurements 🐳 Dockerized & Kubernetes-ready for scalable deployment 🌐 Flask web app for real-time species inference with modular code

Language: Jupyter Notebook - Size: 4.99 MB - Last synced at: about 2 months ago - Pushed at: 3 months ago - Stars: 7 - Forks: 0

mdzaheerjk/Advanced_Mlops_Project6_Australia_Weather_Rain_Predection

🌦️ Australia Weather Rain Prediction with Advanced MLOps 🤖 End-to-end ML pipeline for rain forecasting using Australian weather data 🐳 Dockerized and Kubernetes-ready for scalable deployment 🌐 Flask web app for real-time weather prediction with modular, reproducible code

Language: Jupyter Notebook - Size: 14.7 MB - Last synced at: about 2 months ago - Pushed at: 3 months ago - Stars: 7 - Forks: 0

mdzaheerjk/Advanced_Mlops_Project4_Custom_Guns_Object_Detection

🔫 Custom Guns Object Detection with MLOps 🤖 End-to-end pipeline: training, inference & experiment tracking 🗃️ DVC-powered data versioning and reproducible experiments 📊 Modular, configurable, and ready for research or security applications

Language: Jupyter Notebook - Size: 13.2 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 7 - Forks: 0

supakunz/Book-Revenue-Pipeline

A ready-to-use Docker-based template for data engineering projects, featuring a complete stack with Apache Airflow, Spark, and MinIO for building scalable data pipelines.

Language: Jupyter Notebook - Size: 2.28 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0

meetzaveri29/etl-weather-pipeline

Automated ETL pipeline built with Apache Airflow that extracts real-time weather data from Open-Meteo API, transforms it into structured format, and loads it into PostgreSQL database. Features Docker containerization, AWS deployment support, and comprehensive monitoring. Perfect example of modern data engineering practices.

Language: Python - Size: 1.65 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

idealista/airflow-role

Ansible role to install Apache Airflow

Language: YAML - Size: 311 KB - Last synced at: about 1 month ago - Pushed at: 3 months ago - Stars: 85 - Forks: 52

tkp-archive/paperboy

A web frontend for scheduling Jupyter notebook reports

Language: Python - Size: 12.5 MB - Last synced at: 3 months ago - Pushed at: about 1 year ago - Stars: 254 - Forks: 25

kaxil/airflowctl 📦

A CLI tool to streamline getting started with Apache Airflow™ and managing multiple Airflow projects

Language: Python - Size: 313 KB - Last synced at: 2 months ago - Pushed at: 7 months ago - Stars: 220 - Forks: 17