Ecosyste.ms: Repos
An open API service providing repository metadata for many open source software ecosystems.
GitHub / WALIDAADI / ETL_using_Airflow
This project implements a data pipeline that uses Apache Airflow for orchestration, Apache Kafka for real-time data streaming, and MongoDB for data storage. It automates web scraping to collect data on large companies, transforms and processes that data, and then stores the results.
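The description names three stages (scrape, transform, store) with a stream between them. The sketch below illustrates only that flow in plain Python and is not the repository's code: an in-memory `deque` stands in for a Kafka topic, a list stands in for a MongoDB collection, and the function names and sample records are hypothetical placeholders.

```python
from collections import deque


def scrape_companies() -> list[dict]:
    """Hypothetical stand-in for the web-scraping step (placeholder rows)."""
    return [
        {"name": " Example Corp ", "revenue": "1,200"},
        {"name": "Sample Inc", "revenue": "950"},
    ]


def transform(record: dict) -> dict:
    """Trim whitespace from the name and parse the revenue string as an int."""
    return {
        "name": record["name"].strip(),
        "revenue": int(record["revenue"].replace(",", "")),
    }


def run_pipeline() -> list[dict]:
    stream = deque()  # assumption: stands in for a Kafka topic
    store = []        # assumption: stands in for a MongoDB collection

    # Producer side: publish each scraped record to the stream.
    for raw in scrape_companies():
        stream.append(raw)

    # Consumer side: read records in arrival order, transform, then store.
    while stream:
        store.append(transform(stream.popleft()))

    return store


if __name__ == "__main__":
    for document in run_pipeline():
        print(document)
```

In a real deployment each of these functions would presumably be a separate orchestrated task, with the queue and list replaced by actual Kafka and MongoDB clients.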
JSON API: https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/WALIDAADI%2FETL_using_Airflow
Stars: 0
Forks: 0
Open Issues: 0
License: None
Language:
Repo Size: 69.3 KB
Dependencies: 0
Created: 3 months ago
Updated: 3 months ago
Last pushed: 3 months ago
Last synced: 3 months ago
Topics: airflow-dags, docker, pipeline
No dependencies found
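The JSON API URL listed for this repository percent-encodes the slash between owner and repository name (`%2F`). A minimal sketch of building such a URL follows, assuming the path layout seen in that one URL generalizes to other hosts and repositories; the helper name `repository_api_url` is hypothetical and not part of any published client.

```python
from urllib.parse import quote

# Base path taken from the JSON API URL shown in this listing.
API_BASE = "https://repos.ecosyste.ms/api/v1"


def repository_api_url(host: str, full_name: str) -> str:
    """Build the repository metadata URL for an "owner/name" repository."""
    # safe="" makes quote() encode "/" as "%2F" so the full name
    # occupies a single path segment.
    encoded_name = quote(full_name, safe="")
    return f"{API_BASE}/hosts/{host}/repositories/{encoded_name}"


if __name__ == "__main__":
    print(repository_api_url("GitHub", "WALIDAADI/ETL_using_Airflow"))
```

Passing `safe=""` matters here: by default `quote` leaves `/` unencoded, which would split the owner and repository name into two path segments.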