Ecosyste.ms: Repos

An open API service providing repository metadata for many open source software ecosystems.

GitHub / WALIDAADI / ETL_using_Airflow

This project implements a data pipeline that uses Apache Airflow for orchestration, Apache Kafka for real-time data streaming, and MongoDB for storage. It scrapes data about large companies from the web, transforms it, and stores the results in MongoDB.
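
The flow described above (scrape, stream, transform, store) can be illustrated with a minimal, standard-library-only sketch. This is not the repository's code: the function names, the record fields, and the sample rows are hypothetical placeholders, a `queue.Queue` stands in for a Kafka topic, and a plain list stands in for a MongoDB collection.

```python
import json
import queue


def scrape_companies():
    """Stand-in for the web-scraping step (hypothetical, made-up sample rows)."""
    return [
        {"name": " Example Corp ", "revenue": "1,200"},
        {"name": "Sample Inc", "revenue": "850"},
    ]


def produce(records, topic):
    """Serialize each record to JSON bytes and publish it to the stand-in topic."""
    for record in records:
        topic.put(json.dumps(record).encode("utf-8"))


def transform(raw):
    """Decode one message and normalize its fields (assumed cleaning rules)."""
    record = json.loads(raw.decode("utf-8"))
    return {
        "name": record["name"].strip(),
        "revenue": int(record["revenue"].replace(",", "")),
    }


def consume_and_store(topic, collection):
    """Drain the stand-in topic and append cleaned records to the stand-in collection."""
    while not topic.empty():
        collection.append(transform(topic.get()))


def run_pipeline():
    """Run the stages in order; an orchestrator would schedule them as separate tasks."""
    topic = queue.Queue()   # stands in for a Kafka topic
    collection = []         # stands in for a MongoDB collection
    produce(scrape_companies(), topic)
    consume_and_store(topic, collection)
    return collection
```

In an actual deployment, each stage would presumably be its own Airflow task, with a real Kafka producer and consumer and a MongoDB client in place of the stand-ins used here.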

JSON API: https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/WALIDAADI%2FETL_using_Airflow
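
The JSON API URL above percent-encodes the slash in the repository's full name (`WALIDAADI%2FETL_using_Airflow`). A small sketch of how such a URL could be built follows; the helper name `repository_api_url` is an assumption for illustration, and the path layout is inferred only from the single URL shown on this page.

```python
from urllib.parse import quote

# Root of the API, taken from the URL shown on this page.
API_ROOT = "https://repos.ecosyste.ms/api/v1"


def repository_api_url(host, full_name):
    """Build the JSON API URL for a repository (hypothetical helper).

    safe="" makes quote() encode "/" as %2F, so "owner/repo" becomes
    a single path segment.
    """
    return (
        f"{API_ROOT}/hosts/{quote(host, safe='')}"
        f"/repositories/{quote(full_name, safe='')}"
    )


url = repository_api_url("GitHub", "WALIDAADI/ETL_using_Airflow")
print(url)
```

The resulting string can be passed to any HTTP client to fetch the repository's metadata as JSON.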

Stars: 0
Forks: 0
Open Issues: 0

License: None
Language:
Repo Size: 69.3 KB
Dependencies: 0

Created: 3 months ago
Updated: 3 months ago
Last pushed: 3 months ago
Last synced: 3 months ago

Topics: airflow-dags, docker, pipeline
