GitHub / ZeroTwoDataRW / DE-Stream-Project-Random-Generated-User-Data
An end-to-end data engineering pipeline that orchestrates data ingestion, processing, and storage using Apache Airflow, Python, Apache Kafka, Apache ZooKeeper, Apache Spark, and Cassandra. All components are containerized with Docker for easy deployment and scalability.
JSON API: http://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ZeroTwoDataRW%2FDE-Stream-Project-Random-Generated-User-Data
PURL: pkg:github/ZeroTwoDataRW/DE-Stream-Project-Random-Generated-User-Data
Stars: 1
Forks: 0
Open issues: 0
License: None
Language: Python
Size: 393 MB
Dependencies parsed at: Pending
Created at: over 1 year ago
Updated at: over 1 year ago
Pushed at: over 1 year ago
Last synced at: about 1 year ago
Topics: airflow, apachespark, cassandra-database, docker, kafka, postgesql, python