GitHub / christopherkindl / twitter-data-pipeline-using-airflow-and-apache-spark
Data pipeline to process and analyse Twitter data in a distributed fashion using Apache Spark and Airflow in an AWS environment
Stars: 7
Forks: 1
Open issues: 0
License: None
Language: Python
Size: 5.16 MB
Created at: about 4 years ago
Updated at: about 1 year ago
Pushed at: almost 4 years ago
Last synced at: about 1 year ago
Topics: airflow, apache-spark, aws, hadoop-filesystem, python3