An open API service providing repository metadata for many open source software ecosystems.

GitHub / christopherkindl / twitter-data-pipeline-using-airflow-and-apache-spark

Data pipeline to process and analyse Twitter data in a distributed fashion using Apache Spark and Airflow in an AWS environment

JSON API: http://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/christopherkindl%2Ftwitter-data-pipeline-using-airflow-and-apache-spark

Stars: 7
Forks: 1
Open issues: 0

License: None
Language: Python
Size: 5.16 MB
Dependencies parsed at: Pending

Created at: about 4 years ago
Updated at: about 1 year ago
Pushed at: almost 4 years ago
Last synced at: about 1 year ago

Topics: airflow, apache-spark, aws, hadoop-filesystem, python3
