An open API service providing repository metadata for many open source software ecosystems.

GitHub / mikecerton / UserInsight-Streaming-Data-Pipeline

UserInsight-Streaming-Data-Pipeline is a real-time pipeline that ingests API data into Kafka, processes it with Spark, stores it in S3, and uses AWS Lambda to load it into Redshift. The data is then used to create a dashboard in Looker. [Data Engineer]
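The final hop described above (S3 to Redshift via AWS Lambda) can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the function name `build_copy_statements`, the table name, the IAM role ARN, and the Parquet file format are all assumptions made for the example. It only turns an S3 event notification into Redshift `COPY` statements and does not connect to any AWS service.

```python
from urllib.parse import unquote_plus


def build_copy_statements(event, table, iam_role):
    """Build one Redshift COPY statement per S3 object in an S3 event.

    Hypothetical helper: `table`, `iam_role`, and the Parquet format are
    assumptions for illustration, not taken from the repository.
    """
    statements = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        statements.append(
            f"COPY {table} FROM 's3://{bucket}/{key}' "
            f"IAM_ROLE '{iam_role}' FORMAT AS PARQUET;"
        )
    return statements


# Example S3 event with placeholder values.
sample_event = {
    "Records": [
        {
            "s3": {
                "bucket": {"name": "example-bucket"},
                "object": {"key": "users/part+0001.parquet"},
            }
        }
    ]
}

statements = build_copy_statements(
    sample_event,
    table="user_events",
    iam_role="arn:aws:iam::123456789012:role/example-redshift-role",
)
for sql in statements:
    print(sql)
```

Inside a real Lambda handler, each statement would then be submitted to Redshift, for instance through the Redshift Data API; that step is omitted here because the cluster, database, and credentials are not described in this listing.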

JSON API: http://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mikecerton%2FUserInsight-Streaming-Data-Pipeline
PURL: pkg:github/mikecerton/UserInsight-Streaming-Data-Pipeline

Stars: 1
Forks: 0
Open issues: 0

License: None
Language: Python
Size: 325 KB
Dependencies parsed at: Pending

Created at: 3 months ago
Updated at: about 2 months ago
Pushed at: about 2 months ago
Last synced at: about 2 months ago

Topics: apache-kafka, apache-spark, aws, data-engineer, data-pipeline, docker-compose
