An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: data-lak

NitinSPatil15/Project-4-Data-Lake-with-AWS-EMR

An ETL pipeline that extracts data from S3, processes it with Spark, and loads it back into S3 as a set of dimensional tables.

Language: Python - Size: 601 KB - Last synced at: about 1 year ago - Pushed at: about 5 years ago - Stars: 2 - Forks: 4
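
For readers unfamiliar with the pattern this repository describes, the following is a minimal sketch of an S3-to-Spark-to-S3 ETL job, not the repository's actual code. The bucket paths, table names, column names, and the helpers `table_path` and `run_etl` are all hypothetical, and the sketch assumes the raw data is JSON and the output is Parquet.

```python
# Hypothetical sketch: paths, table names, and columns are illustrative
# assumptions and are not taken from the repository.
from posixpath import join

# Assumed dimensional model: table name -> columns selected from the raw data.
DIMENSION_TABLES = {
    "users": ["user_id", "first_name", "last_name"],
    "items": ["item_id", "title", "category"],
}


def table_path(output_root: str, table: str) -> str:
    """Return the output location for one dimensional table."""
    return join(output_root.rstrip("/"), table)


def run_etl(input_path: str, output_root: str) -> None:
    """Read raw JSON from S3, derive each dimension table, write Parquet to S3."""
    # Imported lazily so the helper above works without a Spark installation.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("data-lake-etl-sketch").getOrCreate()
    raw = spark.read.json(input_path)  # extract
    for table, columns in DIMENSION_TABLES.items():
        # transform: keep the table's columns and drop duplicate rows
        dim = raw.select(*columns).dropDuplicates()
        # load: one Parquet dataset per dimensional table
        dim.write.mode("overwrite").parquet(table_path(output_root, table))
    spark.stop()
```

On a cluster with S3 access configured, a call such as `run_etl("s3a://example-input-bucket/raw/", "s3a://example-output-bucket/tables/")` would write one Parquet dataset per table; both bucket names are placeholders.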