Ecosyste.ms: Repos
An open API service providing repository metadata for many open source software ecosystems.
GitHub / Tanay0510 / Data-Lake-with-Spark
Loads data from S3, processes it into analytics tables using Spark, and loads the tables back into S3. The Spark process is deployed on a cluster using AWS EMR.
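The description above outlines an extract, transform, load flow. A minimal PySpark sketch of that shape might look like the following. It is not the repository's actual code: the bucket names, the column list, the table name, the JSON input format, the Parquet output format, and the helpers `table_path`, `build_table`, and `main` are all assumptions made for illustration.

```python
from typing import Sequence


def table_path(output_root: str, table_name: str) -> str:
    """Build the S3 prefix under which one analytics table is written."""
    return f"{output_root.rstrip('/')}/{table_name}/"


def build_table(spark, input_glob: str, columns: Sequence[str],
                output_root: str, table_name: str) -> str:
    """Read raw JSON from S3, keep selected columns, write Parquet back to S3."""
    raw = spark.read.json(input_glob)               # extract: load raw records
    table = raw.select(*columns).dropDuplicates()   # transform: project and de-duplicate
    destination = table_path(output_root, table_name)
    table.write.mode("overwrite").parquet(destination)  # load: write the table out
    return destination


def main() -> None:
    # Imported here so the helpers above can be imported without Spark installed.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("data-lake-etl-sketch").getOrCreate()
    # Hypothetical buckets, columns, and table name; replace with real values.
    build_table(
        spark,
        "s3://example-input-bucket/raw/*.json",
        ["user_id", "first_name", "last_name"],
        "s3://example-output-bucket/analytics",
        "users",
    )
    spark.stop()
```

To run this on a cluster, one option is to add a standard `if __name__ == "__main__":` guard that calls `main()` and submit the file with `spark-submit`. The cluster would also need read and write access to the buckets involved.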
JSON API: https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Tanay0510%2FData-Lake-with-Spark
Stars: 1
Forks: 0
Open Issues: 0
License: None
Language: Python
Repo Size: 418 KB
Dependencies: 0
Created: almost 3 years ago
Updated: 5 months ago
Last pushed: almost 3 years ago
Last synced: 5 months ago
Topics: datalake, emr-cluster, etl-pipeline, s3, spark
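The JSON API URL listed above percent-encodes the slash in the repository's full name (`%2F`), so that the owner and name form a single path segment. A small sketch of building that URL and fetching the record with the Python standard library follows. The helper names `repository_url` and `fetch_repository` are hypothetical, the URL pattern is inferred from the single URL shown on this page, and the shape of the returned JSON is not assumed here.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

API_ROOT = "https://repos.ecosyste.ms/api/v1"


def repository_url(host: str, full_name: str) -> str:
    """Return the JSON API URL for a repository given as 'owner/name'."""
    # safe="" makes quote() encode "/" as %2F, keeping owner/name in one path segment.
    return (
        f"{API_ROOT}/hosts/{quote(host, safe='')}"
        f"/repositories/{quote(full_name, safe='')}"
    )


def fetch_repository(host: str, full_name: str, timeout: float = 10.0) -> dict:
    """Download and decode the repository record (performs a network request)."""
    with urlopen(repository_url(host, full_name), timeout=timeout) as response:
        return json.load(response)
```

Calling `repository_url("GitHub", "Tanay0510/Data-Lake-with-Spark")` reproduces the URL shown on this page. `fetch_repository` returns whatever JSON object the service sends back, so inspect its keys before relying on any particular field.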