Ecosyste.ms: Repos

An open API service providing repository metadata for many open source software ecosystems.

GitHub / Tanay0510 / Data-Lake-with-Spark

Loads data from S3, processes it into analytics tables using Spark, and loads the tables back into S3. The Spark process is deployed on a cluster using AWS EMR.

JSON API: https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Tanay0510%2FData-Lake-with-Spark
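The JSON API URL above embeds the repository's full name with the slash percent-encoded as %2F. A minimal sketch of how such a URL might be built and fetched with the Python standard library follows. The helper names repo_api_url and fetch_repo are hypothetical, the path layout is inferred only from the single URL shown on this page, and nothing is assumed about the fields of the response.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Base path inferred from the one JSON API URL listed on this page (assumption).
API_BASE = "https://repos.ecosyste.ms/api/v1"


def repo_api_url(host: str, full_name: str) -> str:
    """Build the JSON API URL for a repository (hypothetical helper).

    safe="" makes quote() encode "/" as %2F, so "owner/name" stays a
    single path segment, matching the URL shown above.
    """
    return (
        f"{API_BASE}/hosts/{quote(host, safe='')}"
        f"/repositories/{quote(full_name, safe='')}"
    )


def fetch_repo(host: str, full_name: str, timeout: float = 10.0) -> dict:
    """Fetch and decode the repository record (hypothetical helper).

    Requires network access; the shape of the returned JSON is not
    assumed here.
    """
    with urlopen(repo_api_url(host, full_name), timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Only prints the URL; call fetch_repo(...) to perform the request.
    print(repo_api_url("GitHub", "Tanay0510/Data-Lake-with-Spark"))
```

Encoding the full name with safe="" matters because the default behavior of quote leaves "/" untouched, which would split the owner and repository name into two path segments.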

Stars: 1
Forks: 0
Open Issues: 0

License: None
Language: Python
Repo Size: 418 KB
Dependencies: 0

Created: almost 3 years ago
Updated: 5 months ago
Last pushed: almost 3 years ago
Last synced: 5 months ago

Topics: datalake, emr-cluster, etl-pipeline, s3, spark
