GitHub / NitinSPatil15 / Project-4-Data-Lake-with-AWS-EMR
An ETL pipeline that extracts data from S3, processes it with Spark, and loads it back into S3 as a set of dimensional tables
Stars: 2
Forks: 4
Open issues: 0
License: None
Language: Python
Size: 601 KB
Dependencies parsed at: Pending
Created at: almost 5 years ago
Updated at: about 1 year ago
Pushed at: almost 5 years ago
Last synced at: about 1 year ago
Topics: aws-emr, data-lak, etl-pipeline, pyspark, s3-bucket
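
The extract, process, and load flow named in the description could look roughly like the PySpark sketch below. It is not taken from the repository: the function names `table_path` and `run_pipeline`, the `users` table, the column names, the JSON input format, and the Parquet output format are all assumptions made for illustration.

```python
# Hypothetical sketch: paths, table names, column names, and file formats
# are assumptions, not taken from the repository.

def table_path(output_root, table_name):
    """Join an output root (e.g. an s3a:// URI) and a table name into a folder path."""
    return f"{output_root.rstrip('/')}/{table_name}/"


def run_pipeline(input_path, output_root):
    # Imported here so table_path can be used without PySpark installed.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("data-lake-etl-sketch").getOrCreate()

    # Extract: read raw records from S3 (JSON input is assumed).
    raw = spark.read.json(input_path)

    # Transform: project one dimension table and keep one row per key.
    # The column names below are placeholders.
    users = (
        raw.select("user_id", "first_name", "last_name")
        .dropDuplicates(["user_id"])
    )

    # Load: write the table back to S3 (Parquet output is assumed).
    users.write.mode("overwrite").parquet(table_path(output_root, "users"))

    spark.stop()
```

On an EMR cluster this might be invoked as `run_pipeline("s3a://example-input-bucket/raw/", "s3a://example-output-bucket/tables/")`, where both bucket names are placeholders. Writing each table under its own folder keeps the dimensional tables independently readable.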