GitHub / DadaNanjesha / Redshift-ETL-Project
The project covers the complete data pipeline: importing data from an RDS source into HDFS using Sqoop, processing the data with Spark, and running analytical queries on an AWS Redshift cluster.
JSON API: http://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DadaNanjesha%2FRedshift-ETL-Project
PURL: pkg:github/DadaNanjesha/Redshift-ETL-Project
Stars: 1
Forks: 0
Open issues: 0
License: MIT
Language: Jupyter Notebook
Size: 833 KB
Dependencies parsed at: Pending
Created at: 5 months ago
Updated at: 5 months ago
Pushed at: 5 months ago
Last synced at: 5 months ago
Topics: apache-spark, aws, data-engineering-etl-assignment, data-ingestion, data-pipeline, etl-processes, hdfs, rds, redshift, spark, sqoop
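The first stage named in the description, a Sqoop import from RDS into HDFS, can be sketched roughly as below. This is a minimal illustration and not the repository's actual code: the helper `build_sqoop_import`, the MySQL JDBC URL, the user name, the table name, and the HDFS paths are all hypothetical placeholders, and the RDS engine is assumed to be MySQL. The function only assembles the command line; it does not run Sqoop.

```python
from typing import List


def build_sqoop_import(jdbc_url: str, username: str, password_file: str,
                       table: str, target_dir: str,
                       num_mappers: int = 1) -> List[str]:
    """Return the argv list for a `sqoop import` of one table into HDFS.

    Hypothetical helper: it builds the command but does not execute it.
    """
    if num_mappers < 1:
        raise ValueError("num_mappers must be at least 1")
    return [
        "sqoop", "import",
        "--connect", jdbc_url,             # JDBC URL of the RDS instance
        "--username", username,
        "--password-file", password_file,  # file holding the DB password
        "--table", table,                  # source table in RDS
        "--target-dir", target_dir,        # destination directory in HDFS
        "--num-mappers", str(num_mappers), # number of parallel import tasks
    ]


if __name__ == "__main__":
    # All values below are placeholders, not taken from the repository.
    cmd = build_sqoop_import(
        jdbc_url="jdbc:mysql://example-rds-host:3306/exampledb",
        username="etl_user",
        password_file="/user/etl/.rds_password",
        table="transactions",
        target_dir="/user/etl/raw/transactions",
    )
    print(" ".join(cmd))
```

On a machine with Sqoop installed, the returned list could be passed to `subprocess.run(cmd, check=True)`. Keeping the command as a list rather than a single string avoids shell quoting issues with the JDBC URL.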