An open API service providing repository metadata for many open source software ecosystems.

GitHub / DadaNanjesha / Redshift-ETL-Project

The project covers the complete data pipeline: importing data from an RDS source into HDFS with Sqoop, processing the data with Spark, and running analytical queries on an AWS Redshift cluster.
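The first stage of the pipeline described above (RDS to HDFS via Sqoop) can be sketched as follows. This is an illustrative sketch only, not the repository's actual command: the JDBC URL, database engine (MySQL is assumed), table name, HDFS path, and credentials file are all hypothetical placeholders. The helper only assembles the argument list for a `sqoop import` invocation and does not run it.

```python
import shlex


def sqoop_import_args(jdbc_url, table, target_dir, username, password_file, num_mappers=1):
    """Assemble a `sqoop import` argument list (hypothetical helper, does not execute anything)."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,              # JDBC URL of the RDS source
        "--username", username,
        "--password-file", password_file,   # file holding the password, avoids putting it on the command line
        "--table", table,                   # source table to import
        "--target-dir", target_dir,         # HDFS directory that receives the imported files
        "--num-mappers", str(num_mappers),  # number of parallel map tasks
    ]


# All values below are placeholders, not taken from the repository.
args = sqoop_import_args(
    jdbc_url="jdbc:mysql://example-rds-host:3306/exampledb",
    table="example_table",
    target_dir="/user/example/raw/example_table",
    username="example_user",
    password_file="/user/example/.rds_password",
)
print(shlex.join(args))
```

Building the command as a list keeps each argument separate, so it could be passed to `subprocess.run` without shell quoting issues on a machine where Sqoop is installed.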

JSON API: http://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DadaNanjesha%2FRedshift-ETL-Project
PURL: pkg:github/DadaNanjesha/Redshift-ETL-Project
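The JSON API URL and PURL above follow a visible pattern: the `owner/name` pair is percent-encoded into a single path segment for the API, and used as-is in the PURL. A minimal sketch of that construction, assuming the pattern generalizes to other repositories on the same host (the function names are hypothetical, not part of any client library):

```python
from urllib.parse import quote

API_ROOT = "http://repos.ecosyste.ms/api/v1"


def repository_api_url(host: str, full_name: str) -> str:
    """Build the JSON API URL, encoding the slash in owner/name as %2F."""
    encoded_name = quote(full_name, safe="")  # safe="" forces "/" to be percent-encoded
    return f"{API_ROOT}/hosts/{quote(host, safe='')}/repositories/{encoded_name}"


def repository_purl(full_name: str) -> str:
    """Build the package URL (PURL) for a GitHub repository."""
    return f"pkg:github/{full_name}"


url = repository_api_url("GitHub", "DadaNanjesha/Redshift-ETL-Project")
purl = repository_purl("DadaNanjesha/Redshift-ETL-Project")
print(url)
print(purl)
```

The resulting URL can be fetched with any HTTP client; the response schema is not shown on this page, so the sketch stops at URL construction.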

Stars: 1
Forks: 0
Open issues: 0

License: MIT
Language: Jupyter Notebook
Size: 833 KB
Dependencies parsed at: Pending

Created at: 5 months ago
Updated at: 5 months ago
Pushed at: 5 months ago
Last synced at: 5 months ago

Topics: apache-spark, aws, data-engineering-etl-assignment, data-ingestion, data-pipeline, etl-processes, hdfs, rds, redshift, spark, sqoop
