An open API service providing repository metadata for many open source software ecosystems.

GitHub / MuhammadHasaanWahid / Data-Filtering-Pipeline-ETL

This project extracts supply chain data from a CSV file (180k records and more than 40 columns) stored in an Azure Data Lake Storage Gen2 account, analyzes it with Python (Pandas) to find the top 3 countries, filters the data down to those countries, and writes the result back to the data lake as 3 files through an ETL pipeline built in Azure Data Factory (ADF).
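The Pandas step described above could look roughly like the sketch below. It is not taken from the repository: the column name `Country`, the helper names `split_top_countries` and `write_subsets`, and the ranking of countries by record count are all assumptions made for illustration, and a small in-memory table stands in for the 180k-row CSV.

```python
import tempfile
from pathlib import Path

import pandas as pd


def split_top_countries(df, country_col="Country", n=3):
    """Return {country: rows for that country} for the n most frequent countries.

    Assumption: "top" means most records; the source does not state the metric.
    """
    top = df[country_col].value_counts().head(n).index.tolist()
    return {country: df[df[country_col] == country] for country in top}


def write_subsets(subsets, out_dir):
    """Write one CSV per country into out_dir and return the written paths."""
    out_dir = Path(out_dir)
    paths = []
    for country, subset in subsets.items():
        path = out_dir / f"{country}.csv"
        subset.to_csv(path, index=False)
        paths.append(path)
    return paths


# Hypothetical sample data standing in for the real supply chain file.
sample = pd.DataFrame({
    "Country": ["US", "US", "US", "FR", "FR", "MX", "MX", "MX", "MX", "DE"],
    "Sales": [10, 20, 30, 40, 50, 60, 70, 80, 90, 100],
})

subsets = split_top_countries(sample)

# Write to a temporary directory here; in the described project the three
# files end up in the data lake via the ADF pipeline instead.
with tempfile.TemporaryDirectory() as tmp:
    written = [p.name for p in write_subsets(subsets, tmp)]
```

In a real run the sample frame would be replaced by `pd.read_csv(...)` on the file pulled from the storage account, and the country column would be whichever field the dataset actually uses.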

JSON API: http://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MuhammadHasaanWahid%2FData-Filtering-Pipeline-ETL
PURL: pkg:github/MuhammadHasaanWahid/Data-Filtering-Pipeline-ETL

Stars: 0
Forks: 0
Open issues: 0

License: None
Language: Jupyter Notebook
Size: 8.79 KB
Dependencies parsed at: Pending

Created at: about 2 years ago
Updated at: about 2 years ago
Pushed at: about 2 years ago
Last synced at: about 2 years ago

Topics: azure, dataengineering, datafactory, datafiltering, datalake, etl, microsoft, microsoft-azure
