An open API service providing repository metadata for many open source software ecosystems.

GitHub / MuhammadHasaanWahid 1 Repository

Azure Data Engineer | ETL | Databricks | Python | SQL | PySpark

MuhammadHasaanWahid/Datalake-To-Database-Via-DataBricks

This project extracts data from a data lake and transfers it to Azure SQL Database via Azure Databricks, using Python (PySpark).

Language: Jupyter Notebook - Size: 8.22 MB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0
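
The data-lake-to-database flow described for Datalake-To-Database-Via-DataBricks can be sketched roughly as follows. This is a minimal illustration, not the repository's actual notebook: the function names `build_jdbc_options` and `copy_lake_to_sql`, the Parquet source format, and every server, table, and path value are assumptions. The `spark` argument stands for the SparkSession that a Databricks notebook provides.

```python
# Hypothetical sketch: all names, paths, and credentials are placeholders.
JDBC_DRIVER = "com.microsoft.sqlserver.jdbc.SQLServerDriver"


def build_jdbc_options(server, database, table, user, password):
    """Return Spark JDBC writer options for an Azure SQL Database table."""
    url = (
        f"jdbc:sqlserver://{server}.database.windows.net:1433;"
        f"database={database};encrypt=true;trustServerCertificate=false;"
        "loginTimeout=30;"
    )
    return {
        "url": url,
        "dbtable": table,
        "user": user,
        "password": password,
        "driver": JDBC_DRIVER,
    }


def copy_lake_to_sql(spark, source_path, jdbc_options, mode="append"):
    """Read files from the data lake and write them to the SQL table.

    The Parquet source format is an assumption; the listing does not
    say which file format the project reads.
    """
    df = spark.read.parquet(source_path)
    # Spark's built-in JDBC data source performs the write.
    df.write.format("jdbc").options(**jdbc_options).mode(mode).save()
    return df
```

In a notebook this might be called as `copy_lake_to_sql(spark, "abfss://<container>@<account>.dfs.core.windows.net/<path>", opts)`, where `opts` comes from `build_jdbc_options`. In practice the password would more likely be read from a secret store than written inline.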

MuhammadHasaanWahid/Csv-To-Parquet-And-Data-Reporting-Pipeline

This project extracts 800k records from a CSV file in Data Factory, converts them to Parquet, and finally builds a Power BI report on the Parquet data.

Size: 5.86 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 1 - Forks: 0

MuhammadHasaanWahid/Data-Filtering-Pipeline-ETL

This project extracts supply chain data from a CSV file (180k records, more than 40 columns) in an Azure Data Lake Storage Gen2 account, analyzes it with Python (pandas) to find the top 3 countries, filters the data to those countries, and writes the result back to the data lake as 3 files through an ETL pipeline built in ADF.

Language: Jupyter Notebook - Size: 8.79 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0
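
The filtering step described for Data-Filtering-Pipeline-ETL can be sketched as below. The repository uses pandas; this sketch swaps in the Python standard library so it runs without extra packages. The column name `Country`, the helper names, and the choice to rank countries by row count are all assumptions, since the listing does not say how the top 3 are measured.

```python
import csv
import io
from collections import Counter


def split_top_countries(rows, country_column="Country", top_n=3):
    """Group the rows of the `top_n` most frequent countries by country.

    Ranking by row count is an assumption made for this sketch.
    """
    rows = list(rows)
    counts = Counter(row[country_column] for row in rows)
    top = [country for country, _ in counts.most_common(top_n)]
    return {c: [r for r in rows if r[country_column] == c] for c in top}


def to_csv_text(rows):
    """Serialize a non-empty list of dict rows to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Each value in the returned dictionary corresponds to one of the three output files. Serializing each group with `to_csv_text` gives the content a pipeline would then write back to the data lake.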

MuhammadHasaanWahid/Data-Cleaning-Pipeline-ETL

This project extracts data from Azure Data Lake Storage Gen2, transforms it, and loads it into a SQL database.

Size: 127 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0