GitHub topics: azure-data-lake
gargmukul91066/Adventure-Works-Azure-Data-Engineering-Project
Azure End To End Data Engineering Project
Language: Jupyter Notebook - Size: 4.04 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 0 - Forks: 0

Phadate/Wikipedia-football-data-engineering-pipeline
End-to-end data engineering pipeline that extracts Wikipedia data, processes it with Apache Airflow, stores it in Azure Data Lake, and analyzes it with Azure Synapse & Power BI
Size: 2.93 KB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 0 - Forks: 0

robinrodricks/FluentStorage
A polycloud .NET cloud storage abstraction layer. Provides Blob storage (AWS S3, GCP, FTP, SFTP, Azure Blob/File/Event Hub/Data Lake) and Messaging (AWS SQS, Azure Queue/ServiceBus). Supports .NET 5+ and .NET Standard 2.0+. Pure C#.
Language: C# - Size: 36.8 MB - Last synced at: 10 days ago - Pushed at: about 2 months ago - Stars: 366 - Forks: 56

cloudyr/AzureStor
Interface to Azure storage accounts. Submit issues and PRs at https://github.com/Azure/AzureStor
Language: R - Size: 754 KB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 22 - Forks: 2

Azure/AzureStor
R interface to Azure storage accounts
Language: R - Size: 789 KB - Last synced at: about 24 hours ago - Pushed at: 13 days ago - Stars: 69 - Forks: 21

ashwin-patil/threat-hunting-with-notebooks
Repository with Sample threat hunting notebooks on Security Event Log Data Sources
Language: Jupyter Notebook - Size: 1.35 MB - Last synced at: about 1 month ago - Pushed at: almost 3 years ago - Stars: 65 - Forks: 11

ewdlop/AzureNote.md
AzureNote. https://azure.status.microsoft/en-us/status
Size: 29.3 KB - Last synced at: 26 days ago - Pushed at: 6 months ago - Stars: 0 - Forks: 1

s-yazhini/Hexa-DE-Main-Project
Data engineering main project 1
Language: Jupyter Notebook - Size: 15.5 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

jotstolu/Azure-Data-WareHouse-Project-Using-Azure-Synapse-Analytics
This project involves building an e-commerce order data warehouse on Azure Synapse Analytics, leveraging Azure Data Lake Storage Gen2, Synapse Pipelines, Data Flows, and Serverless SQL Pools.
Size: 7.81 KB - Last synced at: 22 days ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

oleewere/fluent-plugin-azurestorage-gen2
Fluentd output plugin for Azure Data Lake Storage Gen2 (append support)
Language: Ruby - Size: 95.7 KB - Last synced at: about 1 month ago - Pushed at: 12 months ago - Stars: 9 - Forks: 5

MicrosoftCloudEssentials-LearningHub/MS-Fabric-Essentials-Workshop
Fabric Basic Workshop. These guides elaborate on the standard architecture and features commonly used across industries.
Language: Jupyter Notebook - Size: 554 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 1 - Forks: 0

kahing/goofys
a high-performance, POSIX-ish Amazon S3 file system written in Go
Language: Go - Size: 4.69 MB - Last synced at: 4 months ago - Pushed at: about 1 year ago - Stars: 5,327 - Forks: 530

Mohitsai/future-of-hiring
Automated ETL pipeline in Azure for job market analysis using Terraform, Azure Functions, Azure Databricks, Azure Data Lake, and Power BI
Language: HCL - Size: 20.5 KB - Last synced at: 2 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

dataplat/AzureDataPipelineTools
A collection of Azure Functions to make building Azure Data Factory pipelines simpler and easier.
Language: C# - Size: 212 KB - Last synced at: 4 months ago - Pushed at: almost 4 years ago - Stars: 12 - Forks: 5

justBlindbaek/TraditionalModernDW
Simple cloud-only DWH solution architecture.
Language: TSQL - Size: 108 KB - Last synced at: 6 days ago - Pushed at: over 2 years ago - Stars: 40 - Forks: 8

tahir007malik/ecommerceDataStreamingAnalytics
This repository features a production-grade data pipeline leveraging Confluent Kafka for real-time collection of e-commerce clickstream and user activity data.
Language: Jupyter Notebook - Size: 558 KB - Last synced at: 5 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

syedhassaanahmed/databricks-notebooks
Collection of Databricks and Jupyter Notebooks
Language: Jupyter Notebook - Size: 742 KB - Last synced at: 5 months ago - Pushed at: over 1 year ago - Stars: 21 - Forks: 15

shudhanshurp/News_Recommendation_System
This repository presents a News Recommendation System using Azure Data Factory, Azure Databricks, and Azure Data Lake to create a data pipeline for ML models. It uses BERT for content-based filtering, Neural Collaborative Filtering for user behaviors, and a hybrid model that combines both to enhance news recommendations.
Language: Jupyter Notebook - Size: 55.9 MB - Last synced at: 6 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

Sivaprasad-V/Tokyo-Olympics-Azure-Data-Engineering-Project
Azure End To End Data Engineering Project
Language: Jupyter Notebook - Size: 358 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

Sivaprasad-V/NYC-TAXI-Azure-Data-Engineering-Project
Azure End To End Data Engineering Project
Language: Jupyter Notebook - Size: 17.4 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

Sivaprasad-V/Adventure-Works-Azure-Data-Engineering-Project
Azure End To End Data Engineering Project
Language: Jupyter Notebook - Size: 2.92 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

gudashashank/tokyo-olympics-analysis
An Azure cloud-based data analytics solution that processes and visualizes the 2021 Tokyo Olympics dataset. This end-to-end pipeline leverages Azure Data Factory for data ingestion, Data Lake Storage Gen2 for secure storage, Databricks for data transformation, Synapse Analytics for SQL querying, and Power BI for interactive visualization
Size: 1.18 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

lamiaaali/DEPI-Graduation-Project
SkinCare Sentiment Analysis Reviews
Language: Jupyter Notebook - Size: 7.72 MB - Last synced at: 3 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 2

arsenvlad/docker-presto-adls-wasb
Example of a single node Presto with Azure Data Lake Store (ADLS) and Azure Storage Blob (WASB) access via Hive metastore
Language: Dockerfile - Size: 170 KB - Last synced at: 13 days ago - Pushed at: about 5 years ago - Stars: 19 - Forks: 16

zBalachandar/Tokyo-Olympic-Data-Analytics-Azure-End-To-End-Data-Engineering-Project-12
Tokyo-olympic-azure-data-engineering-end-to-end-project
Language: HTML - Size: 44.5 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

randyroac/azure-databricks-etl-project
ETL motor racing data project using Azure Databricks, PySpark, and Azure Data Lake
Language: Python - Size: 1.52 MB - Last synced at: about 1 year ago - Pushed at: about 3 years ago - Stars: 1 - Forks: 2

segovoni/azure-data-lake-store-delphi
Microsoft Azure Data Lake Store Library for Delphi
Language: Pascal - Size: 72.3 KB - Last synced at: 6 months ago - Pushed at: almost 5 years ago - Stars: 15 - Forks: 4

Jcardif/SerengetiDataLab
An E2E solution of the Data Resources on Azure using the Snapshot Serengeti dataset. This E2E solution focuses on Azure Synapse Analytics, Power BI, and Azure Data Factory.
Language: Bicep - Size: 13 MB - Last synced at: 6 months ago - Pushed at: about 2 years ago - Stars: 9 - Forks: 8

rheaacharya77/ETL-Olympics
ETL pipeline tailored for Olympics data
Language: Python - Size: 606 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

ahmedlrashed/E2E-Azure-Pipeline
Databricks ETL Pipeline for retrieving and processing NI TestStand test results, featuring a well-documented notebook for ETL operations, Data Lake for storage, Spark SQL+Python for transformations, and Power BI as the final visualization of factory metrics.
Language: Jupyter Notebook - Size: 1.22 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

mihirkudale/Olympic-data-analysis-azure-data-engineering-project
Language: Jupyter Notebook - Size: 143 KB - Last synced at: 6 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

philnandreoli/metadataingestion
This is an event-driven metadata ingestion tool that I am building on Azure, leveraging several Azure PaaS services.
Language: JavaScript - Size: 703 KB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

mganta/adls-spark-examples
spark adls read write
Language: Scala - Size: 893 KB - Last synced at: over 1 year ago - Pushed at: over 8 years ago - Stars: 0 - Forks: 0

sakethmukkanti/Machinery-Moniter-Iot-Streaming-With-Azure
An application developed to give real-time insights into machine health using IoT sensors by tracking and monitoring parameters such as temperature, pressure, current, and humidity.
Language: Jupyter Notebook - Size: 210 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

sakethmukkanti/Demand-Navigator-Real-Time-Streaming-with-Azure
A real-time application that guides cab drivers looking for rides toward the areas of a city experiencing higher demand.
Language: Jupyter Notebook - Size: 156 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

RJ-Raj/IoT-Data-Pipeline
This repository contains code for an end-to-end IoT data pipeline using Azure services. It ingests, processes, and stores IoT device data from AWS S3 to Azure Data Lake Storage and Azure SQL Database, leveraging Azure Data Factory and Azure Functions for seamless integration and automation.
Language: Python - Size: 14.6 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

sakethmukkanti/Movielens-Dataset-Analysis-Azure-Data-Engineering-Project
Created a movie recommendation system on Azure utilizing Spark SQL for analyzing the MovieLens dataset.
Language: Jupyter Notebook - Size: 1.6 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

softwaresalt/blog
Data Engineering & Software Blog
Size: 4.96 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

vrup0408/IPL-Data-Analytics
We have an IPL dataset covering 2008 to 2020 and need to visualize analytics on a Power BI dashboard. The dataset is uploaded into a data lake, then processed through a pipeline to produce modeled data in a warehouse, so that the data can be analyzed in Power BI through pre-defined dashboards.
Size: 3.62 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

eminencegrs/azure-integration
A list of samples for integration of a .NET application with various Azure cloud services.
Language: C# - Size: 20.5 KB - Last synced at: 6 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

just-modeling/jupyterhub-k8s-apache-spark
Deploy Apache Spark in client mode on a Kubernetes cluster and integrate it with Jupyter notebooks through a JupyterHub server.
Language: Shell - Size: 612 KB - Last synced at: over 1 year ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

ROBROICH/SAP_AND_COMMON_DATA_MODEL_DEMO
This demo describes the basic integration between S/4HANA and the Microsoft Common Data Model (Model)
Size: 4.24 MB - Last synced at: over 1 year ago - Pushed at: over 5 years ago - Stars: 16 - Forks: 2

OptimChain/Cloud_Hydroponics
Cloud-based sensing solution for flow telemetry. Shown here is an early-stage sensing prototype with Azure-based alerting and app deployment.
Language: Python - Size: 83 KB - Last synced at: almost 2 years ago - Pushed at: about 4 years ago - Stars: 0 - Forks: 1

poojatripathi06/Covid-reporting-adf
Building a real-world data pipeline in Azure Data Factory (ADF) using a dataset provided by https://www.ecdc.europa.eu/. The pipeline ingests data from sources such as HTTP and Azure Blob Storage into Azure Data Lake Gen2 using ADF, then transforms the data with a Databricks Notebook Activity in ADF and loads the transformed data into Azure Data Lake Storage Gen2.
Size: 121 KB - Last synced at: almost 2 years ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

AdamPaternostro/Azure-Big-Data-and-Machine-Learning-Architecture
A ready-to-use architecture for processing data and performing machine learning in Azure
Language: C# - Size: 10.7 MB - Last synced at: 5 days ago - Pushed at: about 5 years ago - Stars: 8 - Forks: 3

tomkerkhove/gdpr-with-azure 📦
Scenarios on how you can be GDPR compliant by using Azure services
Language: C# - Size: 1.19 MB - Last synced at: 6 days ago - Pushed at: over 6 years ago - Stars: 6 - Forks: 2

SurajSomani14/Read-And-Filter-Datalake-Files-Data
This Azure Function reads multiple files from a given data lake folder, deserializes the data, and merges the data from all files together. It can apply filters to the data and respond with the filtered data in the requested format.
Language: C# - Size: 159 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

anjijava16/Spark_Multi_Cloud_Storage_Utils
Spark utilities for reading and writing data from and to multiple clouds (GCP, Azure, and AWS)
Language: HTML - Size: 5.59 MB - Last synced at: about 2 years ago - Pushed at: almost 4 years ago - Stars: 0 - Forks: 0

venkatakamaiah46/Azure
POC projects working on Cloud Platforms
Language: HTML - Size: 208 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 1 - Forks: 0

AnthonyByansi/Azure-Data-Fundamentals-Guide
A comprehensive guide to understanding and implementing data management and analytics solutions in the Azure ecosystem using Azure Data Fundamentals.
Language: Mermaid - Size: 74.2 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 10 - Forks: 3

Abdelrahman13-coder/Data-Integration-Pipelines-for-NYC-Payroll-Data-Analytics
Size: 3.97 MB - Last synced at: 3 months ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

Jayvardhan-Reddy/Azure-Certification-DP-201
Road to Azure Data Engineer Part-II: DP-201 - Designing an Azure Data Solution
Size: 3.59 MB - Last synced at: about 2 years ago - Pushed at: about 5 years ago - Stars: 17 - Forks: 15

epomatti/az-datalake
Azure Data Lake Gen2 with azcopy
Language: HCL - Size: 3.91 KB - Last synced at: about 2 months ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

haxxorsid/flink-datalake-bulk-upload 📦
Bulk image streaming and upload using Flink (+ Kubernetes), Kafka, Data Lake, and SQL (Provided with React UI and Node server for Demo).
Language: Java - Size: 417 KB - Last synced at: over 2 years ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 0

jksinghpro/kafka-connect-adl
Kafka Connect Connector for ADLS (Azure Data Lake Store)
Language: Java - Size: 23.4 KB - Last synced at: over 2 years ago - Pushed at: almost 7 years ago - Stars: 7 - Forks: 0

Watts-Energy/Watts.Azure
A collection of utilities for working with Azure Batch, Azure Data Factory, Azure Table Storage and Azure Blob Storage.
Language: C# - Size: 281 KB - Last synced at: over 2 years ago - Pushed at: over 7 years ago - Stars: 4 - Forks: 1

amynic/TechHer
Repo containing files for the TechHer event and the 'Let your Data tell you the Real Story: Advanced Analytics on Azure' hands-on lab
Size: 38.9 MB - Last synced at: over 2 years ago - Pushed at: about 4 years ago - Stars: 10 - Forks: 6

xpertdev/tdameritrade-streaming
Streaming order book data using TD Ameritrade API
Language: Python - Size: 71.3 KB - Last synced at: over 2 years ago - Pushed at: over 3 years ago - Stars: 8 - Forks: 5

semashkinvg/Azure.HowTos
Language: C# - Size: 2.64 MB - Last synced at: over 2 years ago - Pushed at: over 7 years ago - Stars: 3 - Forks: 0

briandenicola/azure-data-services
A repository to continue my education on Azure Data Services.
Language: Python - Size: 1.18 MB - Last synced at: 6 months ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

Inmapg/data-lake-compaction
Batch process that compacts different Parquet files stored in Azure Data Lake Storage, following the requirements specified in the README.
Language: Scala - Size: 14.6 KB - Last synced at: over 2 years ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 1

navicore/navilake
An Akka Streams source of Azure Data Lake data
Language: Scala - Size: 280 KB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 1

AdamPaternostro/Azure-Databricks-External-Hive-and-ADLS
Shows how to use an External Hive (SQL Server) along with ADLS Gen 1 as part of a Databricks initialization script that runs when the cluster is created.
Language: PowerShell - Size: 13.7 KB - Last synced at: 5 months ago - Pushed at: almost 7 years ago - Stars: 2 - Forks: 1

Data-Culpa/dataculpa-azure-datalake-gen2
Azure Data Lake Gen2 storage connectors for Data Culpa - monitor data quality automatically with Data Culpa Validator
Language: Python - Size: 28.3 KB - Last synced at: 1 day ago - Pushed at: over 4 years ago - Stars: 1 - Forks: 0

mail4hafij/Azure-DataLake-DataBricks
The idea is to connect to ADL storage (Azure Data Lake) from a Databricks cluster and run some Scala scripts on the ADL data.
Language: Scala - Size: 165 KB - Last synced at: over 2 years ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 1

amperity/blocks-adl
Content-addressable Azure Data Lake block store
Language: Clojure - Size: 36.1 KB - Last synced at: about 1 month ago - Pushed at: over 5 years ago - Stars: 1 - Forks: 1

gostranger/Streaming-Web-UI
Language: CSS - Size: 4.44 MB - Last synced at: over 2 years ago - Pushed at: over 5 years ago - Stars: 0 - Forks: 0

syedhassaanahmed/azure-kafka-spark-adls
Azure ARM template to deploy Kafka and Spark clusters in same VNet with ADLS
Language: Shell - Size: 8.79 KB - Last synced at: 5 months ago - Pushed at: over 7 years ago - Stars: 5 - Forks: 0

AdamPaternostro/Azure-HDInsight-ARM-Template
Creates an HDInsight cluster that has an external Hive metastore and access to Azure Data Lake Store
Size: 63.5 KB - Last synced at: 3 months ago - Pushed at: almost 7 years ago - Stars: 1 - Forks: 0

AdamPaternostro/Azure-HDI-DistCP
Creates an HDInsight cluster, then runs distcp remotely to copy data between blob and/or data lake (ADLS)
Language: Shell - Size: 27.3 KB - Last synced at: 5 months ago - Pushed at: over 7 years ago - Stars: 1 - Forks: 0

AdamPaternostro/Azure-Spark-Livy-Application-Insights-External-Dependency
Use Spark with Livy along with Application Insights. Learn to host your external dependencies in data lake.
Language: Java - Size: 4.2 MB - Last synced at: 5 months ago - Pushed at: about 8 years ago - Stars: 1 - Forks: 0

lrakai/azure-u-sql-data-lake-analytics
Submitting a U-SQL Job to Azure Data Lake Analytics
Language: PowerShell - Size: 9.77 KB - Last synced at: over 2 years ago - Pushed at: over 7 years ago - Stars: 0 - Forks: 0

lrakai/arm-template-custom-resources
Azure function to set the permission of an Azure Data Lake Store in ARM template deploy (~custom resource)
Language: C# - Size: 4.64 MB - Last synced at: over 2 years ago - Pushed at: over 7 years ago - Stars: 0 - Forks: 0

AdamPaternostro/Azure-ADLS-Blob-Data-Copy
Queues up files to be copied from one ADLS account to another ADLS account. You can also use this for on-prem and/or blob.
Language: PowerShell - Size: 3.91 KB - Last synced at: 5 months ago - Pushed at: over 8 years ago - Stars: 1 - Forks: 0

sindhudweep/Orcneas
Read and Extract from ORC files for U-SQL
Language: C# - Size: 112 KB - Last synced at: over 2 years ago - Pushed at: about 8 years ago - Stars: 3 - Forks: 0

AdamPaternostro/Azure-DataLake-Folder-Upload
Upload a folder to Azure Data Lake Store
Language: PowerShell - Size: 1.95 KB - Last synced at: 5 months ago - Pushed at: about 8 years ago - Stars: 1 - Forks: 0

AdamPaternostro/Azure-Spark-Livy
Run a job in Spark 2.x with HDInsight and submit the job through Livy
Language: Scala - Size: 168 KB - Last synced at: 5 months ago - Pushed at: about 8 years ago - Stars: 0 - Forks: 1

AdamPaternostro/Azure-DataLakeCopy
PowerShell to copy a data lake file from a local computer
Language: PowerShell - Size: 1.95 KB - Last synced at: 5 months ago - Pushed at: about 8 years ago - Stars: 0 - Forks: 0

AdamPaternostro/Azure-Lock-All-Data-Lake-Stores
Places a resource lock on your ADLS resources so you cannot accidentally delete them.
Language: PowerShell - Size: 1000 Bytes - Last synced at: 5 months ago - Pushed at: over 8 years ago - Stars: 0 - Forks: 0
