An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: azure-data-lake

gargmukul91066/Adventure-Works-Azure-Data-Engineering-Project

Azure End To End Data Engineering Project

Language: Jupyter Notebook - Size: 4.04 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 0 - Forks: 0

Phadate/Wikipedia-football-data-engineering-pipeline

End-to-end data engineering pipeline that extracts Wikipedia data, processes it with Apache Airflow, stores it in Azure Data Lake, and analyzes it with Azure Synapse and Power BI

Size: 2.93 KB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 0 - Forks: 0

robinrodricks/FluentStorage

A polycloud .NET cloud storage abstraction layer. Provides Blob storage (AWS S3, GCP, FTP, SFTP, Azure Blob/File/Event Hub/Data Lake) and Messaging (AWS SQS, Azure Queue/ServiceBus). Supports .NET 5+ and .NET Standard 2.0+. Pure C#.

Language: C# - Size: 36.8 MB - Last synced at: 10 days ago - Pushed at: about 2 months ago - Stars: 366 - Forks: 56

cloudyr/AzureStor

Interface to Azure storage accounts. Submit issues and PRs at https://github.com/Azure/AzureStor

Language: R - Size: 754 KB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 22 - Forks: 2

Azure/AzureStor

R interface to Azure storage accounts

Language: R - Size: 789 KB - Last synced at: about 24 hours ago - Pushed at: 13 days ago - Stars: 69 - Forks: 21

ashwin-patil/threat-hunting-with-notebooks

Repository with Sample threat hunting notebooks on Security Event Log Data Sources

Language: Jupyter Notebook - Size: 1.35 MB - Last synced at: about 1 month ago - Pushed at: almost 3 years ago - Stars: 65 - Forks: 11

ewdlop/AzureNote.md

AzureNote. https://azure.status.microsoft/en-us/status

Size: 29.3 KB - Last synced at: 26 days ago - Pushed at: 6 months ago - Stars: 0 - Forks: 1

s-yazhini/Hexa-DE-Main-Project

Data engineering main project 1

Language: Jupyter Notebook - Size: 15.5 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

jotstolu/Azure-Data-WareHouse-Project-Using-Azure-Synapse-Analytics

This project involves building an e-commerce order data warehouse on Azure Synapse Analytics, leveraging Azure Data Lake Storage Gen2, Synapse Pipelines, Data Flows, and Serverless SQL Pools.

Size: 7.81 KB - Last synced at: 22 days ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

oleewere/fluent-plugin-azurestorage-gen2

Fluentd output plugin for Azure Datalake Storage Gen2 (append support)

Language: Ruby - Size: 95.7 KB - Last synced at: about 1 month ago - Pushed at: 12 months ago - Stars: 9 - Forks: 5

MicrosoftCloudEssentials-LearningHub/MS-Fabric-Essentials-Workshop

Fabric Basic Workshop. These guides elaborate on the standard architecture and features commonly used across industries.

Language: Jupyter Notebook - Size: 554 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 1 - Forks: 0

kahing/goofys

a high-performance, POSIX-ish Amazon S3 file system written in Go

Language: Go - Size: 4.69 MB - Last synced at: 4 months ago - Pushed at: about 1 year ago - Stars: 5,327 - Forks: 530

Mohitsai/future-of-hiring

Automated ETL pipeline in Azure for job market analysis using Terraform, Azure Functions, Azure Databricks, Azure Data Lake and PowerBI

Language: HCL - Size: 20.5 KB - Last synced at: 2 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

dataplat/AzureDataPipelineTools

A collection of Azure Functions to make building Azure Data Factory pipelines simpler and easier.

Language: C# - Size: 212 KB - Last synced at: 4 months ago - Pushed at: almost 4 years ago - Stars: 12 - Forks: 5

justBlindbaek/TraditionalModernDW

Simple cloud only DWH solution architecture.

Language: TSQL - Size: 108 KB - Last synced at: 6 days ago - Pushed at: over 2 years ago - Stars: 40 - Forks: 8

tahir007malik/ecommerceDataStreamingAnalytics

This repository features a production-grade data pipeline leveraging Confluent Kafka for real-time collection of e-commerce clickstream and user activity data.

Language: Jupyter Notebook - Size: 558 KB - Last synced at: 5 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

syedhassaanahmed/databricks-notebooks

Collection of Databricks and Jupyter Notebooks

Language: Jupyter Notebook - Size: 742 KB - Last synced at: 5 months ago - Pushed at: over 1 year ago - Stars: 21 - Forks: 15

shudhanshurp/News_Recommendation_System

This repository presents a News Recommendation System using Azure Data Factory, Azure Databricks, and Azure Data Lake to create a data pipeline for ML models. It uses BERT for content-based filtering, Neural Collaborative Filtering for user behaviors, and a hybrid model that combines both to enhance news recommendations.

Language: Jupyter Notebook - Size: 55.9 MB - Last synced at: 6 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

Sivaprasad-V/Tokyo-Olympics-Azure-Data-Engineering-Project

Azure End To End Data Engineering Project

Language: Jupyter Notebook - Size: 358 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

Sivaprasad-V/NYC-TAXI-Azure-Data-Engineering-Project

Azure End To End Data Engineering Project

Language: Jupyter Notebook - Size: 17.4 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

Sivaprasad-V/Adventure-Works-Azure-Data-Engineering-Project

Azure End To End Data Engineering Project

Language: Jupyter Notebook - Size: 2.92 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

gudashashank/tokyo-olympics-analysis

An Azure cloud-based data analytics solution that processes and visualizes the 2021 Tokyo Olympics dataset. This end-to-end pipeline leverages Azure Data Factory for data ingestion, Data Lake Storage Gen2 for secure storage, Databricks for data transformation, Synapse Analytics for SQL querying, and Power BI for interactive visualization

Size: 1.18 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

lamiaaali/DEPI-Graduation-Project

SkinCare Sentiment Analysis Reviews

Language: Jupyter Notebook - Size: 7.72 MB - Last synced at: 3 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 2

arsenvlad/docker-presto-adls-wasb

Example of a single node Presto with Azure Data Lake Store (ADLS) and Azure Storage Blob (WASB) access via Hive metastore

Language: Dockerfile - Size: 170 KB - Last synced at: 13 days ago - Pushed at: about 5 years ago - Stars: 19 - Forks: 16

zBalachandar/Tokyo-Olympic-Data-Analytics-Azure-End-To-End-Data-Engineering-Project-12

Tokyo-olympic-azure-data-engineering-end-to-end-project

Language: HTML - Size: 44.5 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

randyroac/azure-databricks-etl-project

ETL motor racing data project using Azure Databricks, PySpark, and Azure Data Lake

Language: Python - Size: 1.52 MB - Last synced at: about 1 year ago - Pushed at: about 3 years ago - Stars: 1 - Forks: 2

segovoni/azure-data-lake-store-delphi

Microsoft Azure Data Lake Store Library for Delphi

Language: Pascal - Size: 72.3 KB - Last synced at: 6 months ago - Pushed at: almost 5 years ago - Stars: 15 - Forks: 4

Jcardif/SerengetiDataLab

An E2E solution of the Data Resources on Azure using the Snapshot Serengeti dataset. This E2E solution focuses on Azure Synapse Analytics, Power BI, and Azure Data Factory.

Language: Bicep - Size: 13 MB - Last synced at: 6 months ago - Pushed at: about 2 years ago - Stars: 9 - Forks: 8

rheaacharya77/ETL-Olympics

ETL pipeline tailored for Olympics data

Language: Python - Size: 606 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

ahmedlrashed/E2E-Azure-Pipeline

Databricks ETL Pipeline for retrieving and processing NI TestStand test results, featuring a well-documented notebook for ETL operations, Data Lake for storage, Spark SQL+Python for transformations, and Power BI as the final visualization of factory metrics.

Language: Jupyter Notebook - Size: 1.22 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

mihirkudale/Olympic-data-analysis-azure-data-engineering-project

Language: Jupyter Notebook - Size: 143 KB - Last synced at: 6 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

philnandreoli/metadataingestion

This is an event-driven metadata ingestion tool that I am building on Azure, leveraging several Azure PaaS services.

Language: JavaScript - Size: 703 KB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

mganta/adls-spark-examples

spark adls read write

Language: Scala - Size: 893 KB - Last synced at: over 1 year ago - Pushed at: over 8 years ago - Stars: 0 - Forks: 0

sakethmukkanti/Machinery-Moniter-Iot-Streaming-With-Azure

An application developed to give real-time insights into machine health using IoT sensors by tracking and monitoring parameters such as temperature, pressure, current, and humidity.

Language: Jupyter Notebook - Size: 210 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

sakethmukkanti/Demand-Navigator-Real-Time-Streaming-with-Azure

A real-time application that guides cab drivers looking for rides toward the areas of a city experiencing higher demand

Language: Jupyter Notebook - Size: 156 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

RJ-Raj/IoT-Data-Pipeline

This repository contains code for an end-to-end IoT data pipeline using Azure services. It ingests, processes, and stores IoT device data from AWS S3 to Azure Data Lake Storage and Azure SQL Database, leveraging Azure Data Factory and Azure Functions for seamless integration and automation.

Language: Python - Size: 14.6 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

sakethmukkanti/Movielens-Dataset-Analysis-Azure-Data-Engineering-Project

Created a movie recommendation system on Azure utilizing Spark SQL for analyzing the MovieLens dataset.

Language: Jupyter Notebook - Size: 1.6 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

softwaresalt/blog

Data Engineering & Software Blog

Size: 4.96 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

vrup0408/IPL-Data-Analytics

We have an IPL dataset covering 2008 to 2020 and want to visualize analytics on a Power BI dashboard. The dataset is uploaded into a data lake, processed through a pipeline, and loaded as modeled data into a warehouse, so that it can be analyzed in Power BI through predefined dashboards.

Size: 3.62 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

eminencegrs/azure-integration

A list of samples for integration of a .NET application with various Azure cloud services.

Language: C# - Size: 20.5 KB - Last synced at: 6 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

just-modeling/jupyterhub-k8s-apache-spark

Deploy Apache Spark in client mode on a Kubernetes cluster and integrate it with Jupyter notebooks through a JupyterHub server.

Language: Shell - Size: 612 KB - Last synced at: over 1 year ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

ROBROICH/SAP_AND_COMMON_DATA_MODEL_DEMO

This demo describes the basic integration between S/4HANA and the Microsoft Common Data Model (Model)

Size: 4.24 MB - Last synced at: over 1 year ago - Pushed at: over 5 years ago - Stars: 16 - Forks: 2

OptimChain/Cloud_Hydroponics

Cloud Based Sensoring Solution for flow telemetry. Shown here is an early stage sensoring prototype with azure based alerting and app deployment.

Language: Python - Size: 83 KB - Last synced at: almost 2 years ago - Pushed at: about 4 years ago - Stars: 0 - Forks: 1

poojatripathi06/Covid-reporting-adf

Building a real-world data pipeline in Azure Data Factory (ADF) using a dataset provided by https://www.ecdc.europa.eu/. Data is ingested from sources such as HTTP and Azure Blob Storage into Azure Data Lake Storage Gen2 using ADF, then transformed with a Databricks Notebook activity in ADF and loaded into Azure Data Lake Storage Gen2.

Size: 121 KB - Last synced at: almost 2 years ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

AdamPaternostro/Azure-Big-Data-and-Machine-Learning-Architecture

A ready to use architecture for processing data and performing machine learning in Azure

Language: C# - Size: 10.7 MB - Last synced at: 5 days ago - Pushed at: about 5 years ago - Stars: 8 - Forks: 3

tomkerkhove/gdpr-with-azure 📦

Scenarios on how you can be GDPR compliant by using Azure services

Language: C# - Size: 1.19 MB - Last synced at: 6 days ago - Pushed at: over 6 years ago - Stars: 6 - Forks: 2

SurajSomani14/Read-And-Filter-Datalake-Files-Data

This Azure Function reads multiple files from a given data lake folder, deserializes the data, and merges the data from all files together. It can apply filters to the data and respond with the filtered data in the requested format.

Language: C# - Size: 159 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

anjijava16/Spark_Multi_Cloud_Storage_Utils

Spark Read/Write data from/to Multi Cloud utils (GCP, Azure and AWS)

Language: HTML - Size: 5.59 MB - Last synced at: about 2 years ago - Pushed at: almost 4 years ago - Stars: 0 - Forks: 0

venkatakamaiah46/Azure

POC projects working on Cloud Platforms

Language: HTML - Size: 208 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 1 - Forks: 0

AnthonyByansi/Azure-Data-Fundamentals-Guide

A comprehensive guide to understanding and implementing data management and analytics solutions in the Azure ecosystem using Azure Data Fundamentals.

Language: Mermaid - Size: 74.2 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 10 - Forks: 3

Abdelrahman13-coder/Data-Integration-Pipelines-for-NYC-Payroll-Data-Analytics

Size: 3.97 MB - Last synced at: 3 months ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

Jayvardhan-Reddy/Azure-Certification-DP-201

Road to Azure Data Engineer Part-II: DP-201 - Designing an Azure Data Solution

Size: 3.59 MB - Last synced at: about 2 years ago - Pushed at: about 5 years ago - Stars: 17 - Forks: 15

epomatti/az-datalake

Azure Data Lake Gen2 with azcopy

Language: HCL - Size: 3.91 KB - Last synced at: about 2 months ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

haxxorsid/flink-datalake-bulk-upload 📦

Bulk image streaming and upload using Flink (+ Kubernetes), Kafka, Data Lake, and SQL (Provided with React UI and Node server for Demo).

Language: Java - Size: 417 KB - Last synced at: over 2 years ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 0

jksinghpro/kafka-connect-adl

Kafka Connect Connector for ADLS (Azure Data Lake Store)

Language: Java - Size: 23.4 KB - Last synced at: over 2 years ago - Pushed at: almost 7 years ago - Stars: 7 - Forks: 0

Watts-Energy/Watts.Azure

A collection of utilities for working with Azure Batch, Azure Data Factory, Azure Table Storage and Azure Blob Storage.

Language: C# - Size: 281 KB - Last synced at: over 2 years ago - Pushed at: over 7 years ago - Stars: 4 - Forks: 1

amynic/TechHer

Repo containing files for TechHer event and 'Let your Data tell you the Real Story: Advanced Analytics on Azure' hands on lab

Size: 38.9 MB - Last synced at: over 2 years ago - Pushed at: about 4 years ago - Stars: 10 - Forks: 6

xpertdev/tdameritrade-streaming

Streaming order book data using TD Ameritrade API

Language: Python - Size: 71.3 KB - Last synced at: over 2 years ago - Pushed at: over 3 years ago - Stars: 8 - Forks: 5

semashkinvg/Azure.HowTos

Language: C# - Size: 2.64 MB - Last synced at: over 2 years ago - Pushed at: over 7 years ago - Stars: 3 - Forks: 0

briandenicola/azure-data-services

A repository to continue my education on Azure Data Services.

Language: Python - Size: 1.18 MB - Last synced at: 6 months ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

Inmapg/data-lake-compaction

Batch process that compacts different Parquet files stored in Azure Data Lake Storage, following the requirements specified in the README.

Language: Scala - Size: 14.6 KB - Last synced at: over 2 years ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 1

navicore/navilake

An Akka Streams source of Azure Data Lake data

Language: Scala - Size: 280 KB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 1

AdamPaternostro/Azure-Databricks-External-Hive-and-ADLS

Shows how to use an External Hive (SQL Server) along with ADLS Gen 1 as part of a Databricks initialization script that runs when the cluster is created.

Language: PowerShell - Size: 13.7 KB - Last synced at: 5 months ago - Pushed at: almost 7 years ago - Stars: 2 - Forks: 1

Data-Culpa/dataculpa-azure-datalake-gen2

Azure Data Lake Gen2 storage connectors for Data Culpa - monitor data quality automatically with Data Culpa Validator

Language: Python - Size: 28.3 KB - Last synced at: 1 day ago - Pushed at: over 4 years ago - Stars: 1 - Forks: 0

mail4hafij/Azure-DataLake-DataBricks

The idea is to connect to ADL storage (Azure Data Lake) from a Databricks cluster and run a Scala script against the ADL data.

Language: Scala - Size: 165 KB - Last synced at: over 2 years ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 1

amperity/blocks-adl

Content-addressable Azure Data Lake block store

Language: Clojure - Size: 36.1 KB - Last synced at: about 1 month ago - Pushed at: over 5 years ago - Stars: 1 - Forks: 1

gostranger/Streaming-Web-UI

Language: CSS - Size: 4.44 MB - Last synced at: over 2 years ago - Pushed at: over 5 years ago - Stars: 0 - Forks: 0

syedhassaanahmed/azure-kafka-spark-adls

Azure ARM template to deploy Kafka and Spark clusters in same VNet with ADLS

Language: Shell - Size: 8.79 KB - Last synced at: 5 months ago - Pushed at: over 7 years ago - Stars: 5 - Forks: 0

AdamPaternostro/Azure-HDInsight-ARM-Template

Creates an HDInsight cluster that has an external Hive metastore and access to Azure Data Lake Store

Size: 63.5 KB - Last synced at: 3 months ago - Pushed at: almost 7 years ago - Stars: 1 - Forks: 0

AdamPaternostro/Azure-HDI-DistCP

Creates an HDInsight cluster, then runs distcp remotely to copy data between blob and/or data lake (ADLS)

Language: Shell - Size: 27.3 KB - Last synced at: 5 months ago - Pushed at: over 7 years ago - Stars: 1 - Forks: 0

AdamPaternostro/Azure-Spark-Livy-Application-Insights-External-Dependency

Use Spark with Livy along with Application Insights. Learn to host your external dependencies in data lake.

Language: Java - Size: 4.2 MB - Last synced at: 5 months ago - Pushed at: about 8 years ago - Stars: 1 - Forks: 0

lrakai/azure-u-sql-data-lake-analytics

Submitting a U-SQL Job to Azure Data Lake Analytics

Language: PowerShell - Size: 9.77 KB - Last synced at: over 2 years ago - Pushed at: over 7 years ago - Stars: 0 - Forks: 0

lrakai/arm-template-custom-resources

Azure function to set the permission of an Azure Data Lake Store in ARM template deploy (~custom resource)

Language: C# - Size: 4.64 MB - Last synced at: over 2 years ago - Pushed at: over 7 years ago - Stars: 0 - Forks: 0

AdamPaternostro/Azure-ADLS-Blob-Data-Copy

Queues up files to be copied from one ADLS account to another ADLS account. You can also use this for on-prem and/or blob.

Language: PowerShell - Size: 3.91 KB - Last synced at: 5 months ago - Pushed at: over 8 years ago - Stars: 1 - Forks: 0

sindhudweep/Orcneas

Read and Extract from ORC files for U-SQL

Language: C# - Size: 112 KB - Last synced at: over 2 years ago - Pushed at: about 8 years ago - Stars: 3 - Forks: 0

AdamPaternostro/Azure-DataLake-Folder-Upload

Upload a folder to Azure Data Lake Store

Language: PowerShell - Size: 1.95 KB - Last synced at: 5 months ago - Pushed at: about 8 years ago - Stars: 1 - Forks: 0

AdamPaternostro/Azure-Spark-Livy

Run a job in Spark 2.x with HDInsight and submit the job through Livy

Language: Scala - Size: 168 KB - Last synced at: 5 months ago - Pushed at: about 8 years ago - Stars: 0 - Forks: 1

AdamPaternostro/Azure-DataLakeCopy

PowerShell script to copy a data lake file from a local computer

Language: PowerShell - Size: 1.95 KB - Last synced at: 5 months ago - Pushed at: about 8 years ago - Stars: 0 - Forks: 0

AdamPaternostro/Azure-Lock-All-Data-Lake-Stores

Places a resource lock on your ADLS resources so you cannot accidentally delete them.

Language: PowerShell - Size: 1000 Bytes - Last synced at: 5 months ago - Pushed at: over 8 years ago - Stars: 0 - Forks: 0
