An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: azure-databricks

srimantapal205/Subject-Wise-Question---Answer

This branch focuses on building a set of Data Engineering interview questions and answers

Size: 395 KB - Last synced at: about 2 hours ago - Pushed at: about 4 hours ago - Stars: 0 - Forks: 0

mysticrenji/az-databricks-pipelines

Repository contains Terraform (TF) files pertaining to Azure Pipelines and Azure Databricks

Language: HCL - Size: 110 KB - Last synced at: 1 day ago - Pushed at: 4 days ago - Stars: 0 - Forks: 2

Sasuke565/rmodel

rModel is a framework for building LLM applications with agentic workflows. Keywords: agent, agentic, ai, flow, framework, graph, llm, multi-agent, workflow

Size: 1000 Bytes - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 0 - Forks: 0

productiveAnalytics/mlops_with_databricks

CI/CD pipeline and MLOps with Databricks (Azure Databricks & Azure DevOps)

Language: Python - Size: 10.7 KB - Last synced at: 19 days ago - Pushed at: 21 days ago - Stars: 0 - Forks: 0

bhavink/databricks

Databricks Platform - Architecture, Security, Automation and much more!!

Language: Jupyter Notebook - Size: 14.4 MB - Last synced at: 28 days ago - Pushed at: 28 days ago - Stars: 50 - Forks: 27

microsoft/Purview-ADB-Lineage-Solution-Accelerator

A connector to ingest Azure Databricks lineage into Microsoft Purview

Language: C# - Size: 12.8 MB - Last synced at: 7 days ago - Pushed at: about 1 year ago - Stars: 93 - Forks: 58

airscholar/FootballDataEngineering

An end-to-end data engineering pipeline that fetches data from Wikipedia, cleans and transforms it with Apache Airflow and saves it on Azure Data Lake. Other processing takes place on Azure Data Factory, Azure Synapse and Tableau.

Language: Python - Size: 469 KB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 23 - Forks: 19

adithyavk9/DataEngineeringProjectAdventureWorks

An end to end data engineering project built on Azure

Language: Jupyter Notebook - Size: 0 Bytes - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 1 - Forks: 0

rafaelpierre/pyjaws

PyJaws: A Pythonic Way to Define Databricks Jobs and Workflows

Language: Python - Size: 3.46 MB - Last synced at: 22 days ago - Pushed at: 10 months ago - Stars: 43 - Forks: 3

Mohitsai/future-of-hiring

Automated ETL pipeline in Azure for job market analysis using Terraform, Azure Functions, Azure Databricks, Azure Data Lake and PowerBI

Language: HCL - Size: 20.5 KB - Last synced at: about 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

Azure/azure-cosmosdb-spark 📦

Apache Spark Connector for Azure Cosmos DB

Size: 192 MB - Last synced at: 5 days ago - Pushed at: 2 months ago - Stars: 203 - Forks: 121

microsoft/Azure-Databricks-NYC-Taxi-Workshop

An Azure Databricks workshop leveraging the New York Taxi and Limousine Commission Trip Records dataset

Language: Scala - Size: 42.3 MB - Last synced at: 7 days ago - Pushed at: about 2 years ago - Stars: 108 - Forks: 108

BlueGranite/DatabricksTraining

Repository for Microsoft Databricks Training Events - Hosted by BlueGranite

Language: Python - Size: 14 MB - Last synced at: 1 day ago - Pushed at: over 5 years ago - Stars: 15 - Forks: 8

retkowsky/Azure-Databricks-Workshop

Azure Databricks workshop

Language: Jupyter Notebook - Size: 2.03 MB - Last synced at: 22 days ago - Pushed at: almost 5 years ago - Stars: 8 - Forks: 6

zBalachandar/Sales-Data-Analytics-Azure-Data-Engineering-End-to-End-Project-13

This project builds an End-to-End Azure Data Engineering Pipeline, performing ETL and Analytics Reporting on the AdventureWorks2017LT Database.

Language: Jupyter Notebook - Size: 23.3 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 9 - Forks: 4

Dimitrov-S-Dev/resume

Dimitrov-S-Dev Resume/ Portfolio

Language: CSS - Size: 23.1 MB - Last synced at: 2 months ago - Pushed at: 12 months ago - Stars: 0 - Forks: 0

MohssineSERRAJI/azure-data-lake

A lightweight toolkit for Azure Data Lake Storage Gen2 operations, featuring AzCopy commands and Databricks integration examples. Includes sample data and notebooks for quick experimentation with data lake architectures.

Language: Jupyter Notebook - Size: 449 KB - Last synced at: about 2 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

enricogoerlitz/explore-azure-databricks

End-to-end backend and data hub architecture on Azure, integrating Databricks and a suite of Azure services for seamless data processing, analytics, and deployment.

Language: Jupyter Notebook - Size: 16.8 MB - Last synced at: 16 days ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0

AdamPaternostro/Azure-Databricks-Dev-Ops

Complete end to end sample of doing DevOps with Azure Databricks

Language: Shell - Size: 887 KB - Last synced at: about 1 month ago - Pushed at: over 3 years ago - Stars: 69 - Forks: 102

tahir007malik/ecommerceDataStreamingAnalytics

This repository features a production-grade data pipeline leveraging Confluent Kafka for real-time collection of e-commerce clickstream and user activity data.

Language: Jupyter Notebook - Size: 558 KB - Last synced at: about 1 month ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

syedhassaanahmed/databricks-notebooks

Collection of Databricks and Jupyter Notebooks

Language: Jupyter Notebook - Size: 742 KB - Last synced at: about 1 month ago - Pushed at: about 1 year ago - Stars: 21 - Forks: 15

BlueGranite/azure-synapse-vcf-analysis

Sample code for analyzing VCF files (converted to Parquet) in Azure Databricks and Synapse.

Size: 14.8 MB - Last synced at: 2 months ago - Pushed at: about 4 years ago - Stars: 2 - Forks: 0

shudhanshurp/News_Recommendation_System

This repository presents a News Recommendation System using Azure Data Factory, Azure Databricks, and Azure Data Lake to create a data pipeline for ML models. It uses BERT for content-based filtering, Neural Collaborative Filtering for user behaviors, and a hybrid model that combines both to enhance news recommendations.

Language: Jupyter Notebook - Size: 55.9 MB - Last synced at: about 2 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

AzureCosmosDB/scenario-based-labs

Cosmos DB oriented labs for IoT and Retail scenarios

Language: JavaScript - Size: 316 MB - Last synced at: about 1 month ago - Pushed at: over 2 years ago - Stars: 112 - Forks: 88

SayamAlt/TMDB-Movies-End-to-End-ETL-and-ML-Pipeline

This project encompasses end-to-end ETL and ML pipeline development. Data ingestion from TMDB API covered top-rated, current, upcoming, and popular movies with genres. Performed EDA to derive several valuable insights and observations. Developed a regression model with 97% r2 score to predict average movie ratings accurately.

Language: Python - Size: 15.6 KB - Last synced at: 3 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

najmaelboutaheri/Patents_analysis

This repository contains code and resources for analyzing patents using Apache Spark, Python, and AWS services. The objective of this project is to extract insights and trends from patent data to inform business decisions and intellectual property strategies.

Language: Jupyter Notebook - Size: 7.79 MB - Last synced at: 3 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

tomaztk/Azure-Databricks

Azure Databricks - Advent of 2020 Blogposts

Language: Jupyter Notebook - Size: 44.9 MB - Last synced at: about 1 month ago - Pushed at: over 2 years ago - Stars: 60 - Forks: 49

microsoft/A-TALE-OF-THREE-CITIES

Analyzing the safety (311) dataset published by Azure Open Datasets for Chicago, Boston and New York City using SparkR, SparkSQL and Azure Databricks, with visualization using ggplot2 and leaflet. Focus is on descriptive analytics, visualization, clustering, time series forecasting and anomaly detection.

Language: R - Size: 21.8 MB - Last synced at: 7 days ago - Pushed at: about 4 years ago - Stars: 86 - Forks: 34

Sivaprasad-V/Tokyo-Olympics-Azure-Data-Engineering-Project

Azure End To End Data Engineering Project

Language: Jupyter Notebook - Size: 358 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

Sivaprasad-V/NYC-TAXI-Azure-Data-Engineering-Project

Azure End To End Data Engineering Project

Language: Jupyter Notebook - Size: 17.4 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

Sivaprasad-V/Adventure-Works-Azure-Data-Engineering-Project

Azure End To End Data Engineering Project

Language: Jupyter Notebook - Size: 2.92 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

Redgerd/Reddit-Post-Analysis-Workflow

This Reddit Post Analysis Workflow collects and processes Reddit data using Apache Spark and Delta Lake. It transforms raw data, applies sentiment analysis, and extracts TF-IDF features. The pipeline ensures reliable, high-quality data storage and supports continuous analytics.

Language: HTML - Size: 193 KB - Last synced at: 3 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 1

tfayyaz/awesome-azure-databricks

Awesome content all about Azure Databricks

Size: 87.9 KB - Last synced at: 7 days ago - Pushed at: over 3 years ago - Stars: 16 - Forks: 6

gudashashank/tokyo-olympics-analysis

An Azure cloud-based data analytics solution that processes and visualizes the 2021 Tokyo Olympics dataset. This end-to-end pipeline leverages Azure Data Factory for data ingestion, Data Lake Storage Gen2 for secure storage, Databricks for data transformation, Synapse Analytics for SQL querying, and Power BI for interactive visualization

Size: 1.18 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

s-yazhini/PySpark-and-SparkSQL

In Azure Databricks

Language: Jupyter Notebook - Size: 13.7 KB - Last synced at: about 2 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

SayamAlt/Amazon-Products-API-ETL-and-ML-pipeline

In this project, I've created an end-to-end ETL pipeline and subsequently developed a machine learning model to predict the price of Amazon products based on several product-related features.

Language: Python - Size: 2.95 MB - Last synced at: about 2 months ago - Pushed at: 6 months ago - Stars: 1 - Forks: 0

arnabsaha7/TechRetail-Sales-Analysis

TechRetail Azure Data Pipeline Analysis provides a robust analysis of retail data via an Azure-based data pipeline.

Language: Jupyter Notebook - Size: 5.21 MB - Last synced at: 2 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

uminskib/Toronto_traffic_collisions_and_weather_Azure_Data_Engineering

Comprehensive data engineering solution using Azure platform tools such as Data Factory and Databricks, completed with analysis and dashboard in Power BI.

Language: Jupyter Notebook - Size: 20.5 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

chayansraj/Microsoft-Azure-Medallion-Data-pipeline

In this project we are going to create an end-to-end data platform right from Data Ingestion, Data Transformation, Data Loading and Reporting.

Language: Jupyter Notebook - Size: 11.5 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 7 - Forks: 6

bennyaustin/pyspark-utils

Reusable Python classes that extend open source PySpark capabilities. Examples of implementation are available under the notebooks of the repo https://github.com/bennyaustin/synapse-dataplatform

Language: Python - Size: 36.1 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 9 - Forks: 4

syedhassaanahmed/spark-with-engineering-fundamentals

E2E Spark data pipelines with engineering fundamentals

Language: HCL - Size: 1.22 MB - Last synced at: about 1 month ago - Pushed at: 7 months ago - Stars: 2 - Forks: 2

lamiaaali/DEPI-Graduation-Project

SkinCare Sentiment Analysis Reviews

Language: Jupyter Notebook - Size: 7.72 MB - Last synced at: 23 days ago - Pushed at: 7 months ago - Stars: 1 - Forks: 2

clouddrove/terraform-azure-databricks

This terraform module is designed to create Azure Databricks resources. Azure Databricks is a fully managed first-party service that enables an open data lakehouse in Azure.

Language: HCL - Size: 56.6 KB - Last synced at: 3 days ago - Pushed at: 7 months ago - Stars: 0 - Forks: 3

laismeuchi/dados-databricks-base-cnpj

Project using the Receita Federal CNPJ database

Language: Python - Size: 84 KB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

ssanthosh010303/collection-data-training

A collection of challenges exercised during data training program.

Size: 1000 Bytes - Last synced at: 2 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

bennyaustin/synapse-dataplatform

A modern data platform implemented on Azure Synapse Analytics using ELT Framework - https://github.com/bennyaustin/elt-framework. Data platform infrastructure provisioned using https://github.com/bennyaustin/iac-synapse-dataplatform

Language: TSQL - Size: 1.79 MB - Last synced at: 5 months ago - Pushed at: 9 months ago - Stars: 7 - Forks: 6

aymane-maghouti/HR-Data-Pipeline-Azure

This project is a comprehensive data engineering solution that extracts HR data from a GitHub repository, performs data transformations using Azure services, and creates an interactive HR dashboard using Power BI. The goal is to enable HR professionals and decision-makers to gain insights from the HR data for better workforce management.

Language: Jupyter Notebook - Size: 3 MB - Last synced at: 2 months ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 0

alexanderbean/E2E-Data-Engineering-in-Azure

End-to-end ETL pipeline in the Microsoft Azure cloud - (Jun '24 - Jul '24)

Language: Python - Size: 1.95 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

sandeep-khr/Tokyo-Olympics-Data-Insights-using-Azure

Size: 343 KB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

zBalachandar/Tokyo-Olympic-Data-Analytics-Azure-End-To-End-Data-Engineering-Project-12

Tokyo-olympic-azure-data-engineering-end-to-end-project

Language: HTML - Size: 44.5 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 1 - Forks: 0

Intellipaat-software-solution-official/Azure-Data-Engineering-Capstone-Project

This Capstone Project includes an End to End Data Engineering Pipeline right from Ingesting the data from HTTPs server to cleaning and transforming the data in Azure Databricks and finally reporting the data on Power BI Desktop

Size: 6.87 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

randyroac/azure-databricks-etl-project

ETL motor racing data project using Azure Databricks, PySpark and Azure Data Lake

Language: Python - Size: 1.52 MB - Last synced at: 10 months ago - Pushed at: almost 3 years ago - Stars: 1 - Forks: 2

cheukhin1024/Financial-Data-Project-in-Azure

Free High-Quality Financial Data in Azure

Language: Python - Size: 848 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 8 - Forks: 5

ahmedlrashed/E2E-Azure-Pipeline

Databricks ETL Pipeline for retrieving and processing NI TestStand test results, featuring a well-documented notebook for ETL operations, Data Lake for storage, Spark SQL+Python for transformations, and Power BI as the final visualization of factory metrics.

Language: Jupyter Notebook - Size: 1.22 MB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 0 - Forks: 0

mihirkudale/Olympic-data-analysis-azure-data-engineering-project

Language: Jupyter Notebook - Size: 143 KB - Last synced at: about 2 months ago - Pushed at: 12 months ago - Stars: 0 - Forks: 0

philnandreoli/metadataingestion

This is an event-driven metadata ingestion tool that I am building with Azure, leveraging several Azure PaaS services.

Language: JavaScript - Size: 703 KB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

Annielytix/Ready2019_AA_AI_200

A Beginner's Guide to Azure Databricks

Size: 24.7 MB - Last synced at: about 1 year ago - Pushed at: over 6 years ago - Stars: 9 - Forks: 14

Annielytix/008-DatabricksIntroML

Ready2019_WTH_DatabricksIntroML

Language: Scala - Size: 43.6 MB - Last synced at: about 1 year ago - Pushed at: about 6 years ago - Stars: 4 - Forks: 3

Annielytix/Advanced-Databricks-for-ML-Build-2019

Using Azure Databricks (Spark) for ML, this is the //build 2019 repository with homework examples, code and notebooks

Language: Jupyter Notebook - Size: 14.4 MB - Last synced at: about 1 year ago - Pushed at: over 5 years ago - Stars: 14 - Forks: 18

matgonz/azure_databricks_mlops_e2e

[👩‍🏫] In this repository I'll show you how to use Azure Databricks to develop and train machine learning models, and build an MLOps pipeline to serve them with a CI/CD process.

Language: Jupyter Notebook - Size: 2.9 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

sakethmukkanti/Machinery-Moniter-Iot-Streaming-With-Azure

An application developed to give real-time insights on machine health using IoT sensors by tracking and monitoring parameters such as temperature, pressure, current and humidity.

Language: Jupyter Notebook - Size: 210 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

RJ-Raj/IoT-Data-Pipeline

This repository contains code for an end-to-end IoT data pipeline using Azure services. It ingests, processes, and stores IoT device data from AWS S3 to Azure Data Lake Storage and Azure SQL Database, leveraging Azure Data Factory and Azure Functions for seamless integration and automation.

Language: Python - Size: 14.6 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

sakethmukkanti/Movielens-Dataset-Analysis-Azure-Data-Engineering-Project

Created a movie recommendation system on Azure utilizing Spark SQL for analyzing the MovieLens dataset.

Language: Jupyter Notebook - Size: 1.6 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

rohitkulkarni08/Azure-ETL-AmazonSalesAnalysis

A comprehensive ETL pipeline and sales analysis project leveraging Microsoft Azure and PySpark, designed to optimize e-commerce sales by providing actionable insights through detailed data analysis.

Language: Jupyter Notebook - Size: 8.04 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

rohitkulkarni08/Azure-ETL-Pipeline-MovieAnalytics

This project demonstrates an ETL pipeline using Microsoft Azure for IMDb Movie Rating Dataset analysis. It covers data extraction from Azure Blob Storage, transformation with Azure Databricks, and loading into Azure SQL using Azure Data Factory. The pipeline automates insights generation and is a practical example of cloud-based data engineering.

Language: Jupyter Notebook - Size: 15.9 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

Annielytix/Ready2019_AA_AI319

What the Hack Challenge format of the Advanced Databricks Workshop

Language: Jupyter Notebook - Size: 39.3 MB - Last synced at: about 1 year ago - Pushed at: about 6 years ago - Stars: 17 - Forks: 16

Srking501/csc8101_coursework

A summative coursework for CSC8101 Engineering for AI

Language: Jupyter Notebook - Size: 168 KB - Last synced at: 2 months ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

aycignl/CodingEnvironment

Notes about platforms such as Azure-Databricks, Apache Spark, Azure-DevOps etc.

Size: 7.81 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

Amrit-Hub/DP-203-Data-Engineer-Associate-Questions

This repo contains "Azure Data Engineer Associate" Questions and related docs.

Size: 29.8 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 1

Shashi42/Azure-End-to-End-Sales-Data-Analytics-Pipeline

This project builds an End-to-End Azure Data Engineering Pipeline, performing ETL and Analytics Reporting on the AdventureWorks2022LT Database.

Language: Jupyter Notebook - Size: 501 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

just-modeling/jupyterhub-k8s-apache-spark

Deploy Apache Spark in client mode on a Kubernetes cluster and integrate it with Jupyter notebooks through a JupyterHub server.

Language: Shell - Size: 612 KB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

Philippos01/mlops-energy-forecast-thesis

Automated pipeline for energy consumption forecasting across Europe using Azure cloud and Databricks.

Language: Python - Size: 1.37 MB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 4 - Forks: 1

giufalcao/Formula-1

A data pipeline project built on Databricks and Azure to demonstrate the lifecycle of a cloud data project.

Language: Python - Size: 5.21 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

aessing/demo-mdwh

Modern Data Warehouse demos with Azure Databricks, Azure Data Factory & Azure Dedicated SQL pool (formerly SQL DW)

Size: 48.3 MB - Last synced at: 6 days ago - Pushed at: over 4 years ago - Stars: 4 - Forks: 2

Annielytix/DevOpsforDatabricks

Are you, like me, a Senior Data Scientist wanting to learn more about how to approach DevOps, specifically when you are using Databricks (workspaces, notebooks, libraries, etc.)? Set up using @Azure @Databricks

Language: PowerShell - Size: 4.1 MB - Last synced at: about 1 year ago - Pushed at: almost 6 years ago - Stars: 12 - Forks: 14

ciaran28/dstoolkit-mlops-databricks (fork of microsoft/dstoolkit-mlops-databricks)

ML Ops Accelerator: Databricks & Azure Machine Learning Unification

Language: Python - Size: 82.2 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 1

AndreasGwosdz/azure-databricks-lakehouse

Building a Data Lakehouse with Azure Databricks 🚧

Size: 1000 Bytes - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

ezwiefel/azure-databricks-api

A wrapper for the Azure Databricks REST API

Language: Python - Size: 62.5 KB - Last synced at: 4 days ago - Pushed at: over 4 years ago - Stars: 7 - Forks: 9

dishadas168/demand-forecasting-ebay

A demand forecasting pipeline deployed on Azure and AWS

Language: Python - Size: 1.31 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 2

manaswipatil/Tokyo-Olympics-Data-Analytics-in-Azure

Azure pipeline for data analytics on Tokyo Olympics data

Size: 507 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

easonlai/eda_for_prudential_life_insurance_sample_data

Notebook sample of Exploratory Data Analysis (EDA) for Prudential Life Insurance Sample Data

Language: Jupyter Notebook - Size: 4.18 MB - Last synced at: 3 months ago - Pushed at: almost 4 years ago - Stars: 1 - Forks: 0

poojatripathi06/Covid-reporting-adf

Building a real-world data pipeline in Azure Data Factory (ADF) on a dataset provided by https://www.ecdc.europa.eu/. Data is ingested from sources such as HTTP and Azure Blob Storage into Azure Data Lake Storage Gen2 using ADF, then transformed with the Databricks Notebook activity in ADF and loaded back into Azure Data Lake Storage Gen2.

Size: 121 KB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

epomatti/az-databricks-etl

Sample notebooks on Azure Databricks for ETL

Language: Scala - Size: 7.81 KB - Last synced at: 2 months ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

sfrechette/azure-databricks-citibike-nyc-analysis

Analyzing NYC bike data with Azure Databricks

Language: Jupyter Notebook - Size: 5.86 KB - Last synced at: over 1 year ago - Pushed at: almost 5 years ago - Stars: 5 - Forks: 0

AdamPaternostro/Azure-Databricks-CI-CD-Initial-Token

How to do CI/CD with Azure Databricks and get the initial Databricks token.

Language: C# - Size: 501 KB - Last synced at: about 1 month ago - Pushed at: over 6 years ago - Stars: 2 - Forks: 4

dazfuller/whats-new-databricks-talk

Content for What's new in Databricks talk

Language: PowerShell - Size: 85.9 KB - Last synced at: almost 2 years ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0

anderl80/aml-vs-adb

End-to-end ML pipelines in Azure Machine Learning and Azure Databricks.

Language: Jupyter Notebook - Size: 188 KB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 4 - Forks: 1

thedatanerdz/DEP-4

Pyspark tutorials

Language: Jupyter Notebook - Size: 40 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

venkatakamaiah46/Azure

POC projects working on Cloud Platforms

Language: HTML - Size: 208 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0

AnthonyByansi/Azure-Data-Fundamentals-Guide

A comprehensive guide to understanding and implementing data management and analytics solutions in the Azure ecosystem using Azure Data Fundamentals.

Language: Mermaid - Size: 74.2 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 10 - Forks: 3

aminekaabachi/azure-databricks-sdk-python 📦

[archived] A Python SDK for the Azure Databricks REST API 2.0

Language: Python - Size: 94.7 KB - Last synced at: 22 days ago - Pushed at: almost 2 years ago - Stars: 13 - Forks: 6

iBalajiShanmugam/formual1

Explore Formula 1 data analytics with this project. Leveraging the Ergast API, it utilizes Databricks Spark for ingestion, transformation, and analysis. ADLS acts as the storage layer, while Power BI visualizes the ADLS presentation layer. Uncover insights in the world of Formula 1 through powerful data analytics.

Language: Python - Size: 33.2 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

Abdelrahman13-coder/Data-Integration-Pipelines-for-NYC-Payroll-Data-Analytics

Size: 3.97 MB - Last synced at: 3 months ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

Jayvardhan-Reddy/Azure-Certification-DP-200

Road to Azure Data Engineer Part-I: DP-200 - Implementing an Azure Data Solution

Size: 4.42 MB - Last synced at: almost 2 years ago - Pushed at: almost 5 years ago - Stars: 57 - Forks: 46

Abdelrahman13-coder/Building-an-Azure-Data-Lake-for-Bike-Share-Data-Analytics

Language: Jupyter Notebook - Size: 983 KB - Last synced at: 3 months ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 1

iammustafatz/Mlflow-Diabetes-Prediction-Pipeline

This repository showcases how to build a machine learning pipeline for predicting diabetes in patients using PySpark and MLflow, and how to deploy it using Azure Databricks.

Language: Jupyter Notebook - Size: 1.89 MB - Last synced at: 5 months ago - Pushed at: about 2 years ago - Stars: 1 - Forks: 0

yefanli0310/DE-Projects-on-Azure-Databricks

Mount files from Azure to Databricks and apply ETL transformations to the datasets

Language: Jupyter Notebook - Size: 3.66 MB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

tomarv2/terraform-databricks-azure-workspace

Terraform module to create Databricks Azure workspace

Language: HCL - Size: 387 KB - Last synced at: about 2 months ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 1

PujitH-V/ETL_with_Pyspark_-_SparkSQL

A sample project designed to demonstrate the ETL process using the PySpark and Spark SQL APIs in Apache Spark.

Language: HTML - Size: 305 KB - Last synced at: almost 2 years ago - Pushed at: over 4 years ago - Stars: 5 - Forks: 4

raksit31667/azure-devops-databricks-rest-api 📦

Azure DevOps extension for interacting with Azure Databricks via REST API

Language: TypeScript - Size: 921 KB - Last synced at: about 2 years ago - Pushed at: about 4 years ago - Stars: 0 - Forks: 0

Related Keywords
azure-databricks 120 azure-data-factory 49 azure 47 pyspark 33 databricks 29 azure-data-lake 25 spark 24 azure-synapse-analytics 21 python 20 data-engineering 15 apache-spark 14 databricks-notebooks 11 azure-devops 11 etl-pipeline 10 powerbi 10 machine-learning 10 sql 10 azure-storage 10 delta-lake 9 azure-pipelines 8 azure-data-lake-gen2 8 data-science 7 spark-sql 6 terraform 6 azure-sql-database 6 azure-blob-storage 6 data-analytics 6 scala 5 data-visualization 5 mlops 5 python3 5 etl 5 mlflow 5 power-bi 4 data 4 jupyter-notebook 4 azure-machine-learning 4 azure-functions 4 data-transformation 4 microsoft 4 microsoft-azure 4 azure-key-vault 4 parquet 3 data-ingestion 3 cosmos-db 3 github-actions 3 azure-cosmos-db 3 adlsgen2 3 azure-active-directory 3 blob-storage 3 azure-synapse 3 microsoft-power-bi 3 pyspark-notebook 3 datalake 3 airflow 3 azure-sql 3 azuredatabricks 2 azure-eventhub 2 analytics 2 big-data 2 api-client 2 sparksql 2 automl 2 cloud-computing 2 data-pipelines 2 hacktoberfest 2 database 2 azure-data-lake-storage-gen2 2 tableau 2 feature-engineering 2 databricks-api 2 confluent-kafka 2 azure-event-hubs 2 azure-iothub 2 data-pipeline 2 timeseries-forecasting 2 synapse 2 sparkr 2 data-analysis-python 2 azure-delta-lake 2 azure-data-lake-storage 2 spark-structured-streaming 2 azure-ml 2 extract-transform-load 2 model-training-and-evaluation 2 apache-airflow 2 bigdata 2 devops 2 lineage 2 etl-automation 2 pyspark-mllib 2 exploratory-data-analysis 2 spark-streaming 2 azure-arm-template 2 deltalake 2 regression-models 2 data-lake 2 azure-synapse-serverless-sql 2 ssms 2 azuredatafactory 2