GitHub topics: databricks-notebooks

inspera/blackbricks

Black for Databricks notebooks

Language: Python - Size: 264 KB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 45 - Forks: 9

DarrenDavy12/Databricks_Projects

Topic-specific projects and an end-to-end project

Size: 211 KB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 0 - Forks: 0

gchandra10/gchandra10

Cloud Solutions Architect & Big Data Strategist | Adjunct Professor | Mentor | Driving Digital Transformation & Data-Driven Solutions

Size: 121 KB - Last synced at: 8 days ago - Pushed at: 12 days ago - Stars: 0 - Forks: 0

jaceklaskowski/learn-databricks

Notebooks to learn Databricks Lakehouse Platform

Language: Jupyter Notebook - Size: 6.74 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 29 - Forks: 16

microsoft/nutter

Testing framework for Databricks notebooks

Language: Python - Size: 208 KB - Last synced at: 5 days ago - Pushed at: about 1 year ago - Stars: 300 - Forks: 44

Seifo321/Microsoft-Data-Engineer-Project

A high-performance ETL pipeline built on Microsoft Azure services that extracts and transforms the BikeStores data and loads it into an Azure data warehouse

Language: Python - Size: 8.06 MB - Last synced at: 22 days ago - Pushed at: 22 days ago - Stars: 0 - Forks: 1

mexmarv/powerbi-databricks-semantic-gen

Power BI to Databricks Semantic Layer Generator

Language: Jupyter Notebook - Size: 39.1 KB - Last synced at: 28 days ago - Pushed at: 28 days ago - Stars: 4 - Forks: 1

RHDZMOTA/databrickstools-cli

A simple command-line application to keep Databricks and your local filesystem in sync.

Language: Python - Size: 31.3 KB - Last synced at: 12 days ago - Pushed at: about 2 years ago - Stars: 1 - Forks: 13

edisedis777/PySpark-ML-Features

A PySpark implementation of 6 lesser-known Scikit-Learn features optimized for Azure Databricks. This project translates powerful machine learning techniques from Scikit-Learn into PySpark's distributed computing framework.

Language: Python - Size: 39.1 KB - Last synced at: 4 days ago - Pushed at: about 2 months ago - Stars: 1 - Forks: 0

databrickslabs/splunk-integration

Databricks Add-on for Splunk

Language: Python - Size: 71.5 MB - Last synced at: 8 days ago - Pushed at: 3 months ago - Stars: 27 - Forks: 18

swapnadb/Nyc-Taxi-Data-Engineering-Project

Language: Jupyter Notebook - Size: 2.94 MB - Last synced at: about 1 month ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

sumit-sinha9/IPL-Data-Analysis-Using-Apache-Spark-on-Databricks

This project focuses on performing an end-to-end analysis of IPL data using Apache Spark on Databricks. It begins with setting up a Databricks environment, followed by ingesting and exploring the IPL dataset.

Language: Jupyter Notebook - Size: 2.43 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

DarrenDavy12/Earthquake-Events-and-Risks-Project---Azure-Data-Pipeline---API-Connection-

Earthquake Events and Risks Project - Azure Data Pipeline - API Connection

Size: 3.94 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

bytebyrajeev/IPL-Data-Analysis-Project-Using-Apache-Spark-on-Databricks

This project focuses on performing an end-to-end analysis of IPL data using Apache Spark on Databricks. It begins with setting up a Databricks environment, followed by ingesting and exploring the IPL dataset.

Language: Jupyter Notebook - Size: 2.43 MB - Last synced at: 18 days ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

bytebyrajeev/Flipkart-Data-Analysis-Using-PySpark-on-Databricks

The project focuses on building an end-to-end data engineering pipeline using PySpark to address real-world business scenarios. Key steps include exploring and understanding the dataset structure, performing data cleaning to handle inconsistencies, and applying transformations to prepare the data for analysis.

Language: Jupyter Notebook - Size: 871 KB - Last synced at: about 19 hours ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

astronomer/astro-provider-databricks 📦

Orchestrate your Databricks notebooks in Airflow and execute them as Databricks Workflows

Language: Python - Size: 11.1 MB - Last synced at: about 1 month ago - Pushed at: 10 months ago - Stars: 23 - Forks: 12

easonlai/Samples_for_Azure_Databricks_Orientation

Samples for Azure Databricks Orientation

Language: HTML - Size: 6.78 MB - Last synced at: about 2 months ago - Pushed at: over 4 years ago - Stars: 5 - Forks: 2

Saikesana31/Netflix

Azure Data engineering project

Language: Python - Size: 1.3 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

VandanaBhumireddygari/Open-Table-Formats-with-Databricks-and-Delta-Lake

This project demonstrates the use of Open Table Formats with Databricks, PySpark, and Delta Lake. It covers data ingestion, transformation, querying, and storage management using Delta tables. The project includes code for loading data, writing it to Delta format, querying, and utilizing Delta Lake

Language: Jupyter Notebook - Size: 0 Bytes - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

Azure-Samples/azure-databricks-mlops-mlflow

Azure Databricks MLOps sample for Python based source code using MLflow without using MLflow Project.

Language: Jupyter Notebook - Size: 3.2 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 84 - Forks: 53

Azure/azure-cosmosdb-spark 📦

Apache Spark Connector for Azure Cosmos DB

Size: 192 MB - Last synced at: 2 days ago - Pushed at: 3 months ago - Stars: 203 - Forks: 120

julianolaurentino/Databricks-notebooks-vendidos

Using Databricks to analyze a SQL dataset of notebook sales

Language: SQL - Size: 43 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0

Aaryan-Agr/2015-Yellow-Taxi-Data-Analysis

Exploring NYC Yellow Cab trips through comprehensive EDA techniques to uncover usage patterns and insights.

Language: Jupyter Notebook - Size: 0 Bytes - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

santiagortiiz/Advanced-Data-Engineering-with-Databricks

Databricks. Incremental data processing, task orchestration, and production job monitoring.

Language: Python - Size: 121 KB - Last synced at: 2 months ago - Pushed at: over 1 year ago - Stars: 18 - Forks: 35

mensenvau/data_migration_validation

Data Validation Documentation for Source and Target Tables in Databricks

Language: Python - Size: 4.88 KB - Last synced at: about 19 hours ago - Pushed at: 4 months ago - Stars: 1 - Forks: 1

santoshshinde2012/medallion-architecture-databrics

Medallion Architecture: Principles and Practical Exploration

Language: Jupyter Notebook - Size: 18.8 MB - Last synced at: 9 days ago - Pushed at: 4 months ago - Stars: 0 - Forks: 1

NHSDigital/sde_example_analysis

Example of what you can do in Databricks in the Secure Data Environment (SDE) using Python, SQL, and R.

Language: Python - Size: 83 KB - Last synced at: 2 months ago - Pushed at: over 1 year ago - Stars: 3 - Forks: 0

ThaiTechTales/databricks

This repository is dedicated to showcasing projects built on Databricks, focusing on big data analytics, data engineering, and machine learning workflows.

Size: 25.4 KB - Last synced at: 3 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

hmiladhia/nbmanips

nbmanips allows you to easily manipulate ipynb files

Language: Python - Size: 1.01 MB - Last synced at: about 1 month ago - Pushed at: 4 months ago - Stars: 10 - Forks: 1

mananabbasi/Data-Science-Complete-Project-using-Big-Data-Tools-Techniques-

This repository contains Databricks projects utilizing RDDs, DataFrames, and SQL to process and analyze various real-world datasets. Data cleaning and analysis have been performed using PySpark functions to handle challenges such as inconsistent formats, missing values, and complex data structures. The project ensures efficient data transformation

Language: HTML - Size: 3.71 MB - Last synced at: about 1 month ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

Majdi-Akrmi/ELT-IPL

This is an end-to-end data engineering project using the IPL dataset.

Language: Jupyter Notebook - Size: 1.67 MB - Last synced at: 2 months ago - Pushed at: about 1 year ago - Stars: 8 - Forks: 5

Databricks-BR/Genie_MS_Teams (fork of carrossoni/DatabricksGenieBOT)

Genie Spaces API on Microsoft Teams

Language: Python - Size: 45.9 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 1

analyticalmonk/pyspark_nlp_workshop

Instructions and code for the workshop "From Big Data to NLP Insights: Unlocking the Power of PySpark and Spark NLP"

Language: Jupyter Notebook - Size: 622 KB - Last synced at: 2 months ago - Pushed at: about 2 years ago - Stars: 13 - Forks: 2

tknishh/olympic-data-analysis-azure

End-to-end data engineering project using Azure Databricks as the cloud service and Tokyo Olympic data

Language: Jupyter Notebook - Size: 1.07 MB - Last synced at: 3 months ago - Pushed at: almost 2 years ago - Stars: 3 - Forks: 3

tomaztk/Azure-Databricks

Azure Databricks - Advent of 2020 Blogposts

Language: Jupyter Notebook - Size: 44.9 MB - Last synced at: 2 months ago - Pushed at: over 2 years ago - Stars: 60 - Forks: 49

microsoft/A-TALE-OF-THREE-CITIES

Analyzing the safety (311) dataset published by Azure Open Datasets for Chicago, Boston and New York City using SparkR, SparkSQL and Azure Databricks, with visualization using ggplot2 and leaflet. Focus is on descriptive analytics, visualization, clustering, time series forecasting and anomaly detection.

Language: R - Size: 21.8 MB - Last synced at: 5 days ago - Pushed at: about 4 years ago - Stars: 86 - Forks: 34

ajaxbarcelonacruyff/databricks_bigquery

Extract BigQuery tables in Databricks Notebook

Language: Jupyter Notebook - Size: 391 KB - Last synced at: 4 months ago - Pushed at: 6 months ago - Stars: 1 - Forks: 0

nasirkadri2601/Live_Cricket_Data_Pipeline

This project captures live cricket data in raw JSON format, cleans and transforms it, and stores it in a centralized data warehouse. The data is then used for analysis, including match outcome predictions, player performance, and team strategy insights, enabling data-driven decisions.

Language: Jupyter Notebook - Size: 25.6 MB - Last synced at: 3 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

Databricks-BR/startkit

Accelerator package for getting started with Databricks.

Language: Python - Size: 1.62 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 1 - Forks: 0

TravelXML/APACHE-SPARK-PYSPARK-DATABRICKS

APACHE SPARK: Data Analysis, Transformation, and Visualisation with PySpark, IPL Data Analysis

Language: Jupyter Notebook - Size: 2.25 MB - Last synced at: about 1 month ago - Pushed at: 10 months ago - Stars: 1 - Forks: 1

euiyounghwang/euiyounghwang.github.io

Software Engineer: Euiyoung Hwang

Size: 3.33 MB - Last synced at: 3 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

Phelipe-Sempreboni/databricks

Repository for tutorials, information and notes about databricks.

Language: Jupyter Notebook - Size: 19.5 MB - Last synced at: 4 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

bhavanachitragar/Flipkart-Data-Analysis-Using-PySpark-on-Databricks

The project focuses on building an end-to-end data engineering pipeline using PySpark to address real-world business scenarios. Key steps include exploring and understanding the dataset structure, performing data cleaning to handle inconsistencies, and applying transformations to prepare the data for analysis.

Language: Jupyter Notebook - Size: 880 KB - Last synced at: 3 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

ac-gomes/data-engineering-with-databricks

A simple boilerplate for data engineering and data analysis training in Databricks.

Size: 72.3 KB - Last synced at: about 2 months ago - Pushed at: over 1 year ago - Stars: 3 - Forks: 3

fvaleye/delta-buddy

Introducing Delta-Buddy: Your ultimate Delta Lake companion! 🚀 Streamline your data journey with an AI-powered chatbot. Ask Delta-Buddy anything about your Delta Lake.

Language: Python - Size: 106 KB - Last synced at: 7 months ago - Pushed at: almost 2 years ago - Stars: 9 - Forks: 1

RenanBjj/Databricks-PowerBI-OpticalSales

Databricks optical sales

Language: Jupyter Notebook - Size: 2.08 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

pereldegla/twitter-trend-sentiment-analysis-world-cup-use-case

How to get closer to the audience using Twitter: a use case following the France football team's run during the 2022 World Cup

Language: Jupyter Notebook - Size: 28.3 KB - Last synced at: 10 months ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 1

bhavanachitragar/IPL-Data-Analysis-Project-Using-Apache-Spark-on-Databricks

This project focuses on performing an end-to-end analysis of IPL data using Apache Spark on Databricks. It begins with setting up a Databricks environment, followed by ingesting and exploring the IPL dataset.

Language: Jupyter Notebook - Size: 1.64 MB - Last synced at: about 1 month ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

prateekmaj21/Big-Data-Engineering

Code files for Databricks

Language: Jupyter Notebook - Size: 10 MB - Last synced at: about 1 month ago - Pushed at: 11 months ago - Stars: 0 - Forks: 1

matcrg/MVP-Engenharia-de-Dados-PUC-Rio

This repository stores the files for the final assessment of the Data Engineering Sprint of the Postgraduate course in Data Science and Analytics at PUC-Rio

Language: HTML - Size: 509 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

Naga-Manohar-Y/PySpark_Sales_Data_Analysis

Unlock hidden insights in your sales data with PySpark! Explore customer spending patterns, product popularity, and sales trends to drive better business decisions.

Language: HTML - Size: 2.65 MB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 0 - Forks: 0

rhejos/ipl_data_analysis

This project explores data analysis of the Indian Premier League using AWS S3, Apache Spark, Python, and SQL.

Language: Python - Size: 276 KB - Last synced at: 3 months ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

stivenramireza/spark-text-mining

Big data processing of news with Text Mining in Apache Spark through 3 fundamental processes: data preparation, searching based on the inverted index and grouping of news by similarity.

Language: Python - Size: 161 KB - Last synced at: about 1 year ago - Pushed at: almost 6 years ago - Stars: 0 - Forks: 1

ahmedlrashed/E2E-Azure-Pipeline

Databricks ETL Pipeline for retrieving and processing NI TestStand test results, featuring a well-documented notebook for ETL operations, Data Lake for storage, Spark SQL+Python for transformations, and Power BI as the final visualization of factory metrics.

Language: Jupyter Notebook - Size: 1.22 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

brennerh1/databricks-demos

Repository of notebooks and related collateral used in the Databricks Demo Hub, showing how to use Databricks, Delta Lake, MLflow, and more.

Language: Python - Size: 1.06 MB - Last synced at: 26 days ago - Pushed at: about 4 years ago - Stars: 25 - Forks: 52

Giray18/de_task

data engineering task solution

Language: Jupyter Notebook - Size: 675 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

javiizz/SparkProjects-Healthcare_Analysis

Language: Jupyter Notebook - Size: 12.8 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

sana1410/NYPD-Arrest-Data-Year-to-Date

This repository is used to perform data analysis using Databricks and Tableau on NYC crime datasets

Language: HTML - Size: 1.77 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

Betico1928/Talleres-ProcesamientoDeDatosAGranEscala

Exploring the principles of large-scale data processing through Databricks and Spark workshops. Learn tools such as Pandas and PySpark for efficient analysis of large datasets. Taught by John Corredor at the Pontificia Universidad Javeriana.

Language: Jupyter Notebook - Size: 203 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

nabojyoti/ELT-IPL

This is an end-to-end data engineering project using the IPL dataset.

Language: Jupyter Notebook - Size: 1.67 MB - Last synced at: 2 months ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

Annielytix/Advanced-Databricks-for-ML-Build-2019

Using Azure Databricks (Spark) for ML, this is the //build 2019 repository with homework examples, code and notebooks

Language: Jupyter Notebook - Size: 14.4 MB - Last synced at: about 1 year ago - Pushed at: almost 6 years ago - Stars: 14 - Forks: 18

newrelic-experimental/nri-spark

This New Relic standalone integration polls the Apache Spark REST API for metrics and pushes them into New Relic using the Metrics API. It uses the New Relic Telemetry SDK for Go.

Language: Go - Size: 16.2 MB - Last synced at: 10 days ago - Pushed at: about 1 year ago - Stars: 6 - Forks: 8

dogucanelci/Azure_e2e_data_engineering_project_1

Language: Jupyter Notebook - Size: 11.6 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 0

AbdelmajidLh/spark-functionality-repo

This GitHub repository contains a detailed document on the basics of the Scala language.

Size: 847 KB - Last synced at: 3 months ago - Pushed at: about 1 year ago - Stars: 3 - Forks: 0

devallasaitej/Databricks_DataShift_Shuttle

Databricks DataShift Shuttle

Language: Jupyter Notebook - Size: 101 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

irfanghat/ADVANCED_DATABRICKS_CONCEPTS

This repository contains advanced Databricks concepts

Size: 5.86 KB - Last synced at: about 22 hours ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

itsfelipe-dev/ETL_Azure_Databricks_F1

A Formula One (F1) ETL project in Azure: Azure Data Lake, Data Factory, Azure Databricks

Language: Python - Size: 4.88 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

Srking501/csc8101_coursework

A summative coursework for CSC8101 Engineering for AI

Language: Jupyter Notebook - Size: 168 KB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

retkowsky/Cloud_Workshop_AzureDatabricks

Cloud Workshop Azure Databricks

Language: Jupyter Notebook - Size: 39.2 MB - Last synced at: 21 days ago - Pushed at: almost 6 years ago - Stars: 4 - Forks: 3

RhythmBear/wizeline-capstone-project

My Wizeline Academy Data Engineering Capstone Project Using Astronomer for my managed Airflow Instance.

Language: Jupyter Notebook - Size: 459 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

anilkulkarni87/databricks_notebooks

A collection of Databricks notebooks for testing and learning

Language: HTML - Size: 4.22 MB - Last synced at: about 1 year ago - Pushed at: over 3 years ago - Stars: 3 - Forks: 1

sanogotech/Pyspark-With-Python (fork of krishnaik06/Pyspark-With-Python)

PySpark

Language: Jupyter Notebook - Size: 34.2 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

Akash8K/Stocks-Data-Analysis-In-DataBricks

Stocks Data Analysis In DataBricks - Using SQL and PySpark

Language: HTML - Size: 1.84 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

marigroc/pinterest-data-pipeline53

Creation of a near-real-time data processing pipeline for Pinterest posts.

Language: Jupyter Notebook - Size: 2 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

sangwanamit621/sql-solutions-in-pyspark-dataframe-api-and-spark-sql

This repository contains my solutions to various SQL problems from LeetCode, implemented using PySpark DataFrame API and Spark SQL. The goal is to provide alternative solutions and insights for SQL enthusiasts who want to explore the power of PySpark and Spark SQL.

Language: Jupyter Notebook - Size: 71.3 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

hannah0wang/end-to-end-data-reporting

End to end data reporting project using Azure services like Azure Data Factory for data orchestration, Azure Synapse Analytics for data warehousing, Databricks for data transformations, and Power BI for intuitive data visualization and reporting.

Language: Jupyter Notebook - Size: 515 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

microsoft/OnDemandMLflowTrainAndServe 📦

A solution for on-demand training and serving of Machine Learning models, using Azure Databricks and MLflow

Language: Python - Size: 429 KB - Last synced at: 5 days ago - Pushed at: almost 5 years ago - Stars: 18 - Forks: 12

hjh17/dbloy

Continuous Delivery tool for PySpark notebook-based jobs on Databricks

Language: Python - Size: 591 KB - Last synced at: 28 days ago - Pushed at: about 4 years ago - Stars: 1 - Forks: 1

BlueprintTechnologies/Blueprint-Databricks-Anomaly-Detection-Accelerator

Accelerator code for an anomaly detection module leveraging Databricks for use as part of a Network Threat Detection System

Language: Python - Size: 11.4 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 2 - Forks: 0

aessing/demo-mdwh

Modern Data Warehouse demos with Azure Databricks, Azure Data Factory & Azure Dedicated SQL pool (formerly SQL DW)

Size: 48.3 MB - Last synced at: about 1 month ago - Pushed at: over 4 years ago - Stars: 4 - Forks: 2

Annielytix/DevOpsforDatabricks

Are you, like me, a Senior Data Scientist wanting to learn more about how to approach DevOps, specifically when using Databricks (workspaces, notebooks, libraries, etc.)? Set up using @Azure @Databricks

Language: PowerShell - Size: 4.1 MB - Last synced at: about 1 year ago - Pushed at: almost 6 years ago - Stars: 12 - Forks: 14

arnoldchrisoduor1/LinearRegression-Model-with-ApacheSpark-and-DataBricks

Using Apache PySpark on Databricks, I performed feature engineering on customer data, then trained and used a linear regression model to predict customers' bills based on previous customer trends.

Language: Jupyter Notebook - Size: 3.91 KB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

shubhammirajkar/uber_etl_data_engineering_project

An ETL Pipeline built over GCP and orchestrated by Mage, which involves Extracting Data from GCS Bucket, building Dimensional Model, loading the Data into BigQuery and a Looker Dashboard for further analysis.

Language: Jupyter Notebook - Size: 5.7 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

quadrantofsola/PySpark_RDD

Analysis of Clinical Trial Dataset using PySpark RDD implementation.

Size: 3.91 KB - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

quadrantofsola/PySpark_Dataframes

Analysis of Clinical Trial Dataset using Dataframes on PySpark

Size: 2.93 KB - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

dg1223/explainable-ai

Model interpretability for Explainable Artificial Intelligence

Language: Jupyter Notebook - Size: 50.8 MB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

easonlai/eda_for_prudential_life_insurance_sample_data

Notebook sample of Exploratory Data Analysis (EDA) for Prudential Life Insurance Sample Data

Language: Jupyter Notebook - Size: 4.18 MB - Last synced at: 4 months ago - Pushed at: almost 4 years ago - Stars: 1 - Forks: 0

BlueprintTechnologies/Blueprint-Databricks-Demand-Forecasting-Accelerator

Accelerator code for demand planning for retail supply chain, leveraging Databricks

Language: Jupyter Notebook - Size: 7.77 MB - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 2 - Forks: 0

Luizfelz/genomics_advances_monitoring_system

📑 System for Monitoring Advances in the field of Genomics

Size: 151 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

mathewsrc/machine-failure-prediction

Predicting machine failure

Language: Jupyter Notebook - Size: 6.34 MB - Last synced at: about 1 year ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

prasanth-m/databricks

Language: HTML - Size: 444 KB - Last synced at: over 1 year ago - Pushed at: about 4 years ago - Stars: 0 - Forks: 0

Pratikdomadiya/Databricks_workspace

Data exploration, Preprocessing, Analysis, and visualization using PySpark, SparkSQL, Pandas, and Python.

Language: Jupyter Notebook - Size: 4.22 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

Muhyd33n/Formula1RacingProject

Real World Project on Formula1 Racing using Azure Databricks, Delta Lake, Unity Catalog, Azure Data Factory [DP203]

Language: Python - Size: 22.5 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

sfrechette/azure-databricks-citibike-nyc-analysis

Analyzing NYC bike data with Azure Databricks

Language: Jupyter Notebook - Size: 5.86 KB - Last synced at: over 1 year ago - Pushed at: almost 5 years ago - Stars: 5 - Forks: 0

JL200/Big-data-project-ADA

Language: Jupyter Notebook - Size: 2.84 MB - Last synced at: almost 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

ritamghoshgds/DnA-F1-POC

The project harnessed an ETL multi-hop architecture, ingesting data from the Ergast API into a storage backed by Azure Data Lake. The process involved weekly ingestion of bronze layer data as cutover and delta files. Raw data, in varied formats, was transformed using Azure Databricks PySpark notebooks into enriched Silver and Gold layers.

Language: Python - Size: 5.67 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0