GitHub topics: databricks-industry-solutions
databricks-industry-solutions/digital-pathology
Help augment diagnostic workflows with this Databricks Solution Accelerator for pathology image analysis. Now you can rapidly process thousands of whole slide images in minutes and use machine learning to automate the detection of metastasis.
Language: Python - Size: 6.91 MB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 16 - Forks: 12

databricks-industry-solutions/segmentation
Create advanced customer segments to drive better purchasing predictions based on behaviors. Using sales data, campaigns and promotions systems, this solution helps derive a number of features that capture the behavior of various households. Build useful customer clusters to target with different promos and offers.
Language: Python - Size: 188 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 8 - Forks: 6

ricardolsmendes/fine-grained-demand-forecasting-infra
Infrastructure provisioning for a customized approach to the Databricks Fine-grained Demand Forecasting accelerator
Language: HCL - Size: 16.6 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

ricardolsmendes/fine-grained-demand-forecasting Fork of databricks-industry-solutions/fine-grained-demand-forecasting
Customized approach to the Databricks Fine-grained Demand Forecasting accelerator, adapted for the Medallion Architecture and Unity Catalog
Language: R - Size: 54.7 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

mananabbasi/Data-Science-Complete-Project-using-Big-Data-Tools-Techniques-
This repository contains Databricks projects utilizing RDDs, DataFrames, and SQL to process and analyze various real-world datasets. Data cleaning and analysis have been performed using PySpark functions to handle challenges such as inconsistent formats, missing values, and complex data structures. The project ensures efficient data transformation
Language: HTML - Size: 3.71 MB - Last synced at: 19 days ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

Databricks-BR/startkit
Accelerator package for getting started with Databricks.
Language: Python - Size: 1.62 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 1 - Forks: 0

databricks-industry-solutions/interop
From FHIR ingestion to patient outcomes analysis
Language: Python - Size: 124 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 12 - Forks: 6

databricks-industry-solutions/context-graph-analytics
Time series knowledge graphs for cybersecurity
Language: Python - Size: 20 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 18 - Forks: 6

databricks-industry-solutions/smolder-solacc 📦
Burning Through Electronic Health Records in Real Time With Smolder
Language: Scala - Size: 54.7 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 4 - Forks: 3

databricks-industry-solutions/csrd_assistant
In this solution accelerator co-developed with Deloitte France, we demonstrate how generative AI, retrieval-augmented generation (RAG) and multi-stage reasoning can be used to better navigate the complexities of regulatory filings, bringing more transparency for companies to disclose their societal and environmental impacts.
Language: Python - Size: 5.16 MB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 0 - Forks: 0

databricks-industry-solutions/ioc-matching
IOC matching for incident responders, threat hunters, detection engineers, and security engineers.
Language: Python - Size: 144 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 10 - Forks: 5

federicopfund/data-engineer
ETL process
Language: Jupyter Notebook - Size: 84.5 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 3 - Forks: 0

databricks-industry-solutions/esg-scoring
In this solution, we offer a novel approach to sustainable finance by combining NLP techniques and news analytics to extract key strategic ESG initiatives and learn companies' commitments to corporate responsibility
Language: Python - Size: 6.35 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 35 - Forks: 23

databricks-industry-solutions/nasdaq-crypto
Nasdaq Data Link Digital Assets is a part of Nasdaq's Investment Intelligence suite of products, designed to provide significant value to customers in making informed decisions. As the creator of the world's first electronic stock market, Nasdaq technology powers more than 70 marketplaces in 50 countries, and one in ten of the world's securities tr
Language: Python - Size: 639 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 3

databricks-industry-solutions/geospatial-kanonymity
We demonstrate how FSI can leverage geospatial and graph analytics to anonymize card transaction data and monetize their assets to potential clients securely via Delta Sharing
Language: Python - Size: 21.5 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 1

databricks-industry-solutions/transaction-embedding
In this solution accelerator, we build a data asset that captures a full picture of the consumer and goes beyond traditional demographics, income, product and services (who you are) and extends to transactional behavior and shopping preferences (how you bank)
Language: Python - Size: 475 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 1

databricks-industry-solutions/regular-payments
In this solution accelerator, we demonstrate a novel approach to consumer analytics by combining core mathematical concepts with engineering best practices and state-of-the-art optimization techniques to better model customers' behaviors and provide millions of customers with personalized insights
Language: Python - Size: 1.02 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 1

databricks-industry-solutions/car-classification
By applying transfer learning on pre-trained neural networks, we demonstrate how Databricks helps insurance companies kickstart their AI/Computer Vision journey towards claim assessment and damage estimation.
Language: Python - Size: 2.05 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 2 - Forks: 1

databricks-industry-solutions/reg-reporting
In this regulatory reporting solution accelerator, we demonstrate how Delta Live Tables can guarantee the acquisition and processing of regulatory data in real time to accommodate regulatory SLAs. With Delta Sharing and Delta Live Tables combined, analysts gain real-time confidence in the quality of regulatory data being transmitted.
Language: Python - Size: 440 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 2 - Forks: 1

databricks-industry-solutions/geoscan-fraud
In this series of notebooks centered around geospatial analytics, we demonstrate how Lakehouse enables organizations to better understand customers' behaviours, no longer based on who they are but on how they bank, and no longer using a one-size-fits-all rule but a truly personalized AI
Language: Python - Size: 3.19 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 3 - Forks: 4

databricks-industry-solutions/merchant-classification
This series of notebooks shows how the Lakehouse for Financial Services enables banks, open banking aggregators and payment processors to address the challenge of merchant classification
Language: Python - Size: 2.64 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 4 - Forks: 5

databricks-industry-solutions/predicting-implied-volatility
Language: Python - Size: 58.6 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 4 - Forks: 0

databricks-industry-solutions/digitization-documents
Using Apache Tika and Tesseract to extract text from any document
Language: Python - Size: 1.87 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 5 - Forks: 4

databricks-industry-solutions/quant-beta-capm
Equity Beta Calculation and CAPM
Language: Python - Size: 2.64 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 7 - Forks: 1

databricks-industry-solutions/value-at-risk
Shows how banks can modernize their risk management practices by back-testing, aggregating and scaling simulations by using a unified approach to data analytics with the Lakehouse.
Language: Python - Size: 1.13 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 9 - Forks: 7

databricks-industry-solutions/ocr-phi-masking
Our joint Solution Accelerator with John Snow Labs automates the detection of sensitive information contained within unstructured data using NLP models for healthcare. Extracted data is stored within the Lakehouse, where teams can use the pre-trained models to easily remove, obfuscate or mask data for downstream analytics at massive scale.
Language: Python - Size: 77.1 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 8 - Forks: 3

databricks-industry-solutions/ship2ship-transfers
Ship-to-Ship Transfer Identification using Geospatial Analytics
Language: Python - Size: 63.5 KB - Last synced at: over 1 year ago - Pushed at: about 2 years ago - Stars: 1 - Forks: 1

databricks-industry-solutions/smart-claims
Use Databricks to improve the Claims Management process for faster claims settlement, lower claims processing costs and quicker identification of possible fraud
Language: Python - Size: 194 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 5 - Forks: 3

databricks-industry-solutions/customer-lifetime-value
Ingest sample retail data, build visualizations to explore past purchase behavior and use machine learning to predict the likelihood of future purchases
Language: Python - Size: 134 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 13 - Forks: 5

databricks-industry-solutions/glow-solution-accelerator
Genome-wide association studies identify genetic variations associated with a target disease or trait. Researchers and clinicians can use this information to better detect, treat and prevent chronic health conditions. This Solution Accelerator notebook builds on top of Glow
Language: Python - Size: 95.7 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 11 - Forks: 6

databricks-industry-solutions/toxicity-detection-in-gaming
Build a lakehouse for all your gamer data and use natural language processing techniques to flag questionable comments for moderation.
Language: Python - Size: 125 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 5 - Forks: 0

databricks-industry-solutions/customer-er
Translating text attributes (like name, address, phone number) into quantifiable numerical representations; training ML models to determine if these numerical labels form a match; scoring the confidence of each match
Language: Python - Size: 137 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 15 - Forks: 6

databricks-industry-solutions/safety-stock
Create fine-grained and viable estimates of buffer stock for raw material, work-in-progress or finished goods inventory items that can be scaled across the supply chain. Free up working capital that would be tied up in inventory and reallocate to more productive uses.
Language: Python - Size: 80.1 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0

databricks-industry-solutions/wide-and-deep
Build a wide-and-deep recommender with collaborative filters that takes advantage of patterns of repeat purchases to suggest both previously purchased and related products.
Language: Python - Size: 89.8 KB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 1

databricks-industry-solutions/social-determinants-of-health
Using Delta Sharing to Democratize Insights Into Social Determinants of Health
Language: Python - Size: 85 KB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 4

databricks-industry-solutions/psm
How to use the Machine Learning Runtime and MLflow on top of a health Delta Lake to predict patient disease
Language: Python - Size: 60.5 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 2

databricks-industry-solutions/real-time-bidding
From display to video, the value of an impression can only be realized if an ad is viewed by a user. Therefore, when using programmatic advertising to buy inventory, it's important to take viewability into account. In this Solution Accelerator, learn how to predict ad viewability to optimize your real-time bidding strategy.
Language: Python - Size: 64.5 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 2

databricks-industry-solutions/oncology
Generate oncology insights from real-world data using NLP. Once extracted, oncology data is enriched with useful information like ICD-10 codes and used to build powerful visualizations
Language: Python - Size: 265 KB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 3 - Forks: 2

databricks-industry-solutions/legend-getting-started
Language: Python - Size: 276 KB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

databricks-industry-solutions/optimized-picking
Get started with our Solution Accelerator for Order Picking to apply optimization logic to each order to: avoid unexpected delivery outcomes and assess the impact of small variations on order picking; minimize total store travel time to increase profitability
Language: Python - Size: 83 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

databricks-industry-solutions/jsl-kg-cohorts
Building Patient Cohorts with NLP and Knowledge Graphs
Language: Python - Size: 197 KB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 3 - Forks: 1

databricks-industry-solutions/fraud-orchestration
Preempt fraud with rule-based patterns and select ML algorithms for reliable fraud detection. Use anomaly detection and fraud prediction to respond to bad actors rapidly.
Language: Python - Size: 152 KB - Last synced at: almost 2 years ago - Pushed at: about 2 years ago - Stars: 4 - Forks: 6

databricks-industry-solutions/medicare-risk-adjustment
Databricks and John Snow Labs Solution Accelerator for Medicare Risk Adjustment automates the extraction of undiagnosed member conditions from unstructured clinical notes with NLP models, improving downstream reimbursements.
Language: Python - Size: 81.1 KB - Last synced at: almost 2 years ago - Pushed at: about 2 years ago - Stars: 5 - Forks: 5

databricks-industry-solutions/computer-vision-foundations
Enabling Computer Vision Applications With the Data Lakehouse
Language: Python - Size: 95.7 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 2 - Forks: 2

databricks-industry-solutions/churn
Develop an understanding of how a customer lifetime should progress and examine where in that lifetime journey customers are likely to churn so you can effectively manage retention and reduce your churn rate.
Language: Python - Size: 142 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 2 - Forks: 1

databricks-industry-solutions/ab-testing π¦
Language: Python - Size: 114 KB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 2

databricks-industry-solutions/als-recommender
Products We Think You Might Like: Generating Personalized Recommendations Using Matrix Factorization
Language: Python - Size: 43 KB - Last synced at: almost 2 years ago - Pushed at: about 2 years ago - Stars: 1 - Forks: 2

databricks-industry-solutions/anti-money-laundering
AML Solutions at Scale Using Databricks Lakehouse Platform
Language: Python - Size: 69.3 KB - Last synced at: almost 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 2

databricks-industry-solutions/routing
Get started with our Solution Accelerator for Scalable Route Generation to optimize delivery routes and increase profitability
Language: Python - Size: 137 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 3 - Forks: 2

databricks-industry-solutions/multi-touch-attribution
Connect the impact of marketing and your ad spend to sales. Efficiently pinpoint the impact of various revenue-generating marketing activities to understand what works best. Focus on the best-performing channels to optimize media mix and drive revenue.
Language: Python - Size: 87.9 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 2 - Forks: 4

databricks-industry-solutions/parts-demand-forecasting
Perform demand forecasting at the part level rather than the aggregate level to minimize disruptions in your supply chain and increase sales. Manage material shortages and predict overplanning
Language: Python - Size: 138 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 2

databricks-industry-solutions/pos-dlt
Get started with our Solution Accelerator to rapidly ingest all data sources and types at scale, build highly scalable streaming data pipelines with Delta Live Tables to obtain a real-time view of operations, and leverage real-time insights to tackle your most pressing in-store information needs
Language: Python - Size: 111 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 2 - Forks: 4

databricks-industry-solutions/omop-cdm
Unlocking the Power of Health Data With a Modern Data Lakehouse
Language: Python - Size: 97.7 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 8 - Forks: 5

databricks-industry-solutions/factory-optimization
Overall Equipment Effectiveness: Performant and Scalable End-to-End Equipment Monitoring
Language: Python - Size: 48.8 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 3 - Forks: 3

databricks-industry-solutions/survival-analysis
Survival analysis is a collection of statistical methods used to examine and predict the time until an event of interest occurs. In this Solution Accelerator, learn how to use different survival analysis techniques for predicting churn and calculating lifetime value.
Language: Python - Size: 73.2 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 4 - Forks: 3

databricks-industry-solutions/edge-ml-for-manufacturing
Deploying and Maintaining Models on the Edge in Manufacturing
Language: Python - Size: 1.67 MB - Last synced at: almost 2 years ago - Pushed at: about 2 years ago - Stars: 3 - Forks: 1

databricks-industry-solutions/digital-twin
Digital twins are created using data derived from sensors (often IoT or IIoT) that are attached to or embedded in the original object. This data provides both structural and operational views of what happens to the object in real time, allowing engineers to monitor systems and model systems dynamics.
Language: Python - Size: 52.7 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 3 - Forks: 4

databricks-industry-solutions/dns-analytics
Leverage the Databricks Solution Accelerator for DNS analytics to accelerate time to detection and response across petabytes of data. Tap into DNS traffic logs, enrich streaming threat intelligence, and apply advanced analytics to detect DNS abnormalities and prevent malicious attacks.
Language: Python - Size: 79.1 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 3 - Forks: 6

databricks-industry-solutions/fuzzy-item-matching
Use machine learning and the Databricks Lakehouse Platform for product matching that can be used by marketplaces and suppliers for various purposes. Resolve differences between product definitions and descriptions and determine which items are likely pairs and which are distinct across disparate data sets.
Language: Python - Size: 61.5 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 2 - Forks: 3

databricks-industry-solutions/fine-grained-demand-forecasting
Perform fine-grained forecasting at the store-item level in an efficient manner, leveraging the distributed computational power of the Databricks Lakehouse Platform.
Language: R - Size: 145 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 6 - Forks: 6

databricks-industry-solutions/image-based-recommendations
Build a similarity-based image recommendation system for e-commerce that takes into account the visual similarity of items as an input for making product recommendations.
Language: Python - Size: 72.3 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 4

databricks-industry-solutions/adverse-drug-events
To ensure ongoing drug safety, pharma companies need to monitor and report adverse drug events post-market launch. This accelerator extracts, processes and analyzes adverse drug events from real-world text data using NLP
Language: Python - Size: 105 KB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 4 - Forks: 4

databricks-industry-solutions/survival
Preempt churn with the Databricks Solution Accelerator for predicting subscriber attrition. Learn how to analyze behavioral data to identify subscribers with an increased risk of cancellation. Then use machine learning to quantify the likelihood to churn as well as indicate which factors explain that risk.
Language: Python - Size: 83 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 4

databricks-industry-solutions/propensity
Get started with our Solution Accelerator for Propensity Scoring to build effective propensity scoring pipelines that: enable the persistence, discovery and sharing of features across various model training exercises; quickly generate models by leveraging industry best practices; track and analyze the various model iterations generated
Language: Python - Size: 101 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 2 - Forks: 1

databricks-industry-solutions/on-shelf-availability
This Solution Accelerator shows how out-of-stocks (OOS) can be solved with real-time data and analytics, using the Databricks Lakehouse Platform to improve on-shelf availability and increase retail sales. The accelerator can also be used for supply chain solutions.
Language: Python - Size: 53.7 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 2

databricks-industry-solutions/video-streaming-qoe
Increase viewer retention through data-driven engagement strategies: analyze both streaming and batch data sets to ensure a performant streaming content experience that drives engagement and loyalty.
Language: Python - Size: 46.9 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

databricks-industry-solutions/market-basket-analysis
Increase conversion with personalized recommendations: Build a recommender that leverages product affinities to suggest additional items
Language: Python - Size: 71.3 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

databricks-industry-solutions/campaign-effectiveness
Identifying Campaign Effectiveness For Forecasting Foot Traffic
Language: Python - Size: 203 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 1
