GitHub topics: sparknlp
Dirkster99/PyNotes
My notebook on using Python with Jupyter Notebook, PySpark, etc.
Language: Jupyter Notebook - Size: 84.6 MB - Last synced at: 2 months ago - Pushed at: almost 4 years ago - Stars: 11 - Forks: 7

sarahboal/System-Design-for-TripAdvisor-Restaurant-Reviews
System Design and Sentiment Analysis of Restaurant Reviews using Natural Language Processing
Language: Python - Size: 15.7 MB - Last synced at: 5 months ago - Pushed at: 7 months ago - Stars: 1 - Forks: 0

uche-madu/deb-application
This repository contains application code for the Wizeline Data Engineering Bootcamp (DEB) 2023. It is one of two repositories for the DEB. The other houses the infrastructure code.
Language: Python - Size: 125 KB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 1

chuyu-c/NLP-with-Reddit-Comment
This project focuses on the use of big data platforms, specifically Spark (PySpark, SparkML, Spark NLP). We use the text of each comment a user posted to categorize its sentiment and predict its score. Our objective is to understand the dynamics of the Reddit online community and how the way people communicate online leads to different reactions from the community.
Language: Jupyter Notebook - Size: 20.2 MB - Last synced at: about 1 year ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

saadkh1/Bert_Spark_Example
This repository provides examples of using pre-trained BERT models from SparkNLP with PySpark for Natural Language Processing tasks.
Language: Jupyter Notebook - Size: 290 KB - Last synced at: 5 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

AjaySurya-018/Emotion_Detection_in-text
Web application to detect emotion in text
Language: Jupyter Notebook - Size: 1.61 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

databricks-industry-solutions/ocr-phi-masking
Our joint Solution Accelerator with John Snow Labs automates the detection of sensitive information contained within unstructured data using NLP models for healthcare. Extracted data is stored within the Lakehouse, where teams can use the pre-trained models to easily remove, obfuscate or mask data for downstream analytics at massive scale.
Language: Python - Size: 77.1 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 8 - Forks: 3

doshiharmish/PoliticalIdeologiesPredictioninNewsArticles
Media diversity shapes perspectives, yet biased news distorts reality, fostering misinformation. 'Political Ideologies Prediction in News Articles' aims to forecast bias using PySpark, NLP, and ML for adaptable, swift inference. Integrated with NYT API, it predicts bias in top political articles, fostering better understanding of subjective content.
Language: Jupyter Notebook - Size: 15.7 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

sdarjunwadkar/Political-Idealogies-Prediction-in-News-Articles
Media diversity shapes perspectives, yet biased news distorts reality, fostering misinformation. 'Political Ideologies Prediction in News Articles' aims to forecast bias using PySpark, NLP, and ML for adaptable, swift inference. Integrated with NYT API, it predicts bias in top political articles, fostering better understanding of subjective content.
Language: Jupyter Notebook - Size: 16.2 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

databricks-industry-solutions/toxicity-detection-in-gaming
Build a lakehouse for all your gamer data and use natural language processing techniques to flag questionable comments for moderation.
Language: Python - Size: 125 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 5 - Forks: 0

databricks-industry-solutions/oncology
Generate oncology insights from real-world data using NLP. Once extracted, oncology data is enriched with useful information like ICD-10 codes and used to build powerful visualizations.
Language: Python - Size: 265 KB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 3 - Forks: 2

databricks-industry-solutions/jsl-medical-risk-factors
Automated Extraction of Medical Risk Factors For Life Insurance Underwriting
Language: Python - Size: 49.8 KB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 1

databricks-industry-solutions/jsl-financial-nlp
Drawing a Company Ecosystem Graph
Language: Python - Size: 346 KB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

databricks-industry-solutions/medicare-risk-adjustment
Databricks and John Snow Labs Solution Accelerator for Medicare Risk Adjustment automates the extraction of undiagnosed member conditions from unstructured clinical notes with NLP models, improving downstream reimbursements.
Language: Python - Size: 81.1 KB - Last synced at: almost 2 years ago - Pushed at: about 2 years ago - Stars: 5 - Forks: 5

databricks-industry-solutions/adverse-drug-events
To ensure ongoing drug safety, pharma companies need to monitor and report adverse drug events post-market launch. This accelerator extracts, processes and analyzes adverse drug events from real-world text data using NLP.
Language: Python - Size: 105 KB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 4 - Forks: 4

cmsptcp/tsmp
Twitter-based stock market prediction using PySpark, project for Big Data PW 2020L
Language: Jupyter Notebook - Size: 2.43 MB - Last synced at: 2 months ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 1

VirtualRoyalty/spark-nlp-project
Micro project on big data technologies via spark
Language: Jupyter Notebook - Size: 5.12 MB - Last synced at: over 2 years ago - Pushed at: over 3 years ago - Stars: 4 - Forks: 0

prabhupavitra/Text-Summarization-PySpark
Text summarization algorithms using PySpark
Language: Jupyter Notebook - Size: 3.39 MB - Last synced at: 4 months ago - Pushed at: about 4 years ago - Stars: 0 - Forks: 0

gympohnpimol/Spark
Language: Python - Size: 13.7 KB - Last synced at: over 1 year ago - Pushed at: almost 5 years ago - Stars: 0 - Forks: 0
