An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: sparknlp

Dirkster99/PyNotes

My notebook on using Python with Jupyter Notebook, PySpark, etc.

Language: Jupyter Notebook - Size: 84.6 MB - Last synced at: 2 months ago - Pushed at: almost 4 years ago - Stars: 11 - Forks: 7

sarahboal/System-Design-for-TripAdvisor-Restaurant-Reviews

System Design and Sentiment Analysis of Restaurant Reviews using Natural Language Processing

Language: Python - Size: 15.7 MB - Last synced at: 5 months ago - Pushed at: 7 months ago - Stars: 1 - Forks: 0

uche-madu/deb-application

This repository contains application code for the Wizeline Data Engineering Bootcamp (DEB) 2023. It is one of two repositories for the DEB. The other houses the infrastructure code.

Language: Python - Size: 125 KB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 1

chuyu-c/NLP-with-Reddit-Comment

This project focuses on the use of big data platforms, specifically Spark (PySpark, SparkML, Spark NLP). We use the text of each comment a user posted to categorize its sentiment and predict its score. Our objective is to understand the dynamics of the Reddit online community and how the way people communicate online leads to different reactions from the community.

Language: Jupyter Notebook - Size: 20.2 MB - Last synced at: about 1 year ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

saadkh1/Bert_Spark_Example

This repository provides examples of using pre-trained BERT models from SparkNLP with PySpark for Natural Language Processing tasks.

Language: Jupyter Notebook - Size: 290 KB - Last synced at: 5 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

AjaySurya-018/Emotion_Detection_in-text

Web application to detect emotion in text

Language: Jupyter Notebook - Size: 1.61 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

databricks-industry-solutions/ocr-phi-masking

Our joint Solution Accelerator with John Snow Labs automates the detection of sensitive information contained within unstructured data using NLP models for healthcare. Extracted data is stored within the Lakehouse, where teams can use the pre-trained models to easily remove, obfuscate or mask data for downstream analytics at massive scale.

Language: Python - Size: 77.1 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 8 - Forks: 3

doshiharmish/PoliticalIdeologiesPredictioninNewsArticles

Media diversity shapes perspectives, yet biased news distorts reality, fostering misinformation. 'Political Ideologies Prediction in News Articles' aims to forecast bias using PySpark, NLP, and ML for adaptable, swift inference. Integrated with the NYT API, it predicts bias in top political articles, fostering better understanding of subjective content.

Language: Jupyter Notebook - Size: 15.7 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

sdarjunwadkar/Political-Idealogies-Prediction-in-News-Articles

Media diversity shapes perspectives, yet biased news distorts reality, fostering misinformation. 'Political Ideologies Prediction in News Articles' aims to forecast bias using PySpark, NLP, and ML for adaptable, swift inference. Integrated with the NYT API, it predicts bias in top political articles, fostering better understanding of subjective content.

Language: Jupyter Notebook - Size: 16.2 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

databricks-industry-solutions/toxicity-detection-in-gaming

Build a lakehouse for all your gamer data and use natural language processing techniques to flag questionable comments for moderation.

Language: Python - Size: 125 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 5 - Forks: 0

databricks-industry-solutions/oncology

Generate oncology insights from real-world data using NLP. Once extracted, oncology data is enriched with useful information like ICD-10 codes and used to build powerful visualizations.

Language: Python - Size: 265 KB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 3 - Forks: 2

databricks-industry-solutions/jsl-medical-risk-factors

Automated Extraction of Medical Risk Factors For Life Insurance Underwriting

Language: Python - Size: 49.8 KB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 1

databricks-industry-solutions/jsl-financial-nlp

Drawing a Company Ecosystem Graph

Language: Python - Size: 346 KB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

databricks-industry-solutions/medicare-risk-adjustment

Databricks and John Snow Labs Solution Accelerator for Medicare Risk Adjustment automates the extraction of undiagnosed member conditions from unstructured clinical notes with NLP models, improving downstream reimbursements.

Language: Python - Size: 81.1 KB - Last synced at: almost 2 years ago - Pushed at: about 2 years ago - Stars: 5 - Forks: 5

databricks-industry-solutions/adverse-drug-events

To ensure ongoing drug safety, pharma companies need to monitor and report adverse drug events post-market launch. This accelerator extracts, processes and analyzes adverse drug events from real-world text data using NLP.

Language: Python - Size: 105 KB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 4 - Forks: 4

cmsptcp/tsmp

Twitter-based stock market prediction using PySpark, project for Big Data PW 2020L

Language: Jupyter Notebook - Size: 2.43 MB - Last synced at: 2 months ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 1

VirtualRoyalty/spark-nlp-project

Micro project on big data technologies via Spark

Language: Jupyter Notebook - Size: 5.12 MB - Last synced at: over 2 years ago - Pushed at: over 3 years ago - Stars: 4 - Forks: 0

prabhupavitra/Text-Summarization-PySpark

Text summarization algorithms using PySpark

Language: Jupyter Notebook - Size: 3.39 MB - Last synced at: 4 months ago - Pushed at: about 4 years ago - Stars: 0 - Forks: 0

gympohnpimol/Spark

Language: Python - Size: 13.7 KB - Last synced at: over 1 year ago - Pushed at: almost 5 years ago - Stars: 0 - Forks: 0