An open API service providing repository metadata for many open source software ecosystems.

Topic: "stopwords-removal"

nishi1612/Email-Spam-Classification-using-SVM

The uploaded code classifies emails as spam or non-spam using a Support Vector Machine classifier.

Language: Python - Size: 463 KB - Last synced at: 4 months ago - Pushed at: about 5 years ago - Stars: 31 - Forks: 11

eklem/stopword-trainer

A module for creating stopword lists for any language, based on a set of documents.

Language: JavaScript - Size: 6.16 MB - Last synced at: 10 days ago - Pushed at: 10 months ago - Stars: 15 - Forks: 0

kennedyCzar/NLP-PROJECT-BOOK-INSIGHTS-WITH-PLOTLY

Plotly-Dash NLP project. Measures document similarity using Latent Dirichlet Allocation and principal component analysis, followed by KMeans clustering. The project includes dynamic visual interaction.

Language: Python - Size: 171 MB - Last synced at: almost 2 years ago - Pushed at: almost 3 years ago - Stars: 10 - Forks: 5

juanantoniodelgado/StopWords

PHP StopWords removal library with support for multiple languages.

Language: PHP - Size: 142 KB - Last synced at: 2 months ago - Pushed at: 11 months ago - Stars: 9 - Forks: 8

Foysal87/bn_nlp

Bangla NLP toolkit.

Language: Python - Size: 6.84 MB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 7 - Forks: 9

JoyeBright/stopwords_guilannlp

A python package to be used in removing stopwords in different languages.

Language: Python - Size: 89.8 KB - Last synced at: 11 months ago - Pushed at: over 2 years ago - Stars: 5 - Forks: 1

bryanchw/Traditional-Chinese-Stopwords-and-Punctuations-Library

A Python library specifically for removing Traditional Chinese stopwords and punctuation.

Language: Python - Size: 43.9 KB - Last synced at: about 2 years ago - Pushed at: about 3 years ago - Stars: 2 - Forks: 1

Vishal1999-33/Online-News-Popularity

Crawls news and information websites and predicts the likelihood of their content going viral.

Language: Jupyter Notebook - Size: 8.49 MB - Last synced at: 6 months ago - Pushed at: about 5 years ago - Stars: 2 - Forks: 1

share424/Android-Sastrawi

Android Sastrawi is a Natural Language Processing Toolkit for Bahasa Indonesia

Language: Kotlin - Size: 362 KB - Last synced at: over 2 years ago - Pushed at: over 5 years ago - Stars: 2 - Forks: 0

Isa1asN/plagiarism-detector

Plagiarism detection for Amharic language text

Language: Jupyter Notebook - Size: 635 KB - Last synced at: 4 months ago - Pushed at: 6 months ago - Stars: 1 - Forks: 0

aarryasutar/Hate_Speech_Detection

This project aims to detect hate speech on Twitter using advanced NLP and machine learning techniques, exploring feature extraction methods like TF-IDF and sentiment analysis, and evaluating models such as Logistic Regression and SVM.

Language: Jupyter Notebook - Size: 1.83 MB - Last synced at: about 2 months ago - Pushed at: 12 months ago - Stars: 1 - Forks: 0

pictureinthenoise/gotstopwords

Python package that makes it easy to use stop words lists in Python projects.

Language: Python - Size: 299 KB - Last synced at: 17 days ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

kevinmastascusa/CORD_19_Research

"KZM COVID Informatics: A repository for data analysis and insight extraction from the CORD-19 dataset, focused on advancing our understanding of the COVID-19 pandemic."

Language: Jupyter Notebook - Size: 88.2 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0

Naren-7701/DEVICE-RECOMMENDATION-SYSTEM

Device recommendation system using cosine similarity. It recommends electronic gadgets based on similar configurations.

Language: Jupyter Notebook - Size: 38.1 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 1 - Forks: 0

chiaszu/youtube-comments-sentiment-analysis

YouTube Comments as a Corpus of Sentiment Analysis is the final project of DFLL672 Corpus Linguistics.

Language: Jupyter Notebook - Size: 860 KB - Last synced at: over 2 years ago - Pushed at: about 3 years ago - Stars: 1 - Forks: 0

antononcube/Raku-Lingua-StopwordsISO

Raku package for stop words of different languages and stop words deletion. Provides corresponding CLI scripts.

Language: Raku - Size: 94.7 KB - Last synced at: 29 days ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 1

elmurod1202/survey-clustering

K-means clustering of texts (survey answers) using word-embeddings, finding optimal elbow-point, and averaging multiple-word expressions.

Language: Python - Size: 1.23 MB - Last synced at: over 2 years ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 0

ananyaroy1011/Fake-News-Classification

Given the title of a fake news article A and the title of an incoming news article B, the program classifies B as agree, disagree, or unrelated.

Language: HTML - Size: 410 KB - Last synced at: over 2 years ago - Pushed at: about 4 years ago - Stars: 1 - Forks: 0

okzapradhana/stopword-analysis

Stopword Analysis on Text Mining - With dataset from Kaggle: https://www.kaggle.com/nltkdata/web-text-corpus

Language: Jupyter Notebook - Size: 1.71 MB - Last synced at: over 2 years ago - Pushed at: about 6 years ago - Stars: 1 - Forks: 0

wahki/email-spam-classifier

📩 Email spam classifier using Multinomial NB and a TF-IDF vectorizer, with a modern Streamlit UI.

Language: Jupyter Notebook - Size: 2.51 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 0 - Forks: 0

SridharYadav07/AI--Powered-Task-Management-System

An intelligent Task Management System that integrates Sentiment Analysis, Task Optimization, and Forecasting to streamline project and task handling. This AI-powered tool is designed to assist teams and project managers in making data-driven decisions by understanding emotional context, forecasting productivity, and optimizing workload distribution

Language: Jupyter Notebook - Size: 62.5 KB - Last synced at: about 1 month ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

udaykiran9392/fakenews_detection_using_ML

Implemented a machine learning model to detect fake news using Natural Language Processing techniques like TF-IDF and stemming. Trained multiple classifiers including Logistic Regression and PassiveAggressiveClassifier for accurate classification. This project showcases practical NLP skills for tackling misinformation in media.

Language: Jupyter Notebook - Size: 10.6 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

Harshit2807161/SIH-2022-DR710-PROSOL (fork of Ashutosh-Kumar-Singh-IIT-Patna/SIH-2022-DR710-PROSOL)

Smart India Hackathon 2022

Language: CSS - Size: 61.7 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

abeed04/Sentiment-Analysis-using-Recurrent-Neural-Networks

Bidirectional RNNs are used to analyze the sentiment (positive, negative, neutral) of movie reviews.

Language: Jupyter Notebook - Size: 24.8 MB - Last synced at: 5 months ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

faustinalazarus/Data-Preprocessing-Application

Data Pre-processing Application is a simple UI that automates repetitive tasks while ensuring consistency and efficiency in NLP data preprocessing.

Language: Python - Size: 11 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

Aalaa4444/Text_Processing-and-Unique_Word_Extraction_fromHTML

Extract text content from an HTML page, process it, and extract unique words from the processed text. This notebook utilizes various text processing techniques including cleaning, normalization, tokenization, lemmatization or stemming, and stop words removal.

Language: Jupyter Notebook - Size: 12.7 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

JanviSK/SMS-Spam-Detection-using-NLP

This project implements NLP and Classification models for Spam SMS detection

Language: Jupyter Notebook - Size: 242 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

Blitz464/Call-Transcript-classification

This was a hackathon project that I worked on for BestBuy around classifying the call transcripts using ML & NLP techniques

Language: Jupyter Notebook - Size: 10.7 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

AqilaFadia/app-Restaurant-Review

This application serves as a powerful tool for categorizing restaurant reviews as either negative or positive. Its primary purpose is to provide restaurateurs and managers with an efficient means of evaluating customer feedback. By distinguishing between negative and positive comments, this app aims to enhance the quality of service.

Language: Jupyter Notebook - Size: 413 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

D4rkisek/Sentiment_Classification_NLP

NLP methods for distinguishing positive and negative reviews written about movies.

Language: Jupyter Notebook - Size: 0 Bytes - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

UtkarshTiwari123/Information-Retrieval-System

The aim of the code is to present a solution for retrieving specific passages or paragraphs from documents along with the document names based on user queries.

Language: Jupyter Notebook - Size: 659 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

Shakiba-Alipour/Information-Retrieval-on-CISI

Implementation and evaluation of an information retrieval system.

Language: Jupyter Notebook - Size: 1.8 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

Balajirvp/Sentiment-Analysis-of-Movie-Reviews

Performed Sentiment Analysis of Movie reviews using Bag of Words and TF-IDF Vectorizers.

Language: Jupyter Notebook - Size: 501 KB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

Kheem-Dh/Text-Data-Preprocessing-

Text Data Preprocessing

Language: Jupyter Notebook - Size: 15.6 KB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

manasik29/Named-Entity-Recognition-Emotion-Mining-on-Apple-reviews

Named Entity recognition and emotion mining on Apple Macbook reviews.

Language: Jupyter Notebook - Size: 3.04 MB - Last synced at: over 2 years ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

manasik29/Sentiment_Analysis_on_Elon_Musk_Tweets

Performed sentiment analysis on Elon Musk's tweets, extracting positive or negative sentiment.

Language: Jupyter Notebook - Size: 442 KB - Last synced at: over 2 years ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 1

atul04/TopicClassificationChallenge

Given long English text passages, the task is to assign an appropriate topic to each passage. After cleaning the dataset, features were learned using a TF-IDF approach, and Linear SVC was used to produce the final prediction.

Language: Python - Size: 15.8 MB - Last synced at: over 2 years ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

irfandythalib/python-indonesia-stopwords-remover

This code removes stopwords using the Tala stopwords library for Indonesian. Very useful for text processing.

Language: Jupyter Notebook - Size: 6.84 KB - Last synced at: over 2 years ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 1

meng-ucalgary/ensf-612-assignment-2

An assignment on preprocessing of text including tokenization, stop word removal, noise reduction, and stemming

Language: HTML - Size: 84.3 MB - Last synced at: over 2 years ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

meng-ucalgary/ensf-612-assignment-1

An assignment on preprocessing of text including tokenization, stop word removal

Language: HTML - Size: 84.5 MB - Last synced at: over 2 years ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

devshashwat/Tweets-Vector-Space-Model

Using Vector Space Model in Simple Tweets Database with Custom Test Cases for COVID-19 related Misinformation Data.

Language: Java - Size: 4.56 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

chandnii7/Fake-News-Classification

Given the title of a fake news article A and the title of an incoming news article B, the program classifies B as agree, disagree, or unrelated.

Language: HTML - Size: 31.3 MB - Last synced at: about 1 month ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0

mesc08/movie-reviews-sentiment-analysis

A machine learning project implementation on a real-life example.

Language: Jupyter Notebook - Size: 58.6 KB - Last synced at: about 2 years ago - Pushed at: almost 5 years ago - Stars: 0 - Forks: 0

Rajdeep2121/NLP-Fundamentals

Basics of Natural Language Processing

Language: Python - Size: 2.79 MB - Last synced at: over 2 years ago - Pushed at: about 5 years ago - Stars: 0 - Forks: 0

djuliar/ir_stemming

A simple application for tokenization, stopword removal, and stemming in information retrieval, built with CodeIgniter.

Language: PHP - Size: 1.91 MB - Last synced at: over 2 years ago - Pushed at: over 5 years ago - Stars: 0 - Forks: 1

SeanFlannery/NAR-Data-Discovery

Nucleic Acids Research Data Discovery

Language: Jupyter Notebook - Size: 10.8 MB - Last synced at: almost 2 years ago - Pushed at: over 5 years ago - Stars: 0 - Forks: 0

ctomtom/Word.cloud

Language: Python - Size: 345 KB - Last synced at: over 1 year ago - Pushed at: almost 6 years ago - Stars: 0 - Forks: 0

anuragkr29/TweetAnalysis

Work with a set of Tweets about US airlines and examine their sentiment polarity. The aim is to learn to classify Tweets as either “positive”, “neutral”, or “negative” by using two classifiers and pipelines for pre-processing and model building.

Language: Scala - Size: 4.88 KB - Last synced at: almost 2 years ago - Pushed at: almost 6 years ago - Stars: 0 - Forks: 0

Euno257/Sentimental-Analysis-on-Twitter-US-Airline-dataset

Language: Jupyter Notebook - Size: 1.84 MB - Last synced at: over 2 years ago - Pushed at: over 6 years ago - Stars: 0 - Forks: 0

austinjoyal/spark-remove-stopwords

Prints contents of file after filtering out stopwords.

Language: Python - Size: 0 Bytes - Last synced at: 3 days ago - Pushed at: over 6 years ago - Stars: 0 - Forks: 0

TawfiqAbuArrh/SearchEngine

An implementation of a search engine.

Language: Java - Size: 18.6 KB - Last synced at: almost 2 years ago - Pushed at: over 7 years ago - Stars: 0 - Forks: 0

mattlyons0/Blackboard-Test-Grader

A Chrome Extension which enables automatic grading (currently using Porter's Stemmer and Stopword Removal) of (short) Free Response questions in Blackboard.

Language: JavaScript - Size: 1.25 MB - Last synced at: almost 2 years ago - Pushed at: almost 8 years ago - Stars: 0 - Forks: 0

yeshamittal/sentimentAnalysis

Language: Python - Size: 27.3 KB - Last synced at: over 1 year ago - Pushed at: about 8 years ago - Stars: 0 - Forks: 0

mraduldubey/bostonbombing

Getting started with Twitter data analysis.

Language: Jupyter Notebook - Size: 9.55 MB - Last synced at: over 2 years ago - Pushed at: about 8 years ago - Stars: 0 - Forks: 1

Related Topics
nlp (18), stemming (13), tokenization (11), nltk (10), lemmatization (9), stopwords (9), python (9), jupyter-notebook (8), nlp-machine-learning (6), tokenizer (6), bag-of-words (6), logistic-regression (5), natural-language-processing (5), naive-bayes-classifier (4), tf-idf (4), cosine-similarity (4), stemmer (4), regular-expression (4), preprocessing (4), gensim (4), sentiment-analysis (4), beautifulsoup (4), scikit-learn (4), information-retrieval (4), text-processing (3), dataset (3), rdd (3), pyspark (3), doc2vec (3), numpy (3), tf-idf-vectorizer (3), tfidf-vectorizer (3), naive-bayes (3), python3 (3), mlp-classifier (3), text-classification (3), machine-learning (2), nltk-python (2), svm (2), databricks (2), noise-reduction (2), multinomial-naive-bayes (2), flask (2), countvectorizer (2), udf (2), word-embeddings (2), binary-classifier (2), similarity-measures (2), tweets (2), java (2), multinomial-logistic-regression (2), wordcloud (2), multi-layer-perceptron (2), covid19-data (2), lemmetization (2), fake-news-classification (2), spark (2), text-mining (2), seaborn (2), porter-stemmer (2), subjectivity (2), pandas (2), pandas-dataframe (2), data-science (2), vectorization (2), sentiment-classification (2), support-vector-machines (2), machine-learning-algorithms (2), random-forest (2), text (2), text-tokenization (1), plotly-dash (1), plotly-python (1), silhouette (1), tfidf-text-analysis (1), textvectorization (1), text-normalization (1), passive-aggressive-classifier (1), matplotib (1), text-lemmatization (1), googlecolab (1), text-extraction (1), spacy-nlp (1), text-cleaning (1), requests (1), unsupervised-machine-learning (1), data-extraction (1), extract-html (1), streamlit (1), sentence-embeddings (1), sentence-similarity (1), sentence-tokenizer (1), sorting (1), csv (1), recoomendation-system (1), randomforestregressor (1), callbacks (1), clustering-algorithm (1), corpus-processing (1), dash (1)