An open API service providing repository metadata for many open source software ecosystems.

Topic: "tfidf-text-analysis"

zayedrais/DocumentSearchEngine

Document Search Engine project with TF-IDF and Google universal sentence encoder model

Language: Jupyter Notebook - Size: 28.6 MB - Last synced at: 20 days ago - Pushed at: almost 2 years ago - Stars: 53 - Forks: 24

zjohn77/retrieval

Tunable full-text search engine in JavaScript that: (1) works natively on web apps like Express.js; (2) is easy to customize (via BM25) for specific types of documents (e.g. tweets, scientific journals); (3) is deployable on either the client side or the server side.

Language: JavaScript - Size: 8.16 MB - Last synced at: 18 days ago - Pushed at: about 6 years ago - Stars: 34 - Forks: 9
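
The BM25 customization mentioned in this entry usually comes down to two parameters. The following is a minimal sketch of standard Okapi BM25 scoring, written in Python for illustration; it is not the retrieval package's API, and the function name `bm25_scores` and the defaults `k1=1.5`, `b=0.75` are assumptions chosen for the example.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document against the query terms with Okapi BM25.

    `query` is a list of terms and `docs` is a non-empty list of token lists.
    The name and defaults are illustrative, not taken from any listed project.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Smoothed IDF; the "1 +" inside the log keeps it non-negative.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # k1 controls term-frequency saturation; b controls how strongly
            # long documents are penalized relative to the average length.
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores
```

Under these assumptions, a lower `b` weakens length normalization, which may suit uniformly short documents such as tweets, while a higher `b` penalizes long documents such as journal articles more strongly.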

CSQianDong/ArticleChecking

Small program for checking text duplication

Language: Python - Size: 541 KB - Last synced at: about 2 months ago - Pushed at: over 6 years ago - Stars: 15 - Forks: 6

howardvickers/resume-match

Web app to match a resume to a job type, using an NLP SVM classifier model. Data obtained via web scraping. The uploaded resume is converted from PDF to text using OCR.

Language: HTML - Size: 3.99 MB - Last synced at: almost 2 years ago - Pushed at: almost 7 years ago - Stars: 14 - Forks: 13

samujjwaal/Cranfield-Vector-Space-Model

Implementation of a Vector Space Retrieval Model using TF-IDF and cosine similarity on the Cranfield document corpus

Language: Jupyter Notebook - Size: 2.17 MB - Last synced at: about 2 years ago - Pushed at: about 5 years ago - Stars: 7 - Forks: 8
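
The vector space approach named in this entry can be sketched in a few lines of plain Python. This is an illustrative sketch and not the repository's code: the function names are hypothetical, tokenization is a simple lowercase whitespace split, and the query is added to the corpus when computing IDF to keep the example short.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        # Weight = relative term frequency * log inverse document frequency.
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank(query, docs):
    """Return document indices ordered by similarity to the query, best first."""
    tokenized = [d.lower().split() for d in docs]
    # Simplification: the query is treated as one more document for IDF.
    vectors = tfidf_vectors(tokenized + [query.lower().split()])
    query_vec = vectors.pop()
    scores = [cosine(query_vec, v) for v in vectors]
    return sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
```

Calling `rank("heat transfer", corpus)` on a list of document strings returns the indices of those documents from most to least similar; a real system would likely add stemming, stop-word removal, and an IDF computed over the document collection only.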

chiraag-kakar/FUND

An NLP model to detect fake news and accurately classify a piece of news as REAL or FAKE, trained on a dataset provided by Kaggle.

Language: Jupyter Notebook - Size: 11.1 MB - Last synced at: 27 days ago - Pushed at: over 4 years ago - Stars: 5 - Forks: 0

sid-thiru/Text-Classification-with-TFIDF-and-sklearn

Language: Python - Size: 2.93 KB - Last synced at: about 2 years ago - Pushed at: almost 6 years ago - Stars: 4 - Forks: 2

VipinJain1/VIP-PCA_tSNE

Language: Jupyter Notebook - Size: 9.99 MB - Last synced at: over 1 year ago - Pushed at: almost 6 years ago - Stars: 4 - Forks: 1

mutalibcs/Twitter-Sentiment-Analysis

Twitter Sentiment Analysis

Language: Jupyter Notebook - Size: 37.2 MB - Last synced at: 4 months ago - Pushed at: 6 months ago - Stars: 3 - Forks: 0

sherincheah/amz-ecom-recommender

E-Commerce Recommendation System

Language: Jupyter Notebook - Size: 13.3 MB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 3 - Forks: 4

himalayan-sanjeev/Nepali_Text_Summarization_Extractive

Extractive Text Summarizer, based on tf-idf text representation (an example)

Language: Jupyter Notebook - Size: 21.5 KB - Last synced at: about 2 years ago - Pushed at: over 5 years ago - Stars: 3 - Forks: 3

yansun1996/CSE258-Recommender-Systems

Code for UCSD CSE 258 Web Mining and Recommender Systems

Language: Jupyter Notebook - Size: 20.3 MB - Last synced at: almost 2 years ago - Pushed at: about 7 years ago - Stars: 3 - Forks: 3

hanifhefaz/elm-tf-idf

Elm implementation of Term Frequency-Inverse Document Frequency (TF-IDF) for text analysis

Language: Elm - Size: 5.86 KB - Last synced at: 20 days ago - Pushed at: 3 months ago - Stars: 2 - Forks: 0

N-Elmer/CHAT-SUMMARIZER

CHAT 🗣️ SUMMARY 📈 ANALYSIS

Language: Jupyter Notebook - Size: 42.1 MB - Last synced at: 26 days ago - Pushed at: 4 months ago - Stars: 2 - Forks: 0

bysiber/text_similarity_tfidf

Measures the similarity between text documents using the TF-IDF (Term Frequency-Inverse Document Frequency) algorithm.

Language: Python - Size: 4.88 KB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 2 - Forks: 0

k-loki/My-NLP-adventures

Check out my adventures into NLP here.

Language: Jupyter Notebook - Size: 1.3 MB - Last synced at: over 1 year ago - Pushed at: almost 3 years ago - Stars: 2 - Forks: 0

rishabh-karmakar/Detection-of-Real-or-Fake-News

Detect Real or Fake News. To build a model to accurately classify a piece of news as REAL or FAKE. Using sklearn, build a TfidfVectorizer on the provided dataset. Then, initialize a PassiveAggressive Classifier and fit the model. In the end, the accuracy score and the confusion matrix tell us how well our model fares.

Language: Jupyter Notebook - Size: 11.7 MB - Last synced at: 3 months ago - Pushed at: almost 5 years ago - Stars: 2 - Forks: 0

gabrielpreda/Support-Tickets-Classification Fork of imironica/Support-Tickets-Classification

This case study shows how to create a model for text analysis and classification and deploy it as a web service in Azure cloud in order to automatically classify support tickets. This project is a proof of concept made by Microsoft (Commercial Software Engineering team) in collaboration with Endava http://endava.com/en

Language: Python - Size: 5.62 MB - Last synced at: about 1 year ago - Pushed at: over 6 years ago - Stars: 2 - Forks: 5

boosuro/text_classification_using_naive_bayes_algorithm

Text classification using Naive Bayes Algorithm

Language: Jupyter Notebook - Size: 106 KB - Last synced at: about 2 years ago - Pushed at: over 6 years ago - Stars: 2 - Forks: 0

yxshee/toxic-terminator

A deep-learning-based toxic comment classification system that detects and classifies toxic text, promoting healthy conversation by discouraging negative or profane language in chats.

Language: Jupyter Notebook - Size: 8.16 MB - Last synced at: 24 days ago - Pushed at: 24 days ago - Stars: 1 - Forks: 0

DefinetlyNotAI/VulnScan_Data

Logicytics VulnScan Module's Training Data and old model archive

Language: HTML - Size: 810 MB - Last synced at: 19 days ago - Pushed at: 2 months ago - Stars: 1 - Forks: 0

GerindT/webScraping

User-Driven Product Analysis with Web Scraping & Multi-modal NLP: Sentiment Analysis, Feature Extraction, and Recommendation using Amazon Reviews

Language: JavaScript - Size: 411 KB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 1 - Forks: 0

Akhand-Pratap-Tiwari/Automatic-Extractive-Text-Summarization-using-TF-IDF

Text Summarization using TF-IDF technique in Python.

Language: Jupyter Notebook - Size: 7.81 KB - Last synced at: about 1 month ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

UtsavMandal2022/CodeQuery

This webpage finds a desired competitive programming (cp) question from LeetCode using the provided keywords. The backend is written in Flask and Python. Uses the TF-IDF algorithm.

Language: Python - Size: 825 KB - Last synced at: about 1 month ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0

YuanzhanGao/EIDL_PPP_Fuzzy_Matching

The repository is a duplicate of the local folder containing code created by Yuanzhan Gao ([email protected]) to conduct a scaled fuzzy matching procedure on the EIDL and PPP datasets. Please see the README file for more information.

Language: Python - Size: 5.86 KB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0

Y1chenYao/thank-u-next

Final project for CS4300 Information Retrieval System

Language: HTML - Size: 29 MB - Last synced at: about 1 year ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0

Raksh710/Anime_Recommender_System

Recommends Anime using Content based filtering (using TFIDF vectorization and sigmoid kernel) and collaborative filtering (using KNN)

Language: HTML - Size: 60.5 MB - Last synced at: 2 months ago - Pushed at: about 2 years ago - Stars: 1 - Forks: 0

jsngn/amazon-awesomeness-predictor

Ratings Predictor: Predict whether an Amazon product is highly rated

Language: Python - Size: 45.3 MB - Last synced at: 9 months ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 0

samujjwaal/uic-search-engine

Web search engine to retrieve most relevant web-pages for user search query from web-pages crawled on the UIC domain

Language: Jupyter Notebook - Size: 13.2 MB - Last synced at: about 2 years ago - Pushed at: almost 4 years ago - Stars: 1 - Forks: 0

Remydeme/Descarte

NLP project in which I analyse NLP models and start working on the interpretation of ML predictions.

Language: Jupyter Notebook - Size: 260 KB - Last synced at: 2 days ago - Pushed at: almost 5 years ago - Stars: 1 - Forks: 0

ylochman/NLP

NLP coursework | Applied Sciences Faculty, UCU, Lviv (2019)

Language: Jupyter Notebook - Size: 10.7 MB - Last synced at: about 2 years ago - Pushed at: over 5 years ago - Stars: 1 - Forks: 0

pranaya-mathur/Sentiment-Classification

Implemented Machine Learning Models on Amazon Fine Food Reviews Data Set

Language: Jupyter Notebook - Size: 19.5 MB - Last synced at: over 1 year ago - Pushed at: almost 6 years ago - Stars: 1 - Forks: 0

vdhug/AnaliseDeSentimento

Repository with code related to undergraduate thesis (TCC) research on the performance of the Naive Bayes, logistic regression (RL), and SVM algorithms for review classification.

Language: Python - Size: 9.24 MB - Last synced at: about 2 years ago - Pushed at: about 7 years ago - Stars: 1 - Forks: 0

lovevolley/NLP-Practicals

This repository contains practical implementations of NLP concepts including dependency grammar, text processing, normalization, and TF-IDF models to demonstrate key techniques in natural language processing.

Language: Jupyter Notebook - Size: 498 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

wanroulin/1131_NCCU_TFIDFwithIG

General education course: A look at technology applications in language

Language: Jupyter Notebook - Size: 18.6 KB - Last synced at: 3 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

SKJNR/App-s-Review-Sentiment-Analysis

Perform Sentiment Analysis on App's Review Data

Language: Jupyter Notebook - Size: 2.07 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

zick97/TFIDF_FakeNews_detection

Exploring the effectiveness of TFIDF vectorization and Sentiment Analysis in fake news detection using various ML and visualization methods.

Language: Jupyter Notebook - Size: 49.6 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

dailyganzi/DailyGanzi-BE

Yesterday's news in one bite: backend for the news summarization service "일간지" (Ilganji)

Language: Python - Size: 144 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 1

AtharvaMutsaddi/RPPOOP-Project--TNC-Feature-Extractor-and-Transcript-Summarizer

Online Lecture Summarizer and Terms and Conditions feature Extractor, with text analytics. A project which demonstrates the applications of relatively new and upcoming fields of NLP- Video Transcript Summarizing and Feature Extraction. In the process, creating a very useful utility for students and for consumers/investors.

Language: HTML - Size: 668 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

rod-rom/kickstarter-status-prediction

Prediction machine learning model to determine whether a Kickstarter Project is successful or not.

Language: Jupyter Notebook - Size: 26.7 MB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

Okancan-Balci/IMDB_Spider-Man_Text_Analysis

I analyzed Spider-Man Movie reviews from IMDb. I employed basic NLP techniques like TF-IDF, Sentiment Analysis and Topic Modelling and I shared the results with solid visualizations. All done with R.

Language: RMarkdown - Size: 29.3 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

Ayushi22137/Personalized_Medicine-Redefining_Cancer_Treatment

Determining the class of cancer-causing mutations using text and genetic data

Size: 1.36 MB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

peijin0405/What-Are-The-Commonalities-of-Successful-Social-Enterprises

This project aims to answer the question of the common features of successful social enterprises by applying unsupervised learning on 5,210 B corporations impact data.

Language: Jupyter Notebook - Size: 5.31 MB - Last synced at: almost 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

OmerShubi/Reuters_1987_Classification

Reuters 1987 Corpus Topic classification

Language: Python - Size: 33.2 MB - Last synced at: 1 day ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 1

MANOJPATRA1991/spacy-text-classification

Text classification

Language: Jupyter Notebook - Size: 20.1 MB - Last synced at: 20 days ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 7

sharatsawhney/reader

A One-of-its kind Platform Offering E-books as a Rental Service integrated with their Digital Devices completely Redesigning the Reading Experience

Language: Python - Size: 256 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

StacyScudder/paranormal_playlist

This is a recommender system that lets you enter a paranormal romance book and get back a Spotify playlist of hair metal songs as a soundtrack for the book

Language: Jupyter Notebook - Size: 21.3 MB - Last synced at: about 1 year ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

sauhard2701/NLP-on-Amzon-Alexa-Reviews

This is a text and sentiment analysis model of amazon-alexa-reviews using NLP

Language: Jupyter Notebook - Size: 110 KB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

Dhrumil-Zion/Sentiments-Prediction-Using-NLP

Predicting customer sentiment from Amazon feedback. While exploring NLP and its fundamentals, I applied many data preprocessing techniques. In this repository, I implemented a bag-of-words model using the CountVectorizer class from sklearn and trained a LogisticRegression model on the resulting vectors, which gives approximately 93% accuracy. I identified the top 20 positive and negative feedback words from thousands of feedback entries. I then wrapped the whole process in a single function so that it can be reused generically with many machine learning algorithms. I also tested DummyClassifier, which gives an accuracy of around 84%. After that, I applied TF-IDF and combined it with LogisticRegression, which gives almost 93% accuracy but deeper insights. I also addressed the problem of imbalanced data using the RandomOverSampler class from the imblearn library.

Language: Jupyter Notebook - Size: 316 KB - Last synced at: over 1 year ago - Pushed at: almost 4 years ago - Stars: 0 - Forks: 0

ChiaraDiBonaventura/covid_opinion

Applying NLP to understand people's sentiment about Covid-19 and Government actions in Italy, conditional on their political affiliation.

Language: Jupyter Notebook - Size: 13.4 MB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0

makaravind/tfidf-ir-system

IR System - TF-IDF implementation to search for relevant COVID-19 clinical trials

Language: Python - Size: 2.45 MB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0

gbrsouza/TF-iDF

A Term Frequency-Inverse Document Frequency (TF-IDF) algorithm in Java using concurrent techniques

Language: Java - Size: 13.7 MB - Last synced at: almost 2 years ago - Pushed at: almost 6 years ago - Stars: 0 - Forks: 0

rajathpatel23/pre-processing_scripts

Python natural language pre-processing scripts

Language: Python - Size: 11.7 KB - Last synced at: about 2 years ago - Pushed at: almost 6 years ago - Stars: 0 - Forks: 0

Divya-Bhargavi/Kaggle_HomeDepot

Predict search relevance given a product name and its text attributes

Language: Jupyter Notebook - Size: 130 MB - Last synced at: over 1 year ago - Pushed at: about 6 years ago - Stars: 0 - Forks: 1

MayukhSobo/Inverted_Index

Use of inverted index to find similar documents in a data frame

Size: 0 Bytes - Last synced at: about 2 years ago - Pushed at: over 6 years ago - Stars: 0 - Forks: 0

aki83reo/Optimal-cluster-using-elbow-and-silhoutte-

Finding optimal clusters for text data using TF-IDF, silhouette, the elbow method, and k-means

Language: Python - Size: 6.84 KB - Last synced at: over 1 year ago - Pushed at: over 6 years ago - Stars: 0 - Forks: 0

pdubz-sudo/tfidf_text_analysis

Using tf_idf statistics to determine how important a word is to a document in a collection of documents

Language: R - Size: 731 KB - Last synced at: almost 2 years ago - Pushed at: over 7 years ago - Stars: 0 - Forks: 0

Related Topics
nlp 15 tfidf-vectorizer 13 python 13 machine-learning 12 tfidf 9 tf-idf 8 nlp-machine-learning 7 sklearn 7 sentiment-analysis 7 nltk 6 text-mining 5 data-science 5 natural-language-processing 5 information-retrieval 5 svm-classifier 4 text-analysis 4 logistic-regression 4 text-classification 4 cosine-similarity 3 beautifulsoup4 3 vectorization 3 vector-space-model 3 python3 3 bag-of-words 3 topic-modeling 3 pandas 3 flask 3 naive-bayes-classifier 3 search-engine 2 tokenization 2 sentiment-classification 2 porter-stemmer 2 confusion-matrix 2 pca-analysis 2 term-frequency 2 ai 2 artificial-intelligence 2 ml 2 tokenizer 2 tf-idf-vectorizer 2 passive-aggressive-classifier 2 eda 2 data 2 deep-learning 2 linear-regression 2 kaggle 2 streamlit 2 pca 2 numpy 2 tsne-algorithm 2 random-forest 2 latent-dirichlet-allocation 2 ensemble-learning 2 inverted-index 2 emotion-analysis 1 json-mining 1 emotion-detection 1 csv-generation 1 chart-plotting 1 extratreesclassifier 1 gbm 1 visualization 1 gradient-boosting 1 learningrates 1 nltk-library 1 numericratingpredictions 1 randomforestclassifier 1 keras 1 sentiment 1 interpretability 1 vsmtechniques 1 stopwords 1 vite 1 tailwindcss 1 react 1 puppeteer 1 nodejs 1 docker 1 silhouette 1 kmeans-clustering 1 elbow-method 1 neural-network 1 fake 1 word-vectors 1 matrix-factorization 1 document-embedding 1 data-visualization 1 data-preprocessing 1 concurrent-programming 1 classifier 1 adaboostclassifier 1 training-data 1 text-processing 1 sensitive-files 1 pytorch 1 models 1 logicytics 1 unsupervised-learning 1 unsupervised-clustering 1 social-enterprise 1