An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: lemmatization

degenNovice/corpus-tfidf-analyzer

A Python tool for text analysis using TF-IDF, lemmatization, stopword filtering, and frequency visualization.

Language: Python - Size: 16.6 KB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 0 - Forks: 0
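
As a rough illustration of the kind of pipeline this description names (not the repository's actual code), the sketch below computes TF-IDF weights over a toy corpus after stopword filtering. The stopword set, the `tokenize` helper, and the unsmoothed `tf * log(N / df)` weighting are assumptions chosen for brevity; lemmatization is left out.

```python
import math
import re
from collections import Counter

# Hypothetical, tiny stopword list; real tools ship much larger ones.
STOPWORDS = {"the", "a", "an", "is", "on", "and", "of"}

def tokenize(text):
    """Lowercase, keep alphabetic runs, drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]

def tf_idf(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid dividing by zero on empty documents
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

corpus = ["The cat sat on the mat", "The dog sat on the log"]
scores = tf_idf(corpus)
```

With this weighting, a term that appears in every document (here "sat") gets weight zero, which is why practical implementations often add smoothing.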

adbar/simplemma

Simple multilingual lemmatizer for Python, especially useful for speed and efficiency

Language: Python - Size: 729 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 155 - Forks: 12

roshan-research/hazm

Persian NLP Toolkit

Language: Python - Size: 25.5 MB - Last synced at: about 23 hours ago - Pushed at: 10 months ago - Stars: 1,281 - Forks: 189

ewdlop/NLPNote

https://en.wikipedia.org/wiki/Natural_language_processing

Language: Jupyter Notebook - Size: 623 KB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 0 - Forks: 0

biolab/orange3-text

🍊 📄 Text Mining add-on for Orange3

Language: Python - Size: 46.5 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 132 - Forks: 86

soubankhandwani/ai-paper-evaluation-model

This project is an intelligent web application that compares student answers from scanned or typed PDFs against teacher-provided answer PDFs using NLP techniques and machine learning. It performs OCR, text extraction, preprocessing, and semantic similarity scoring to generate marks for each question.

Language: HTML - Size: 195 KB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 0 - Forks: 0

warda-tariqq/bartangi-lemmatizer-v2

Final version of Bartangi Lemmatizer and Word2Vec embeddings

Language: Jupyter Notebook - Size: 0 Bytes - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 0 - Forks: 0

thjbdvlt/spacy-viceverser

French lemmatization with hunspell and spaCy

Language: Python - Size: 2.05 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 0 - Forks: 0

oeuvres/alix

A Lucene Indexer for XML, with lexical analysis (lemmatization for French)

Language: Java - Size: 284 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 17 - Forks: 4

Kaiten-dev/quita_mini

Quita Mini is a text analysis tool designed to calculate various linguistic metrics from text data. It processes a collection of text files, computes statistics such as Type-Token Ratio (TTR), entropy, average token and type lengths, hapax legomena percentages, and more. The results are then saved in an Excel file for further analysis.

Language: Go - Size: 3.53 MB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 0 - Forks: 0
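
The metrics listed in this description can be sketched in a few lines. The example below is an illustration only, not Quita Mini's implementation (which is written in Go). The function name `lexical_metrics`, the choice to report hapax legomena as a percentage of types, and the assumption of a non-empty token list are all hypothetical.

```python
import math
from collections import Counter

def lexical_metrics(tokens):
    """Compute a few corpus statistics from a non-empty list of tokens."""
    counts = Counter(tokens)
    n_tokens = len(tokens)
    n_types = len(counts)
    # Shannon entropy (in bits) of the token frequency distribution.
    entropy = -sum((c / n_tokens) * math.log2(c / n_tokens) for c in counts.values())
    # Hapax legomena: types that occur exactly once.
    hapaxes = sum(1 for c in counts.values() if c == 1)
    return {
        "ttr": n_types / n_tokens,
        "entropy": entropy,
        "hapax_pct": 100 * hapaxes / n_types,  # assumption: relative to types
        "avg_token_len": sum(len(t) for t in tokens) / n_tokens,
        "avg_type_len": sum(len(t) for t in counts) / n_types,
    }

metrics = lexical_metrics("a rose is a rose is a rose".split())
```

For this eight-token sample there are three types, so the Type-Token Ratio is 3/8, and no type occurs only once.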

thjbdvlt/solipCysme

spaCy pipeline for French, focused on personal pronouns, fiction, and first-person point-of-view texts.

Language: Python - Size: 1.64 MB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 1 - Forks: 0

ssrishtix/IMDB-Sentiment

A comparative case study on stemming vs lemmatization using IMDb movie reviews, focusing on NLP preprocessing and vocabulary analysis.

Language: Jupyter Notebook - Size: 0 Bytes - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 0 - Forks: 0

adobe/NLP-Cube

Natural Language Processing Pipeline - Sentence Splitting, Tokenization, Lemmatization, Part-of-speech Tagging and Dependency Parsing

Language: HTML - Size: 11.1 MB - Last synced at: 14 days ago - Pushed at: 6 months ago - Stars: 558 - Forks: 94

katerinaharana/chatbot

WIP: Building the Cornerstone of a Chatbot: Creating a Clustering-Based Intent Identification Engine

Language: Jupyter Notebook - Size: 178 KB - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 0 - Forks: 0

thjbdvlt/spell-fr.vim

French spellcheck files for hunspell and Vim

Language: Python - Size: 5.22 MB - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 6 - Forks: 0

aajanki/finnish-pos-accuracy

Evaluating accuracy of Finnish part-of-speech taggers

Language: Python - Size: 711 KB - Last synced at: 16 days ago - Pushed at: almost 2 years ago - Stars: 8 - Forks: 0

explosion/spacy-lookups-data

📂 Additional lookup tables and data resources for spaCy

Language: Python - Size: 144 MB - Last synced at: 4 days ago - Pushed at: 3 months ago - Stars: 105 - Forks: 53

tensordot/syntaxdot

Neural syntax annotator, supporting sequence labeling, lemmatization, and dependency parsing.

Language: Rust - Size: 1010 KB - Last synced at: 13 days ago - Pushed at: over 1 year ago - Stars: 75 - Forks: 3

mawiesne/DE-Lemma

DE-Lemma: An OpenNLP lemmatizer tool and model files trained via German treebanks

Language: Java - Size: 982 KB - Last synced at: 20 days ago - Pushed at: 20 days ago - Stars: 1 - Forks: 2

dhyanid13/Helpify-LSTM-based-approach-for-classifying-mental-health-issues

Employing NLP techniques to classify mental health issues into particular categories

Language: Jupyter Notebook - Size: 5.8 MB - Last synced at: 20 days ago - Pushed at: 20 days ago - Stars: 0 - Forks: 0

drkameleon/lexm

A specification for representing dictionary-ready, lexical entries and their relationships

Language: Ruby - Size: 130 KB - Last synced at: 7 days ago - Pushed at: about 2 months ago - Stars: 3 - Forks: 0

bhuvan2018/news_article_classification

This HACKATHON project implements automated news article classification using machine learning and NLP techniques. Built with Flet for the UI, it processes & classifies text-based news content using methods like tokenization, lemmatization, vectorization and BERT-based embeddings.

Language: Jupyter Notebook - Size: 6.43 MB - Last synced at: 21 days ago - Pushed at: 21 days ago - Stars: 0 - Forks: 1

AbrSantiago/corpus-tfidf-analyzer

A Python tool for text analysis using TF-IDF, lemmatization, stopword filtering, and frequency visualization.

Language: Python - Size: 14.6 KB - Last synced at: 21 days ago - Pushed at: 21 days ago - Stars: 0 - Forks: 0

bjascob/LemmInflect

A python module for English lemmatization and inflection.

Language: Python - Size: 4.32 MB - Last synced at: 7 days ago - Pushed at: over 1 year ago - Stars: 268 - Forks: 25

ahmedsamir45/NLP-Project

NLP projects

Language: Jupyter Notebook - Size: 69.3 KB - Last synced at: 22 days ago - Pushed at: 22 days ago - Stars: 0 - Forks: 1

biblissima/collatinus Fork of PhVerkerk/Collatinus-11

Sources of Collatinus software - Latin lemmatizer, morphological analyzer and scansion

Language: JavaScript - Size: 41.1 MB - Last synced at: 23 days ago - Pushed at: 24 days ago - Stars: 73 - Forks: 15

bnosac/udpipe

R package for Tokenization, Parts of Speech Tagging, Lemmatization and Dependency Parsing Based on the UDPipe Natural Language Processing Toolkit

Language: C++ - Size: 5.74 MB - Last synced at: 4 days ago - Pushed at: about 2 years ago - Stars: 214 - Forks: 33

xga0/lightlemma

A lightweight, fast English lemmatizer

Language: Python - Size: 11.7 KB - Last synced at: 28 days ago - Pushed at: 28 days ago - Stars: 1 - Forks: 0

nlp-uoregon/trankit

Trankit is a Light-Weight Transformer-based Python Toolkit for Multilingual Natural Language Processing

Language: Python - Size: 1.06 MB - Last synced at: about 1 month ago - Pushed at: 7 months ago - Stars: 749 - Forks: 103

nlpub/pymystem3

A Python wrapper for the Yandex Mystem 3.1 morphological analyzer (http://api.yandex.ru/mystem). The original tool is shipped as a binary, and this library makes it easy to integrate into Python projects. Let us know in the issues if you would like to be involved in the development or maintenance of this project. If you have a fix or suggestion, please make a pull request. We are very open to contributions.

Language: Python - Size: 98.6 KB - Last synced at: 29 days ago - Pushed at: about 3 years ago - Stars: 295 - Forks: 43

huspacy/huspacy

HuSpaCy: industrial-strength Hungarian natural language processing

Language: Python - Size: 2.2 MB - Last synced at: 28 days ago - Pushed at: 6 months ago - Stars: 165 - Forks: 15

hipster-philology/pyrrha

A language-independent post-correction app for POS-tagging and lemmatization

Language: Python - Size: 26.2 MB - Last synced at: 29 days ago - Pushed at: 8 months ago - Stars: 28 - Forks: 16

obulat/zeyrek

Python morphological analyzer for Turkish language. Partial port of ZemberekNLP.

Language: Python - Size: 6.54 MB - Last synced at: 5 days ago - Pushed at: 10 months ago - Stars: 51 - Forks: 8

protoelicker/lexm

A specification for representing dictionary-ready, lexical entries and their relationships

Language: Ruby - Size: 117 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

true-real-michael/tg-message-search

Search your Telegram chat for message threads, all in your browser (+ Russian lemmatization)

Language: Rust - Size: 484 KB - Last synced at: 4 days ago - Pushed at: about 2 months ago - Stars: 2 - Forks: 0

thangtran3112/machine-learning

NLP, Neural networks, pytorch, tensorflow, AWS Sagemaker fine-tuning

Language: Jupyter Notebook - Size: 195 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

liuzl/ling

Natural Language Processing Toolkit in Golang

Language: Go - Size: 496 KB - Last synced at: about 1 month ago - Pushed at: about 5 years ago - Stars: 64 - Forks: 4

KrishArul26/Text-Classification-DBpedia-ontology-classes-Using-LSTM

Text classification is the task of assigning a set of predefined categories to free text. Text classifiers can be used to organize, structure, and categorize pretty much anything. For example, new articles can be organized by topics, support tickets can be organized by urgency, chat conversations can be organized by language, brand mentions can be organized by sentiment, and so on.

Language: Python - Size: 27.3 MB - Last synced at: about 1 month ago - Pushed at: over 3 years ago - Stars: 3 - Forks: 1

labrijisaad/Twitter-Sentiment-Analysis-with-Python

I aim in this project to analyze the sentiment of tweets provided from the Sentiment140 dataset by developing a machine learning sentiment analysis model involving the use of classifiers. The performance of these classifiers is then evaluated using accuracy and F1 scores.

Language: Jupyter Notebook - Size: 11.1 MB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 10 - Forks: 2

eellak/gsoc2018-spacy

[GSOC] Greek language support for spacy.io python NLP software

Language: Python - Size: 293 MB - Last synced at: 22 days ago - Pushed at: over 6 years ago - Stars: 101 - Forks: 10

rosette-api/python

Babel Street Analytics Client Library for Python

Language: Python - Size: 1.63 MB - Last synced at: 5 days ago - Pushed at: 2 months ago - Stars: 38 - Forks: 37

vhyza/elasticsearch-analysis-lemmagen

Elasticsearch lemmatizer for 15 languages

Language: Java - Size: 8.95 MB - Last synced at: about 1 month ago - Pushed at: 5 months ago - Stars: 105 - Forks: 28

asahala/BabyLemmatizer

State-of-the-art neural tagger and lemmatizer for ancient languages

Language: Python - Size: 4.16 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 12 - Forks: 1

maxB9F/COATL-LIGM

Web-based text aligner and comparator

Language: HTML - Size: 13.1 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 1 - Forks: 0

amir-zeldes/HebPipe

An NLP pipeline for Hebrew

Language: Lex - Size: 8.41 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 36 - Forks: 10

milaan9/Python_Natural_Language_Processing

This repository consists of a complete guide on natural language processing (NLP) in Python where we'll learn various techniques for implementing NLP including parsing & text processing and understand how to use NLP for text feature engineering.

Language: Jupyter Notebook - Size: 182 KB - Last synced at: about 1 month ago - Pushed at: almost 3 years ago - Stars: 198 - Forks: 174

anujvyas/Natural-Language-Processing-Projects

This repository consists of all my NLP Projects

Language: Jupyter Notebook - Size: 71.2 MB - Last synced at: about 1 month ago - Pushed at: over 4 years ago - Stars: 190 - Forks: 67

WZBSocialScienceCenter/germalemma

A lemmatizer for German language text

Language: Python - Size: 74.2 KB - Last synced at: 28 days ago - Pushed at: over 2 years ago - Stars: 88 - Forks: 11

reynoldsnlp/udar

UDAR Does Accented Russian: A finite-state morphological analyzer of Russian that handles stressed wordforms.

Language: Python - Size: 161 MB - Last synced at: 13 days ago - Pushed at: 8 months ago - Stars: 28 - Forks: 1

teakulo/Eventime-app

Eventime App is an event management platform using Angular, Spring Boot, Flask, and PostgreSQL. It offers AI-powered event recommendations, social features, and secure authentication. Users can manage events, chat with a chatbot, and view their calendar.

Language: Java - Size: 3.41 MB - Last synced at: about 1 month ago - Pushed at: 11 months ago - Stars: 2 - Forks: 0

TahirZia-1/NLP-TextClassify

A hands-on NLP project comparing classic ML models (Naïve Bayes, SVM, Logistic Regression) and ANNs for text classification using SMS Spam and 20 Newsgroups datasets.

Language: Jupyter Notebook - Size: 0 Bytes - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

nishi1612/Email-Spam-Classification-using-SVM

The uploaded code classifies emails into spam and non-spam classes using a Support Vector Machine classifier.

Language: Python - Size: 463 KB - Last synced at: about 1 month ago - Pushed at: almost 5 years ago - Stars: 31 - Forks: 11

michmech/lemmatization-lists

Machine-readable lists of lemma-token pairs in 23 languages.

Size: 21.5 MB - Last synced at: 2 months ago - Pushed at: over 3 years ago - Stars: 336 - Forks: 93
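
Lists of lemma-token pairs like these are typically used for dictionary-lookup lemmatization. The sketch below assumes one tab-separated `lemma<TAB>token` pair per line; that layout, the helper names `load_pairs` and `lemmatize`, and the inline sample data are assumptions for illustration and should be checked against the actual files.

```python
def load_pairs(lines):
    """Build a token -> lemma dict from 'lemma<TAB>token' lines (assumed layout)."""
    table = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        lemma, token = line.split("\t", 1)
        # Keep the first lemma seen when a token is ambiguous.
        table.setdefault(token.lower(), lemma)
    return table

def lemmatize(word, table):
    """Return the listed lemma, or the word itself when it is not in the table."""
    return table.get(word.lower(), word)

# Inline sample standing in for a real list file.
sample = ["go\twent", "go\tgoes", "mouse\tmice"]
table = load_pairs(sample)
lemmas = [lemmatize(w, table) for w in ["Went", "mice", "cheese"]]
```

A pure lookup cannot disambiguate by part of speech, so an ambiguous form simply gets whichever lemma was listed first.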

BigToothDev/pet-project-nlp

Natural language processing pet project. It includes data web scraping, lemmatizing, stemming, and working with related words (hyponyms, hypernyms, meronyms, holonyms). This specific code gathers all data from chosen pages of the Suspilne (Суспільне) webpage. Next, the data is manipulated and processed for future analysis

Language: Python - Size: 48.5 MB - Last synced at: 28 days ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0

janlukasschroeder/nlp-cheat-sheet-python

NLP Cheat Sheet, Python, spaCy, LexNLP, NLTK, tokenization, stemming, sentence detection, named entity recognition

Language: Jupyter Notebook - Size: 3.05 MB - Last synced at: 3 months ago - Pushed at: about 2 years ago - Stars: 229 - Forks: 67

joppuyo/relevanssi-finnish-base-forms

Relevanssi plugin to add Finnish base forms in search index

Language: PHP - Size: 1.69 MB - Last synced at: about 2 months ago - Pushed at: over 3 years ago - Stars: 3 - Forks: 1

04bhavyaa/movie-genre-prediction

This project implements preprocessing, feature engineering, and multiple machine learning models to build a robust genre classification system.

Language: Jupyter Notebook - Size: 27.4 MB - Last synced at: 2 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

acdh-oeaw/tokeneditor 📦

TokenEditor is a web application for manual annotation (or manual review of automatic annotations) of text. Albeit primarily aimed at reviewing PoS tags and lemmas, it is fully customizable, to support any annotation levels.

Language: JavaScript - Size: 1.27 MB - Last synced at: about 2 months ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

big-keva/libmorph

libmorph rus/ukr - fast & accurate morphological analyzer/analyses for Russian and Ukrainian

Language: HCL - Size: 39.8 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 26 - Forks: 4

CogComp/cogcomp-nlp

CogComp's Natural Language Processing Libraries and Demos: Modules include lemmatizer, ner, pos, prep-srl, quantifier, question type, relation-extraction, similarity, temporal normalizer, tokenizer, transliteration, verb-sense, and more.

Language: Java - Size: 85.5 MB - Last synced at: about 1 month ago - Pushed at: almost 2 years ago - Stars: 475 - Forks: 144

Qutuf/Qutuf

Qutuf (قُطُوْف): An Arabic Morphological analyzer and Part-Of-Speech tagger as an Expert System.

Language: Python - Size: 6.15 MB - Last synced at: 4 days ago - Pushed at: over 2 years ago - Stars: 132 - Forks: 17

impresso/impresso-linguistic-processing

Code for running spaCy on rebuilt impresso data.

Language: Python - Size: 374 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

CIRCSE/LEMLAT3

Morphological analyzer and lemmatizer for Latin.

Language: C - Size: 791 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 25 - Forks: 2

aryansk/Mobile-Product-Sentiment-Analyzer

A comprehensive sentiment analysis tool that analyzes mobile product reviews using Natural Language Processing (NLP) techniques and provides detailed visualizations of customer sentiment patterns.

Language: Jupyter Notebook - Size: 302 KB - Last synced at: 30 days ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0

cyberfantics/NaturalLanguageProcessing

A comprehensive repository for the Natural Language Processing course, featuring lecture notes, slides, and practical implementations of key NLP concepts using Python and popular libraries.

Language: Jupyter Notebook - Size: 1.2 MB - Last synced at: about 1 month ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

alex-rusakevich/lemmatizer-be

Lemmatizer for Belarusian language (based on bnkorpus.info)

Language: Python - Size: 8.07 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

chandkund/Sentiment-Analysis-Using-NLP

This project focuses on Sentiment Analysis using the textual content from product reviews. The goal is to analyze user sentiments based on their written feedback, particularly focusing on the "reviewText" column in the dataset.

Language: Jupyter Notebook - Size: 968 KB - Last synced at: about 1 month ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

bolner/rainbow-latin-generator

Generator for the Rainbow Latin Reader documents.

Language: C# - Size: 15 MB - Last synced at: 3 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

mit-ccc/TweebankNLP

[LREC 2022] An off-the-shelf pre-trained Tweet NLP Toolkit (NER, tokenization, lemmatization, POS tagging, dependency parsing) + Tweebank-NER dataset

Language: Python - Size: 16.8 MB - Last synced at: 14 minutes ago - Pushed at: over 1 year ago - Stars: 104 - Forks: 8

alhussain-shaikh/CodeCompass

CodeCompass caters to a diverse range of developers, from novices to seasoned contributors. Through surveys and analysis, we understand their programming languages, areas of interest, preferred complexities, and community engagement levels. This enables personalized recommendations for a rich contribution experience.

Language: JavaScript - Size: 23.2 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 3 - Forks: 1

rosette-api/nodejs

Babel Street Analytics Client Library for Node.js

Language: JavaScript - Size: 2.18 MB - Last synced at: 14 days ago - Pushed at: 4 months ago - Stars: 8 - Forks: 11

eyadrmsh/DB_lemmatization

Lemmatization of a Twitter database on an HPC system to predict survey responses based on the processed data.

Language: Python - Size: 458 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

winkjs/wink-lemmatizer

English lemmatizer

Language: JavaScript - Size: 1.81 MB - Last synced at: 19 days ago - Pushed at: almost 2 years ago - Stars: 66 - Forks: 6

trinker/textstem

Tools for fast text stemming & lemmatization

Language: R - Size: 178 KB - Last synced at: 26 days ago - Pushed at: almost 7 years ago - Stars: 45 - Forks: 8

R-Mahesh45/Text-Mining-Assignment

This project performs sentiment analysis on Elon Musk's tweets and emotion mining on product reviews from an e-commerce website. It involves data preprocessing techniques such as stemming, lemmatization, and removing stop words. The goal is to extract meaningful insights and classify text based on sentiment and emotion.

Language: Jupyter Notebook - Size: 810 KB - Last synced at: about 2 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

DeepakMishra99/Natural_Language_Processing_Practice

Natural Language Processing

Language: Jupyter Notebook - Size: 354 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 1 - Forks: 0

karan89200/NLP_Tasks

This repository is dedicated to providing comprehensive resources and code snippets for text preprocessing and various NLP tasks. Whether you're a beginner or an experienced data scientist, you'll find useful tools and techniques here to enhance your natural language processing projects.

Size: 0 Bytes - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

shaheennabi/Natural-Language-Processing-Practices-and-Mini-Projects

🎇 NLP Experiments 🎆 A hands-on collection of NLP experiments 💬, featuring models like RNN, LSTM, and Attention Mechanism. 🚀 Explore applications like text classification, sentiment analysis, and language generation 🌍. Continuously updated with new algorithms and research implementations! 🔥

Size: 8.79 KB - Last synced at: about 2 months ago - Pushed at: 6 months ago - Stars: 2 - Forks: 0

mim-solutions/mim_nlp

A Python package with ready-to-use models for various NLP tasks and text preprocessing utilities. The implementation allows fine-tuning.

Language: Jupyter Notebook - Size: 413 KB - Last synced at: 15 days ago - Pushed at: 10 months ago - Stars: 2 - Forks: 0

Riccorl/ipa

NLP Preprocessing Pipeline Wrappers

Language: Python - Size: 96.7 KB - Last synced at: 22 days ago - Pushed at: almost 2 years ago - Stars: 11 - Forks: 0

OmidGhadami95/metaphor-detection-cnn-lstm

Metaphor detection using cnn lstm

Language: Jupyter Notebook - Size: 160 KB - Last synced at: 3 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

mednour2019/taln-resume

Text Summarization Using Natural Language Processing (NLP)

Size: 374 KB - Last synced at: about 2 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

jonathanfox5/lemon_tizer

LemonTizer is a class that wraps the spacy library to build a lemmatizer for language learning applications.

Language: Python - Size: 26.4 KB - Last synced at: about 1 month ago - Pushed at: 5 months ago - Stars: 1 - Forks: 0

Yash22222/IBM-CSRBOX-Internship-Project

The objective of the Data Analytics internship at CSRBOX is to provide interns with hands-on experience in applying data analytics techniques to real-world projects in the field of corporate social responsibility (CSR). Interns will gain practical skills in data collection, cleaning, analysis, visualization, and reporting, while working on projects

Language: Jupyter Notebook - Size: 5.28 MB - Last synced at: 19 days ago - Pushed at: over 1 year ago - Stars: 8 - Forks: 6

FardinHash/Chatbot-Deep-Learning

This chatbot was built with a combination of deep learning, the Natural Language Toolkit (NLTK), and a PyTorch model, and reports high accuracy.

Language: Jupyter Notebook - Size: 43.9 KB - Last synced at: about 1 month ago - Pushed at: over 2 years ago - Stars: 5 - Forks: 2

rosette-api/curl-examples

cURL examples for Babel Street Analytics

Language: Shell - Size: 73.2 KB - Last synced at: 8 days ago - Pushed at: 6 months ago - Stars: 1 - Forks: 7

arya-io/NLP-Explorer

NLP Explorer is an interactive Streamlit app that lets users explore various NLP techniques like Tokenization, POS Tagging, Stemming, Lemmatization, and NER. It provides real-time analysis of text, making it a great tool for learning and experimenting with NLP concepts.

Language: Python - Size: 113 MB - Last synced at: 3 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

ILR-Stuttgart/old-french-lemmatization-tools

Scripts to enable lemmatization of Old French using a combination of the RNN Tagger autolemmas and a dictionary lookup

Language: Python - Size: 3.46 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

rosette-api/ruby-script

Contains Ruby scripts for accessing Babel Street Analytics

Size: 16.6 KB - Last synced at: 2 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 1

raghavendranhp/Dynamic-Hotel-Recommendation-System-Using-NLP

Developing a Python-based system for personalized hotel recommendations. The goal is to match user descriptions with hotel features, enhancing user satisfaction and decision-making in the hospitality industry.

Language: Jupyter Notebook - Size: 53.6 MB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 0

dogaanismail/nlp-examples

Natural Language Processing examples from https://realpython.com/natural-language-processing-spacy-python/

Language: Python - Size: 0 Bytes - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

smileart/lemmingo

Defensive lemmatiser/stemmer written in Go ⊂( ⚆ ϖ⚆)っ

Language: Go - Size: 741 KB - Last synced at: 5 days ago - Pushed at: almost 2 years ago - Stars: 11 - Forks: 2

SannketNikam/Emotion-Detection-in-Text

This project employs emotion detection in textual data, specifically trained on Twitter data comprising tweets labeled with corresponding emotions. It seamlessly takes text inputs and provides the most fitting emotion assigned to it. This app has more than 600 visitors!

Language: Jupyter Notebook - Size: 4.24 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 31 - Forks: 10

abdullahashfaqvirk/NLP-Workshops

Embark on your NLP journey by learning essential techniques through a series of notebooks designed to kickstart your career in this field.

Language: Jupyter Notebook - Size: 27.3 KB - Last synced at: about 1 month ago - Pushed at: 7 months ago - Stars: 3 - Forks: 0

rjarman/Bus-Mama

Bus-Mama is a bus-tracking mobile application for students of BSMRSTU. It shows the available routes, the buses, and their exact locations. Real-time tracking addresses a problem that university students have faced for many years: they often miss their buses or cannot keep to the bus schedule. Because the university runs many buses, students can easily catch one if they know where and when it will pass. The goal is to track the buses with a combined hardware, mobile application, and machine learning solution, so that students stop missing buses and use them efficiently. A GPS tracker attached to every bus reports its current position and syncs it to the server automatically. The Bus-Mama mobile application, installed on students' phones, shows the real-time position of each bus on Google Maps. Every bus has its own marker, and clicking the marker shows the details of that bus: how far away it is, which direction it is coming from, how long it will take to arrive, and how much longer it will take if there is traffic on the road. A search option looks up the details of a specific bus, and a list of all buses gives details about each one. Every student has an account through which they can access bus data. Another main objective is the Bus-Mama chatbot, which works in Bengali so that students can easily ask about the buses.
For now, the chatbot can converse only about bus-related information. If asked anything else, it cannot answer and instead replies with a tag for the question. Because the chatbot works in Bengali, it uses a trie data structure for lemmatization. A library was designed to lemmatize Bengali words, and almost 63,205 Bengali words were lemmatized with it to train the SVM machine learning model.

Language: TypeScript - Size: 10 MB - Last synced at: about 1 month ago - Pushed at: over 4 years ago - Stars: 14 - Forks: 1
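
The description above mentions a trie used for lemmatization but does not say how. One common approach, sketched here purely as an assumption, stores known lemmas in a trie and maps an inflected word to the longest stored lemma that is a prefix of it. The class names and the English toy vocabulary are hypothetical; the project itself targets Bengali and is written in TypeScript.

```python
class TrieNode:
    def __init__(self):
        self.children = {}      # character -> TrieNode
        self.is_lemma = False   # True when the path to this node spells a lemma

class LemmaTrie:
    def __init__(self, lemmas=()):
        self.root = TrieNode()
        for lemma in lemmas:
            self.insert(lemma)

    def insert(self, lemma):
        node = self.root
        for ch in lemma:
            node = node.children.setdefault(ch, TrieNode())
        node.is_lemma = True

    def lemmatize(self, word):
        """Return the longest stored lemma that is a prefix of word, else word."""
        node = self.root
        best = None
        for i, ch in enumerate(word):
            node = node.children.get(ch)
            if node is None:
                break
            if node.is_lemma:
                best = word[: i + 1]
        return best if best is not None else word

trie = LemmaTrie(["walk", "talk", "play"])
results = [trie.lemmatize(w) for w in ["walking", "played", "sings"]]
```

Prefix matching only handles suffixing morphology; words whose stem changes under inflection would need an exception table on top of the trie.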

Flight-School/lemma 📦

A command-line utility that lemmatizes words in natural language text.

Language: Swift - Size: 3.91 KB - Last synced at: about 1 month ago - Pushed at: over 5 years ago - Stars: 28 - Forks: 0

KanishkNavale/Text-Mining-with-TF-IDF-and-Cosine-Similarity

A simple python repository for developing perceptron based text mining involving dataset linguistics preprocessing for text classification and extracting similar text for a given query.

Language: Jupyter Notebook - Size: 7.34 MB - Last synced at: 3 months ago - Pushed at: about 3 years ago - Stars: 1 - Forks: 1

anishLearnsToCode/lemmatization

Project that implements basic lemmatization over a small corpus using the nltk Natural Language 🗣 Processing Toolkit. Shows implementation in Jupyter 📓 with Analytics 📊.

Language: Jupyter Notebook - Size: 489 KB - Last synced at: about 2 months ago - Pushed at: over 4 years ago - Stars: 1 - Forks: 0

GabrielMazzotta/NLP-Clustering--Movie-Similarity-from-Plot-Summaries

A Python-based movie recommendation system leveraging NLP and clustering techniques. This project includes data processing, vectorization of plot summaries, and the implementation of recommendation algorithms to suggest similar movies based on user input.

Language: Jupyter Notebook - Size: 1.21 MB - Last synced at: about 1 month ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

antixrist/node-phpmorphy

Full-featured port of phpMorphy to Node.JS

Language: JavaScript - Size: 26.4 MB - Last synced at: 18 days ago - Pushed at: over 2 years ago - Stars: 34 - Forks: 7

Related Keywords
lemmatization 373 nlp 177 stemming 104 tokenization 102 natural-language-processing 76 python 71 machine-learning 64 nltk 52 sentiment-analysis 39 lemmatizer 39 nlp-machine-learning 38 tf-idf 37 bag-of-words 30 stopwords 30 pos-tagging 29 spacy 28 named-entity-recognition 23 text-classification 22 logistic-regression 21 morphological-analysis 21 pandas 19 tokenizer 18 deep-learning 18 naive-bayes-classifier 18 word2vec 18 text-mining 17 nltk-python 17 tfidf-vectorizer 16 vectorization 15 classification 15 python3 15 text-processing 15 preprocessing 14 morphology 12 wordcloud 12 data-science 12 matplotlib 12 lda 11 text-analysis 11 multinomial-naive-bayes 11 tfidf 11 stemmer 10 part-of-speech-tagging 10 ner 10 data-analysis 9 random-forest 9 text-preprocessing 9 stopwords-removal 9 word-embeddings 9 textblob 9 flask 9 regular-expression 9 information-retrieval 9 spacy-nlp 8 jupyter-notebook 8 topic-modeling 8 scikit-learn 8 javascript 8 count-vectorizer 8 clustering 8 numpy 7 dependency-parsing 7 part-of-speech-tagger 7 lstm-neural-networks 7 pytorch 7 pos 7 svm 7 data-mining 6 dictionary 6 glove-embeddings 6 feature-extraction 6 support-vector-machines 6 pipeline 6 machine-learning-algorithms 6 chatbot 6 sentiment-classification 6 tf-idf-vectorizer 6 neural-network 6 word-cloud 6 corpus 6 normalization 6 bert 6 java 6 computational-linguistics 6 gensim 6 stemming-algorithm 6 cosine-similarity 6 lookup 5 sklearn 5 lstm 5 streamlit 5 data-visualization 5 nltk-library 5 dependency-parser 5 tagging 5 text 5 tensorflow 5 json 5 french 5 svm-classifier 5