GitHub topics: text-analysis
gwyndolin75/Document-QA-System
A Streamlit-based app for asking questions directly from uploaded documents using Gemini embeddings and a language model. Supports PDF, TXT, and DOCX files. Fast, simple, and powerful document-based QA.
Language: Jupyter Notebook - Size: 1.22 MB - Last synced at: about 4 hours ago - Pushed at: about 5 hours ago - Stars: 1 - Forks: 0

degenNovice/corpus-tfidf-analyzer
A Python tool for text analysis using TF-IDF, lemmatization, stopword filtering, and frequency visualization.
Language: Python - Size: 16.6 KB - Last synced at: about 7 hours ago - Pushed at: about 8 hours ago - Stars: 0 - Forks: 0

ChanMeng666/customer-insight
【Star us if you're awesome!⭐️】A comprehensive customer review analysis system that provides deep insights through sentiment analysis, keyword extraction, topic modeling, and interactive visualizations. Built with Python and Streamlit, optimized for Chinese text with English language support.
Language: Python - Size: 294 KB - Last synced at: about 16 hours ago - Pushed at: about 16 hours ago - Stars: 0 - Forks: 1

SilentProgrammer-max/AI-Powered-Resume-Analyzer
An AI-powered tool that analyzes resumes and gives insights on skills, experience, and job match.
Language: Python - Size: 38.1 KB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 5 - Forks: 0

programminghistorian/jekyll
Jekyll-based static site for The Programming Historian
Language: HTML - Size: 929 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 530 - Forks: 228

sansan0/bilibili-comment-analyzer
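🎯 Bilibili comment downloader and data visualization desktop app. Use data to guide your choice of Bilibili topics and content. Supports single-video and batch downloads, regional distribution maps, word cloud analysis, image retrieval, and more. A YouTube comment-section viewer is planned; ⭐ support is welcome. FAQ at the end.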
Language: Python - Size: 35.4 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 74 - Forks: 3

mathrailsAI/sentiment_insights
SentimentInsights is a Ruby gem for extracting actionable insights from qualitative survey responses. It provides sentiment analysis, key phrase extraction, and named entity recognition using multiple NLP providers including OpenAI, Claude and AWS Comprehend.
Language: Ruby - Size: 44.9 KB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 6 - Forks: 1

hiDaDeng/shreport
Downloads periodic reports of companies listed on the Shanghai Stock Exchange; project address
Language: Python - Size: 103 MB - Last synced at: 4 days ago - Pushed at: 3 months ago - Stars: 103 - Forks: 31

idears-org/fabula
An open-source tool for writers and creators to visualize narrative structures, character relationships, and plot timelines. Turn your complex story into a clear, interactive map.
Language: TypeScript - Size: 93.8 KB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 0 - Forks: 0

Yuzufi/word-freq-statistic
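High-performance word-frequency counter for Chinese corpora using blind segmentation: counts the 2-character words in a 1-billion-character corpus within 1 minute!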
Language: Rust - Size: 20.5 KB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 0 - Forks: 0

BetulKarakaya/Python-Tips-And-Tricks
This project is a compilation of diverse Python code examples showcasing useful techniques, mathematical concepts, and practical implementations. It covers topics such as text analysis, data visualization, mathematical computations, and unique problem-solving approaches to enhance learning and creativity in Python programming.
Language: Python - Size: 567 KB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 0 - Forks: 1

hiDaDeng/hidadeng.github.io
Da Deng's personal blog. The blog domain is listed below; access may be a bit slow.
Language: HTML - Size: 1.61 GB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 4 - Forks: 2

CentreForDigitalHumanities/I-analyzer
The great textmining tool that obviates all others
Language: Python - Size: 59.8 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 8 - Forks: 3

komed3/cmpstr
CmpStr is a lightweight, fast, and well-performing package for calculating string similarity
Language: TypeScript - Size: 872 KB - Last synced at: about 20 hours ago - Pushed at: 5 days ago - Stars: 4 - Forks: 1

komed3/cmpstr-cli
CLI for the CmpStr library supporting string normalization, similarity scoring, phonetic indexing, matrix comparison and more
Language: TypeScript - Size: 154 KB - Last synced at: about 20 hours ago - Pushed at: 10 days ago - Stars: 1 - Forks: 0

fiacrerougieux/auto-sdg-mapping-dashboard
Interactive dashboard visualizing UN Sustainable Development Goal (SDG) coverage in university courses via automated text analysis.
Language: JavaScript - Size: 22.2 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 0 - Forks: 0

shcherbak-ai/contextgem
ContextGem: Effortless LLM extraction from documents
Language: Python - Size: 29.4 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 1,186 - Forks: 88

jharemza/glassdoor_review_analysis
End-to-end Glassdoor review scraping and sentiment analysis pipeline.
Language: Jupyter Notebook - Size: 1.01 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 0 - Forks: 0

DCS-training/CDCS-Summer-School2021
2021 Text and Data Analysis Summer School
Language: Jupyter Notebook - Size: 101 MB - Last synced at: 1 day ago - Pushed at: 7 days ago - Stars: 10 - Forks: 7

lazappi/twitter-stats
Analysis of Twitter hashtags
Language: R - Size: 249 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 2 - Forks: 7

JonathanReeve/text-matcher
A simple text reuse detection CLI tool.
Language: Python - Size: 67.4 KB - Last synced at: 8 days ago - Pushed at: about 1 year ago - Stars: 134 - Forks: 26

JBGruber/LexisNexisTools
📰 Working with newspaper data from 'LexisNexis'
Language: R - Size: 2.08 MB - Last synced at: 7 days ago - Pushed at: about 1 year ago - Stars: 110 - Forks: 22

juba/rainette
R implementation of the Reinert text clustering method
Language: R - Size: 15.5 MB - Last synced at: about 4 hours ago - Pushed at: about 1 year ago - Stars: 57 - Forks: 7

Yosef-AlSabbah/Cloud-Based-Document-Analytics-Service-2
Cloud-based service for uploading, scraping, and managing PDF/DOCX documents. Features include title sorting, content search with highlights, rule-based classification, and storage stats. Integrated with cloud platforms for scalable document analytics.
Language: TypeScript - Size: 269 KB - Last synced at: 9 days ago - Pushed at: 23 days ago - Stars: 3 - Forks: 0

faizhalas/library-tools
Coconut Libtool is the all-in-one data mining and textual analysis tool for librarians or anyone interested in these applications. Our tool does not require any prior knowledge of coding or programming, making it approachable and great for users who want to test out these data analysis and visualization techniques.
Language: Python - Size: 968 KB - Last synced at: 9 days ago - Pushed at: 10 days ago - Stars: 8 - Forks: 6

fbkarsdorp/python-course 📦
Tutorial and introduction to programming with Python for the humanities and social sciences
Language: Jupyter Notebook - Size: 120 MB - Last synced at: 5 days ago - Pushed at: over 4 years ago - Stars: 428 - Forks: 298

teenu/gpu-text-search
Ultra-high-performance GPU-accelerated text search using Metal compute shaders
Language: Swift - Size: 61.5 KB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 1 - Forks: 0

Neplex/ArchiTXT
ArchiTXT is an open source Python library that transforms unstructured text into structured, searchable, and AI-ready data. It enables automated database generation and seamless data integration.
Language: Python - Size: 4.07 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 3 - Forks: 0

alfredhw/zotero-to-voyant
A plugin for Zotero 7 to send attachments to Voyant Tools for text analysis
Size: 16.6 KB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 0 - Forks: 0

devspidr/NLP-Tools
A collection of powerful Natural Language Processing (NLP) tools and scripts for tasks like text preprocessing, sentiment analysis, keyword extraction, and more — built with Python and popular NLP libraries.
Language: Python - Size: 17.6 KB - Last synced at: 10 days ago - Pushed at: 15 days ago - Stars: 1 - Forks: 0

obsei/obsei
Obsei is a low-code, AI-powered automation tool. It can be used in various business flows like social listening, AI-based alerting, brand image analysis, comparative study, and more.
Language: Python - Size: 16.3 MB - Last synced at: 10 days ago - Pushed at: 17 days ago - Stars: 1,279 - Forks: 171

Blake-Madden/OleanderStemmingLibrary
Porter stemming library (C++)
Language: C++ - Size: 1.1 MB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 52 - Forks: 26

ds-modules/XENGLIS-31AC
UC Berkeley ENGLISH R1A (Literature of American Cultures, Chinatown and the Culture of Exclusion) Fall 2017
Language: Jupyter Notebook - Size: 8.22 MB - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 6 - Forks: 2

ds-modules/PACS-190
UC Berkeley PACS 190 Fall 2018: Introductory text analysis workshop for senior thesis students
Language: Jupyter Notebook - Size: 13.4 MB - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 2 - Forks: 2

ai4society/GenAIResultsComparator
A Python library providing evaluation metrics to compare generated texts from LLMs, often against reference texts. Features streamlined workflows for model comparison and visualization.
Language: Python - Size: 25.2 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 3 - Forks: 0

05-Jagritii/TextTweak
TextTweak is a simple React-based web app that lets users analyze and edit text. It offers features like case conversion, space cleanup, and word or character counting, with support for light and dark modes.
Language: JavaScript - Size: 486 KB - Last synced at: 9 days ago - Pushed at: 16 days ago - Stars: 1 - Forks: 0

jennlopez49/State_Laws
This project creates an indicator to measure how pro- or anti-immigrant a state is, using text data from state-level immigration-related bills.
Language: R - Size: 29.3 KB - Last synced at: 17 days ago - Pushed at: 17 days ago - Stars: 0 - Forks: 0

TBosak/codiff-mcp 📦
A simple MCP server that computes line-based diffs between two text inputs.
Language: JavaScript - Size: 25.4 KB - Last synced at: 3 days ago - Pushed at: 21 days ago - Stars: 0 - Forks: 0

mainlp/semantic_components
Finding semantic components in your neural representations.
Language: Python - Size: 5.58 MB - Last synced at: 17 days ago - Pushed at: 17 days ago - Stars: 4 - Forks: 1

trinker/textclean
Tools for cleaning and normalizing text data
Language: R - Size: 23.8 MB - Last synced at: 3 days ago - Pushed at: over 3 years ago - Stars: 251 - Forks: 26

MIT-LCP/bloatectomy
A Python package for removing duplicate text in clinical notes or other documents
Language: TeX - Size: 7.48 MB - Last synced at: 4 days ago - Pushed at: almost 5 years ago - Stars: 37 - Forks: 9

shama-llama/amharic-dictionary
Amharic dictionary mapped from official dictionaries published by linguistic institutions
Size: 1.95 KB - Last synced at: 18 days ago - Pushed at: 18 days ago - Stars: 0 - Forks: 0

dhowe/rita
Website, documentation and examples for RiTa
Language: JavaScript - Size: 226 MB - Last synced at: about 7 hours ago - Pushed at: 18 days ago - Stars: 72 - Forks: 9

Sanjanaa7/AI-Symptom-Diary
A simple AI-powered Python tool to track your mood and physical symptoms from daily journal entries. It uses sentiment analysis to detect emotional state, extracts health symptoms using keyword detection, and visualizes your mood trends over time, all while storing your data locally for privacy.
Language: Jupyter Notebook - Size: 38.1 KB - Last synced at: 18 days ago - Pushed at: 18 days ago - Stars: 0 - Forks: 0

brucewlee/lftk
[BEA @ ACL 2023] General-purpose tool for linguistic features extraction; Tested on readability assessment, essay scoring, fake news detection, hate speech detection, etc.
Language: Python - Size: 7.19 MB - Last synced at: 5 days ago - Pushed at: 7 months ago - Stars: 134 - Forks: 25

hiDaDeng/simtext
Computes text similarity metrics between two documents
Language: Python - Size: 1.73 MB - Last synced at: 2 days ago - Pushed at: about 2 years ago - Stars: 12 - Forks: 1

daniau23/british_airways_topic_modelling
Analysing customer reviews using unsupervised learning approaches via Topic Modelling
Language: HTML - Size: 9.26 MB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 0 - Forks: 0

remram44/taguette
Free and open source qualitative research tool -- MIRROR OF GITLAB REPOSITORY
Language: Python - Size: 10.8 MB - Last synced at: 1 day ago - Pushed at: 3 months ago - Stars: 70 - Forks: 13

kurai-sx/Data-Extraction-and-Sentiment-Analysis-using-NLP
This repository shows how to extract the title and content text from any article, and how to use the extracted data to determine the sentiment of each sentence.
Language: Jupyter Notebook - Size: 163 KB - Last synced at: 7 days ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

eleonc56/Cloud-Based-Document-Analytics-Service
Cloud-Based Document Analytics Service offers a simple way to manage your documents in the cloud. With features like drag-and-drop upload and powerful web scraping, it streamlines your document analysis. 🗂️💻
Language: TypeScript - Size: 315 KB - Last synced at: 20 days ago - Pushed at: 20 days ago - Stars: 0 - Forks: 0

MasuRii/FBScrapeIdeas
AI-powered CLI tool to scrape Facebook group posts, categorize them using Google's Gemini, and uncover insights for academic research or idea generation.
Language: Python - Size: 212 KB - Last synced at: 10 days ago - Pushed at: 29 days ago - Stars: 3 - Forks: 2

taylor-arnold/rpkg
A collection of R packages spanning natural language processing, statistical analysis, data visualization, and text analysis
Language: HTML - Size: 31.1 MB - Last synced at: 13 days ago - Pushed at: 20 days ago - Stars: 215 - Forks: 36

arnvjshi/Threat-Detection-Dashboard
ThreatShield: an AI-powered threat detection system that uses GROQ to analyze audio, image, and text data. It extracts insights and flags potential threats in real time across multiple media formats.
Language: TypeScript - Size: 402 KB - Last synced at: 22 days ago - Pushed at: 22 days ago - Stars: 2 - Forks: 1

tsmdt/dygest
CLI tool to extract content insights from raw txt using LLMs and NER
Language: Python - Size: 7.96 MB - Last synced at: 22 days ago - Pushed at: 22 days ago - Stars: 3 - Forks: 0

Anishgoswamicode/wikipedia-semantic-clustering
Unsupervised semantic clustering of Wikipedia topics using Sentence-BERT embeddings, UMAP for visualization, and DBSCAN for topic discovery
Language: Jupyter Notebook - Size: 149 KB - Last synced at: 22 days ago - Pushed at: 22 days ago - Stars: 0 - Forks: 0

MonikaBarget/distant-reading
teaching materials for distant reading
Language: Jupyter Notebook - Size: 57.5 MB - Last synced at: 4 days ago - Pushed at: 22 days ago - Stars: 0 - Forks: 1

venkat-0706/AutoText-GPT
A next-word prediction project that uses Transformers and GPT-2 for text generation. GPTTokenizer preprocesses input, and the model is fine-tuned. Evaluation measures accuracy, perplexity, and fluency.
Language: Jupyter Notebook - Size: 101 KB - Last synced at: 12 days ago - Pushed at: about 1 month ago - Stars: 2 - Forks: 1

Lips7/Matcher
A high-performance matcher designed to solve LOGICAL and TEXT VARIATIONS problems in word matching, implemented in Rust.
Language: Rust - Size: 36.9 MB - Last synced at: 5 days ago - Pushed at: 23 days ago - Stars: 17 - Forks: 1

koheiw/newsmap
Semi-supervised algorithm for geographical document classification
Language: R - Size: 1.84 MB - Last synced at: 17 days ago - Pushed at: about 1 year ago - Stars: 64 - Forks: 22

koheiw/LSX
Semi-supervised algorithm for document scaling
Language: R - Size: 118 MB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 55 - Forks: 5

psychbruce/PsychWordVec
🔜 Integrative Toolbox of Word Embedding Research for Psychological Science.
Language: R - Size: 44.5 MB - Last synced at: 22 days ago - Pushed at: 26 days ago - Stars: 23 - Forks: 1

zsxkib/TTDS-G35-CW3
TTDS Group Project: Video Games Search Engine. Sakib Ahamed, Dan Buxton, Kenza Amira, Wini Lau, Mansoor Ahmad
Language: Python - Size: 149 MB - Last synced at: about 14 hours ago - Pushed at: over 1 year ago - Stars: 5 - Forks: 1

NationalLibraryOfNorway/DHLAB
DHLAB is a library of python modules for accessing text and pictures at the National Library of Norway.
Language: Python - Size: 1.36 MB - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 22 - Forks: 5

veralvx/docker-languagetool-cli
LanguageTool client for Docker/Podman with ngrams and fasttext installed by default
Language: Dockerfile - Size: 14.6 KB - Last synced at: 27 days ago - Pushed at: 27 days ago - Stars: 0 - Forks: 0

FayazK/Document-Metadata-Extractor
A Python tool that uses Google's Gemini AI to automatically extract structured metadata from PDF and DOCX documents, saving results to Excel for easy analysis and organizing raw responses as JSON files.
Language: Python - Size: 11.7 KB - Last synced at: 25 days ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0

anidixit64/LexicaForge
LexicaForge is a comprehensive natural language processing (NLP) toolkit designed for multilingual text analysis and processing. It provides a robust set of tools for text preprocessing, language detection, tokenization, and advanced NLP tasks, with a focus on scalability and performance.
Size: 182 KB - Last synced at: 5 days ago - Pushed at: 29 days ago - Stars: 0 - Forks: 0

johnbumgarner/wordhoard
This Python module can be used to obtain antonyms, synonyms, hypernyms, hyponyms, homophones and definitions.
Language: Python - Size: 539 KB - Last synced at: 13 days ago - Pushed at: about 1 year ago - Stars: 124 - Forks: 11

Pranav-Patel-123/GenAI
Language: TypeScript - Size: 102 KB - Last synced at: 30 days ago - Pushed at: 30 days ago - Stars: 0 - Forks: 0

5j9/wikitextparser
A Python library to parse MediaWiki WikiText
Language: Python - Size: 1.83 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 310 - Forks: 23

ds-modules/core-resources
Short examples and templates for common ds-module tasks
Language: Jupyter Notebook - Size: 9.47 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 12 - Forks: 22

airbnb/artificial-adversary
🗣️ Tool to generate adversarial text examples and test machine learning models against them
Language: Python - Size: 116 KB - Last synced at: 5 days ago - Pushed at: over 3 years ago - Stars: 402 - Forks: 57

yongzhuo/Text-Analysis
Text data analysis, Text-Analysis
Language: Python - Size: 4.71 MB - Last synced at: 19 days ago - Pushed at: over 3 years ago - Stars: 7 - Forks: 5

forTEXT/catma
Computer Assisted Text Markup and Analysis
Language: Java - Size: 115 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 93 - Forks: 11

eriglesias/RosettaFables
Analyzing Aesop's fables across languages using advanced NLP techniques
Language: Python - Size: 19 MB - Last synced at: about 12 hours ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

renswickd/talkwise-ai
NLP-powered dialogue analyzer with real-time sentiment and filler-word metrics using Hugging Face, spaCy, and Streamlit.
Language: Python - Size: 977 KB - Last synced at: 18 days ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

ucd-dnp/ConTexto
Python library for text mining and NLP
Language: Jupyter Notebook - Size: 34.1 MB - Last synced at: 23 days ago - Pushed at: about 1 year ago - Stars: 49 - Forks: 14

HardikGohilHLR/letter-lens
LetterLens – Your Ultimate Text Analysis Tool!
Language: TypeScript - Size: 117 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

ddtdanilo/OpenAI-Document-Analyzer
A powerful Python application for analyzing text and PDF files using OpenAI's latest chat completion models. Features dynamic model switching, customizable prompts, and comprehensive error handling.
Language: Python - Size: 46.9 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 39 - Forks: 5

dlite-tools/NLPiper
NLPiper is a package that agglomerates different NLP tools and applies their transformations to the target document.
Language: Python - Size: 165 KB - Last synced at: 15 days ago - Pushed at: almost 2 years ago - Stars: 19 - Forks: 1

jboynyc/textnets
Text analysis with networks.
Language: Python - Size: 2.92 MB - Last synced at: 2 days ago - Pushed at: 3 months ago - Stars: 285 - Forks: 25

dario-github/notion-nlp
Read the text from a Notion database and perform NLP analysis.
Language: Python - Size: 11 MB - Last synced at: 6 days ago - Pushed at: over 2 years ago - Stars: 18 - Forks: 2

vinayakdasgupta/anvay
anvay is a Flask-based Bengali text processing and topic modeling tool that uses Latent Dirichlet Allocation (LDA) to extract topics from uploaded text files.
Language: HTML - Size: 6.15 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 1 - Forks: 0

opensemanticsearch/open-semantic-search
Open Source research tool to search, browse, analyze and explore large document collections by Semantic Search Engine and Open Source Text Mining & Text Analytics platform (Integrates ETL for document processing, OCR for images & PDF, named entity recognition for persons, organizations & locations, metadata management by thesaurus & ontologies, search user interface & search apps for fulltext search, faceted search & knowledge graph)
Language: Shell - Size: 8.91 MB - Last synced at: about 1 month ago - Pushed at: 2 months ago - Stars: 1,034 - Forks: 183

Cicatriiz/text-toolkit
Advanced MCP server providing comprehensive text transformation and formatting tools. TextToolkit offers over 40 specialized utilities for case conversion, encoding/decoding, formatting, analysis, and text manipulation - all accessible directly within your AI assistant workflow.
Language: TypeScript - Size: 1.37 MB - Last synced at: 26 days ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0

shuddha2021/AI-Document-Analyzer
An interactive, client-side AI Document Summarizer & Analyzer built with HTML, CSS, and JavaScript. Features summarization, entity extraction, insights, file parsing (TXT, CSV, XLSX, HTML), and visualizations, all in-browser.
Language: HTML - Size: 28.3 KB - Last synced at: 20 days ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

lykmapipo/US-Inaugural-Addresses
Python scripts to download, process, and analyze US Inaugural Addresses
Language: Python - Size: 4.45 MB - Last synced at: 27 days ago - Pushed at: over 1 year ago - Stars: 3 - Forks: 0

greyblake/whatlang-rs
Natural language detection library for Rust. Try demo online: https://whatlang.org/
Language: Rust - Size: 2.05 MB - Last synced at: about 1 month ago - Pushed at: 4 months ago - Stars: 1,018 - Forks: 113

smkrv/ha-text-ai
Cutting-edge AI solution for Home Assistant. Multi-LLM provider support to transform your smart home experience with intelligent, adaptive automation.
Language: Python - Size: 6.2 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 29 - Forks: 3

prabhashj07/nepalikit
NepaliKit is a Python library for natural language processing (NLP) tasks in Nepali. It features tokenization (rule-based and SentencePiece), text preprocessing, stopword management, and sentence segmentation. Ideal for developers and researchers working with Nepali text data.
Language: Python - Size: 364 KB - Last synced at: about 1 month ago - Pushed at: 11 months ago - Stars: 6 - Forks: 0

Yeasir-Hossain/keycloak-authentication
A full-stack text analysis application with Keycloak authentication, implementing word tokenization, sentence boundary detection, and longest word extraction algorithms with Redis caching for performance optimization.
Language: TypeScript - Size: 123 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

ThakkarVidhi/restaurant-review-analysis
Insight Platter: A comprehensive platform offering actionable insights from restaurant reviews through web scraping, sentiment analysis, and data visualization.
Language: Python - Size: 3.59 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

abalvet/textometry
A simple standalone textometry/lexicometry applet.
Language: HTML - Size: 22.5 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

GarthTB/word-freq-statistic
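High-performance word-frequency counter for Chinese corpora using blind segmentation: counts the 2-character words in a 1-billion-character corpus within 1 minute!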
Language: Rust - Size: 36.1 KB - Last synced at: 7 days ago - Pushed at: about 2 months ago - Stars: 1 - Forks: 0

oneai-nlp/oneai-python
Python SDK for One AI APIs. One AI is an NLP-as-a-service platform. Our APIs enable language comprehension in context, transforming texts from any source into structured data to use in code.
Language: Python - Size: 539 KB - Last synced at: 9 days ago - Pushed at: almost 2 years ago - Stars: 38 - Forks: 7

AsafManela/HurdleDMR.jl
Hurdle Distributed Multinomial Regression (HDMR) implemented in Julia
Language: Julia - Size: 5.13 MB - Last synced at: 1 day ago - Pushed at: almost 2 years ago - Stars: 26 - Forks: 13

Rishita-rm/Named-Entity-Recognition-App-spacy-flask
Web-based Named Entity Recognition (NER) app using Flask and spaCy, featuring multilingual support, entity filtering, an API endpoint, and interactive visualizations.
Language: HTML - Size: 15.6 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

HarshMonga-CSE/textutils-react
TextUtils is a lightweight and user-friendly web application built with React that provides a set of utilities to manipulate and analyze text, including case conversion, space removal, word/character count, and more.
Language: JavaScript - Size: 548 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

EmilHvitfeldt/smltar
Manuscript of the book "Supervised Machine Learning for Text Analysis in R" by Emil Hvitfeldt and Julia Silge
Language: TeX - Size: 464 MB - Last synced at: about 1 month ago - Pushed at: 5 months ago - Stars: 259 - Forks: 104

nlpie/mtap
MTAP: A framework for distributed text analysis using gRPC and microservices-based architecture.
Language: Python - Size: 6.05 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 6 - Forks: 2

Ambeteco/ViberChatStatAnalyzer
Analyze Viber chat exports. Perform insightful analysis on chat data exported from the Viber messenger.
Language: Python - Size: 249 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 1 - Forks: 0
