GitHub topics: text-mining
deanmalmgren/textract
extract text from any document. no muss. no fuss.
Language: HTML - Size: 4.31 MB - Last synced at: about 13 hours ago - Pushed at: 5 months ago - Stars: 4,120 - Forks: 626

degenNovice/corpus-tfidf-analyzer
A Python tool for text analysis using TF-IDF, lemmatization, stopword filtering, and frequency visualization.
Language: Python - Size: 16.6 KB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 0 - Forks: 0

Ne0bliviscaris/Job-Search-Tool
Organizer for job searching across multiple sites. Fetch offers, measure recruitment progress, and collect info about potential employers.
Language: Python - Size: 4.69 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 3 - Forks: 0

Paulanerus/TextExplorer
A tool designed for the exploration, analysis, and comparison of textual data variants.
Language: Kotlin - Size: 492 KB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 2 - Forks: 0

palladian/palladian
Palladian is a Java-based toolkit with functionality for text processing, classification, information extraction, and data retrieval from the Web.
Language: Java - Size: 274 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 38 - Forks: 10

gesiscss/awesome-computational-social-science
A list of awesome resources for Computational Social Science
Language: R - Size: 209 KB - Last synced at: about 10 hours ago - Pushed at: about 1 month ago - Stars: 678 - Forks: 84

LatiefDataVisionary/text-mining-and-natural-language-processing-college-task
Language: Jupyter Notebook - Size: 12.6 MB - Last synced at: 2 days ago - Pushed at: 3 days ago - Stars: 0 - Forks: 0

notready155/whatsapp-chat-analysis
This project involves analyzing WhatsApp chat data to extract valuable insights. Using Python and various libraries like Pandas and Matplotlib, the project processes and visualizes chat statistics such as message frequency, most active participants, and sentiment analysis.
Size: 1.95 KB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 2 - Forks: 0

Neplex/ArchiTXT
ArchiTXT is an open source Python library that transforms unstructured text into structured, searchable, and AI-ready data. It enables automated database generation and seamless data integration.
Language: Python - Size: 5.21 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 3 - Forks: 0

keon/awesome-nlp
📖 A curated list of resources dedicated to Natural Language Processing (NLP)
Size: 541 KB - Last synced at: 4 days ago - Pushed at: over 1 year ago - Stars: 17,145 - Forks: 2,607

jinhangjiang/textregress
TextRegress is a Python package designed to help researchers perform advanced regression analysis on long-form text data.
Language: Python - Size: 82 KB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 6 - Forks: 1

vmenger/deduce
Deduce: de-identification method for Dutch medical text
Language: Python - Size: 7.2 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 56 - Forks: 23

kevv1m/tikara
The metadata and text content extractor for almost every file type.
Size: 1000 Bytes - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 0 - Forks: 0

caiohutis/Steam-Game-Review-Analysis-Using-NLP-and-Clustering
This project uses Natural Language Processing (NLP) and Machine Learning techniques to analyze user reviews of top-selling games on the Steam platform. The goal is to detect bug-related reviews using keyword filtering, assess user sentiment (positive, neutral, negative), and group similar games using clustering methods.
Language: Jupyter Notebook - Size: 1.98 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 0 - Forks: 0

ujjwalkarn/DataScienceR
a curated list of R tutorials for Data Science, NLP and Machine Learning
Language: R - Size: 15.7 MB - Last synced at: about 9 hours ago - Pushed at: about 2 years ago - Stars: 2,042 - Forks: 891

biolab/orange3-text
🍊 📄 Text Mining add-on for Orange3
Language: Python - Size: 46.5 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 132 - Forks: 86

george-gca/ai_papers_cleaner
Extract text from papers PDFs and abstracts, and remove uninformative words.
Language: Python - Size: 397 KB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 5 - Forks: 0

deweylab/MetaSRA-pipeline
MetaSRA: normalized sample-specific metadata for the Sequence Read Archive
Language: Python - Size: 27.5 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 43 - Forks: 14

blueprints-for-text-analytics-python/blueprints-text
Jupyter notebooks for our O'Reilly book "Blueprints for Text Analysis Using Python"
Language: Jupyter Notebook - Size: 164 MB - Last synced at: 6 days ago - Pushed at: about 1 year ago - Stars: 256 - Forks: 147

mesejo/trrex
Efficient string matching with regular expressions
Language: Python - Size: 440 KB - Last synced at: about 1 hour ago - Pushed at: 6 days ago - Stars: 143 - Forks: 6

Lilykos/pyphonetics
A Python 3 phonetics library.
Language: Python - Size: 21.5 KB - Last synced at: about 2 hours ago - Pushed at: about 5 years ago - Stars: 132 - Forks: 20

SoaresAlisson/sto
operation with strings and other facilities
Language: R - Size: 645 KB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 0 - Forks: 0

Saeidhoseinipour/ELBMcoclust
We unify several latent block models by proposing a flexible ELBM, which is extended to SELBM to address the sparsity problem by revealing a diagonal structure in sparse datasets. This yields more homogeneous co-clusters and therefore produces useful, ready-to-use, and easy-to-interpret results.
Language: Python - Size: 19.4 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 0 - Forks: 0

PetrKorab/Arabica
Python package for text mining of time-series data
Language: Python - Size: 102 MB - Last synced at: 9 days ago - Pushed at: 10 days ago - Stars: 73 - Forks: 14

hiDaDeng/cntext
Text analysis package supporting multiple methods, including word count, readability, document similarity, sentiment analysis, Word2Vec/GloVe, and Large Language Models (LLMs).
Language: Python - Size: 64 MB - Last synced at: 9 days ago - Pushed at: 10 days ago - Stars: 338 - Forks: 30

SoaresAlisson/txtnet
{txtnet} a package to build graphs from text
Language: HTML - Size: 2.89 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 0 - Forks: 0

mesolitica/malaysian-dataset
We gather Malaysian dataset! https://malaysian-dataset.readthedocs.io/
Language: Jupyter Notebook - Size: 1.36 GB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 315 - Forks: 111

juliasilge/tidytext
Text mining using tidy tools ✨📄✨
Language: R - Size: 129 MB - Last synced at: 8 days ago - Pushed at: about 1 year ago - Stars: 1,185 - Forks: 182

docwire/docwire
DocWire SDK: Award-winning modern data processing in C++20. SourceForge Community Choice & Microsoft support. AI-driven processing. Supports nearly 100 data formats, including email boxes and OCR. Boost efficiency in text extraction, web data extraction, data mining, document analysis. Offline processing is possible for security and confidentiality
Language: C++ - Size: 35.8 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 83 - Forks: 18

JasonKessler/scattertext
Beautiful visualizations of how language differs among document types.
Language: Python - Size: 39.4 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 2,294 - Forks: 291

stepthom/text_mining_resources
Resources for learning about Text Mining and Natural Language Processing
Size: 707 KB - Last synced at: 4 days ago - Pushed at: about 2 years ago - Stars: 577 - Forks: 199

Hords01/Data_Mining
TF-IDF Calculation
Language: Python - Size: 37 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 0 - Forks: 0

ropensci/jstor
Import journal data from DfR (JSTOR)
Language: R - Size: 6.14 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 47 - Forks: 10

ArdentEmpiricist/text_analysis
Analyze text stored as *.txt in chosen file or directory. Doesn't read files in subdirectories. Counting all words and then searching for every unique word in the vicinity (+-5 words).
Language: Rust - Size: 213 KB - Last synced at: 7 days ago - Pushed at: 12 days ago - Stars: 5 - Forks: 1

inaridiy/webforai
The best HTML-to-Markdown library: ESM-native, with useful utilities that are simple, lightweight, and of epic quality.
Language: TypeScript - Size: 3.5 MB - Last synced at: 4 days ago - Pushed at: about 1 month ago - Stars: 65 - Forks: 5

openaire/iis
Information Inference Service of the OpenAIRE system
Language: Java - Size: 71.8 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 20 - Forks: 11

adbar/trafilatura
Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping, extraction, output as CSV, JSON, HTML, MD, TXT, XML
Language: Python - Size: 33.8 MB - Last synced at: 13 days ago - Pushed at: about 2 months ago - Stars: 4,170 - Forks: 290

cindyoff/AI-detection-system
Supervised learning model built to detect AI from a text
Language: Python - Size: 445 KB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 0 - Forks: 0

ssrishtix/IMDB-Sentiment
A comparative case study on stemming vs lemmatization using IMDb movie reviews, focusing on NLP preprocessing and vocabulary analysis.
Language: Jupyter Notebook - Size: 0 Bytes - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 0 - Forks: 0

MatoYing/TextMining
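A fairly comprehensive text mining workflow. It covers building a sentiment analysis model with machine learning and text mining techniques; determining sentiment orientation through polarity judgment and intensity scoring; using word frequency and TF-IDF to surface the key points in positive and negative texts; and using text mining algorithms to find the focal points of user discussion on the platform.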
Language: Jupyter Notebook - Size: 26.8 MB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 31 - Forks: 2

stdlib-js/nlp-lda
Latent Dirichlet Allocation via collapsed Gibbs sampling.
Language: JavaScript - Size: 2.58 MB - Last synced at: 11 days ago - Pushed at: 15 days ago - Stars: 9 - Forks: 0

mcs07/ChemDataExtractor
Automatically extract chemical information from scientific documents
Language: Python - Size: 542 KB - Last synced at: 1 day ago - Pushed at: almost 2 years ago - Stars: 325 - Forks: 119

aphp/edsnlp
Modular, fast NLP framework, compatible with Pytorch and spaCy, offering tailored support for French clinical notes.
Language: Python - Size: 121 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 123 - Forks: 31

urtx13/Four-Phase-seed
This repository contains the seed-frozen version (seed=1405) of the original statistical pipeline described in Cho 2025a. All scripts, data and results have been made reproducible for verification and independent replication.
Language: Python - Size: 190 KB - Last synced at: 17 days ago - Pushed at: 17 days ago - Stars: 0 - Forks: 0

mtworth/sectext
Interface for text analytics of SEC 10-K filings
Language: Python - Size: 57.6 KB - Last synced at: 17 days ago - Pushed at: 17 days ago - Stars: 3 - Forks: 0

rmillikin/fast_km (fork of stewart-lab/fast_km)
A Containerized KinderMiner / Serial KinderMiner Server
Language: Python - Size: 16.3 MB - Last synced at: 17 days ago - Pushed at: 17 days ago - Stars: 0 - Forks: 0

stephenhky/PyShortTextCategorization
Various Algorithms for Short Text Mining
Language: Python - Size: 111 MB - Last synced at: 7 days ago - Pushed at: 17 days ago - Stars: 470 - Forks: 72

juliasilge/janeaustenr
An R Package for Jane Austen's Complete Novels 📙
Language: R - Size: 4.78 MB - Last synced at: 3 days ago - Pushed at: over 2 years ago - Stars: 96 - Forks: 22

hpham1295/Impact-Data-Mining
An approach to extracting and summarizing key infrastructure and community impact information from wind disaster reconnaissance reports using Zero-shot text classification with BART-large models, highlighted by keywords.
Language: Jupyter Notebook - Size: 85.5 MB - Last synced at: 18 days ago - Pushed at: 18 days ago - Stars: 0 - Forks: 0

jonaschn/awesome-topic-models
✨ Awesome - A curated list of amazing Topic Models (implementations, libraries, and resources)
Size: 53.7 KB - Last synced at: 4 days ago - Pushed at: almost 3 years ago - Stars: 94 - Forks: 8

oroszgy/awesome-hungarian-nlp
A curated list of NLP resources for Hungarian
Size: 125 KB - Last synced at: 16 days ago - Pushed at: 28 days ago - Stars: 245 - Forks: 18

notesjor/CorpusExplorer.Terminal.Console
Allows other programs and programming languages to access the analyses and data of CorpusExplorer v2.0
Language: C# - Size: 668 KB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 7 - Forks: 0

alirezatheh/perke
A keyphrase extractor for Persian
Language: Python - Size: 143 KB - Last synced at: 18 days ago - Pushed at: about 1 month ago - Stars: 69 - Forks: 8

ko-ichi-h/khcoder
KH Coder: for Quantitative Content Analysis or Text Mining
Language: Perl - Size: 30.5 MB - Last synced at: 20 days ago - Pushed at: 20 days ago - Stars: 316 - Forks: 98

vinit714/Steam-Game-Review-Analysis-Using-NLP-and-Clustering
This project uses Natural Language Processing (NLP) and Machine Learning techniques to analyze user reviews of top-selling games on the Steam platform. The goal is to detect bug-related reviews using keyword filtering, assess user sentiment (positive, neutral, negative), and group similar games using clustering methods.
Language: Jupyter Notebook - Size: 0 Bytes - Last synced at: 21 days ago - Pushed at: 21 days ago - Stars: 0 - Forks: 0

AbrSantiago/corpus-tfidf-analyzer
A Python tool for text analysis using TF-IDF, lemmatization, stopword filtering, and frequency visualization.
Language: Python - Size: 14.6 KB - Last synced at: 21 days ago - Pushed at: 21 days ago - Stars: 0 - Forks: 0

cpsievert/LDAvis
R package for web-based interactive topic model visualization.
Language: JavaScript - Size: 24 MB - Last synced at: 2 days ago - Pushed at: over 1 year ago - Stars: 559 - Forks: 132

kottoization/SentimentAnalysisOnConsumentOpinions
NLP, text mining sentiment analysis on consumer opinions, using BERT and 2 ML models
Language: Jupyter Notebook - Size: 1.02 MB - Last synced at: 21 days ago - Pushed at: 21 days ago - Stars: 1 - Forks: 0

opensemanticsearch/open-semantic-search
Open Source research tool to search, browse, analyze and explore large document collections by Semantic Search Engine and Open Source Text Mining & Text Analytics platform (Integrates ETL for document processing, OCR for images & PDF, named entity recognition for persons, organizations & locations, metadata management by thesaurus & ontologies, search user interface & search apps for fulltext search, faceted search & knowledge graph)
Language: Shell - Size: 8.91 MB - Last synced at: 21 days ago - Pushed at: 21 days ago - Stars: 1,019 - Forks: 180

NicholasMamo/multiplex-plot
Multiplex: visualizations that tell stories—A Python library to create and annotate beautiful network graph visualizations, text visualizations and more.
Language: Python - Size: 94.2 MB - Last synced at: 14 days ago - Pushed at: over 2 years ago - Stars: 111 - Forks: 15

chiphuyen/lazynlp
Library to scrape and clean web pages to create massive datasets.
Language: Python - Size: 37.1 KB - Last synced at: 4 days ago - Pushed at: over 4 years ago - Stars: 2,184 - Forks: 311

shangjingbo1226/AutoPhrase
AutoPhrase: Automated Phrase Mining from Massive Text Corpora
Language: C++ - Size: 195 MB - Last synced at: 2 days ago - Pushed at: over 3 years ago - Stars: 1,184 - Forks: 278

Inkdecker/Inktyping
Free tool for text exploration: analyze your favorite books and practice writing.
Language: Python - Size: 68.8 MB - Last synced at: 22 days ago - Pushed at: 23 days ago - Stars: 0 - Forks: 0

trinker/qdap
Quantitative Discourse Analysis Package: Bridging the gap between qualitative data and quantitative analysis
Language: R - Size: 36.9 MB - Last synced at: 6 days ago - Pushed at: over 4 years ago - Stars: 177 - Forks: 44

giocomai/castarter
Content Analysis Starter Toolkit for the R programming language
Language: R - Size: 1.25 MB - Last synced at: 4 days ago - Pushed at: 24 days ago - Stars: 3 - Forks: 0

seinecle/nocodefunctions-web-app
The code base of the front-end of nocodefunctions.com
Language: CSS - Size: 37.7 MB - Last synced at: 25 days ago - Pushed at: 25 days ago - Stars: 39 - Forks: 7

TiesdeKok/Python_NLP_Tutorial
This repository provides everything to get started with Python for Text Mining / Natural Language Processing (NLP)
Language: Jupyter Notebook - Size: 443 KB - Last synced at: 22 days ago - Pushed at: almost 5 years ago - Stars: 125 - Forks: 66

SentometricsResearch/sentometrics
An integrated framework in R for textual sentiment time series aggregation and prediction
Language: R - Size: 438 MB - Last synced at: 15 days ago - Pushed at: about 1 month ago - Stars: 84 - Forks: 22

HanXinzi-AI/awesome-python-machine-learning-resources
A collection of awesome, popular, and practical machine learning and deep learning Python libraries and tools.
Size: 11 MB - Last synced at: 5 days ago - Pushed at: 11 months ago - Stars: 166 - Forks: 25

dselivanov/text2vec
Fast vectorization, topic modeling, distances and GloVe word embeddings in R.
Language: R - Size: 46.2 MB - Last synced at: 6 days ago - Pushed at: 9 months ago - Stars: 862 - Forks: 133

danielvartan/iramuteqlike
💬⛏️ IRaMuTeQ Software Analyses in R
Language: R - Size: 3.37 MB - Last synced at: 12 days ago - Pushed at: 10 months ago - Stars: 7 - Forks: 2

sergioburdisso/pyss3
A Python package implementing a new interpretable machine learning model for text classification (with visualization tools for Explainable AI :octocat:)
Language: Python - Size: 102 MB - Last synced at: about 20 hours ago - Pushed at: 4 months ago - Stars: 341 - Forks: 44

rosette-api/java
Babel Street Analytics Client Library for Java
Language: Java - Size: 64.8 MB - Last synced at: 26 days ago - Pushed at: 26 days ago - Stars: 11 - Forks: 35

bnosac/udpipe
R package for Tokenization, Parts of Speech Tagging, Lemmatization and Dependency Parsing Based on the UDPipe Natural Language Processing Toolkit
Language: C++ - Size: 5.74 MB - Last synced at: 4 days ago - Pushed at: about 2 years ago - Stars: 214 - Forks: 33

massimoaria/tall
Text Analysis for aLL
Language: R - Size: 63.6 MB - Last synced at: 26 days ago - Pushed at: 26 days ago - Stars: 16 - Forks: 5

Lips7/Matcher
A high-performance matcher designed to solve LOGICAL and TEXT VARIATIONS problems in word matching, implemented in Rust.
Language: Rust - Size: 36.9 MB - Last synced at: 1 day ago - Pushed at: 27 days ago - Stars: 17 - Forks: 1

M-Serajian/MTB-Pipeline
MTB++: software developed to predict antimicrobial resistance to 13 antibiotics and 3 families of antimicrobials.
Language: Python - Size: 16.3 MB - Last synced at: 28 days ago - Pushed at: 28 days ago - Stars: 2 - Forks: 1

sbkellogg/eci-588
ECI 588: Text Mining in Education is a graduate-level course that prepares education researchers and practitioners to use text as data for understanding and improving teaching and learning contexts.
Language: HTML - Size: 93.2 MB - Last synced at: 28 days ago - Pushed at: 29 days ago - Stars: 0 - Forks: 0

cosmoduende/r-holy-books-sentiment-data-analysis
What's the most positive or negative religion? Sentiment and Data Analysis of Holy Books with R. Analysis of religious dogmas by exploring their Holy Books (The Bible, The Quran, The Dhammapada, and The Book of Mormon) with R
Language: R - Size: 1.42 MB - Last synced at: 22 days ago - Pushed at: about 4 years ago - Stars: 3 - Forks: 4

ingmarboeschen/JATSdecoder
A text extraction and manipulation toolset for NISO-JATS coded XML files
Language: R - Size: 2.94 MB - Last synced at: 28 days ago - Pushed at: 30 days ago - Stars: 19 - Forks: 1

PearlLeeCode/2024-us-election-analysis
[Fundamentals of AI] Analysis of the 2024 US presidential election through text mining 🗽
Language: Jupyter Notebook - Size: 67.3 MB - Last synced at: 30 days ago - Pushed at: 30 days ago - Stars: 1 - Forks: 0

Yingjie4Science/SDGdetector
A novel R package that can identify and visualize 17 Sustainable Development Goals and associated 169 Targets in text
Language: R - Size: 8.16 MB - Last synced at: 4 days ago - Pushed at: 8 months ago - Stars: 16 - Forks: 1

jbesomi/texthero
Text preprocessing, representation and visualization from zero to hero.
Language: Python - Size: 22.1 MB - Last synced at: 30 days ago - Pushed at: over 1 year ago - Stars: 2,904 - Forks: 240

nlppln/nlppln
NLP pipeline software using common workflow language
Language: Python - Size: 266 KB - Last synced at: 28 days ago - Pushed at: about 6 years ago - Stars: 33 - Forks: 3

caufieldjh/awesome-bioie
🧫 A curated list of resources relevant to doing Biomedical Information Extraction (including BioNLP)
Size: 588 KB - Last synced at: 12 days ago - Pushed at: 12 months ago - Stars: 371 - Forks: 33

apelullo/cobalt_health_wellness_platform_ops
Cobalt is a mental health and wellness platform created for Penn Medicine employees that serves as a hub for support services such as therapy, wellness coaching, topic- and population-specific group sessions, and a variety of self-help resources.
Language: Jupyter Notebook - Size: 194 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

juliasilge/tidy-text-mining
Manuscript of the book "Tidy Text Mining with R" by Julia Silge and David Robinson
Language: TeX - Size: 84.8 MB - Last synced at: 29 days ago - Pushed at: about 1 month ago - Stars: 1,338 - Forks: 802

avrtt/MobileEAST
Lightweight and fast scene text detection based on EAST architecture and MobileNet layers
Size: 3.48 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 3 - Forks: 1

SmartDataAnalytics/HORUS-NER
HORUS: A framework to boost NLP tasks
Language: Python - Size: 949 MB - Last synced at: 14 days ago - Pushed at: almost 5 years ago - Stars: 48 - Forks: 6

AnttiHaerkoenen/laadulliset
Methods for working with qualitative materials in historical research
Language: HTML - Size: 7.28 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

kavgan/nlp-in-practice
Starter code to solve real world text data problems. Includes: Gensim Word2Vec, phrase embeddings, Text Classification with Logistic Regression, word count with pyspark, simple text preprocessing, pre-trained embeddings and more.
Language: Jupyter Notebook - Size: 91.8 MB - Last synced at: about 1 month ago - Pushed at: over 4 years ago - Stars: 1,168 - Forks: 792

climate-ip/radar-nlp-animal-rescue
Real-time Animal Danger Alert Recognition (RADAR): NLP pipeline to detect urgent animal rescue signals from social media posts.
Language: Jupyter Notebook - Size: 15.6 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

jmartinezheras/2018-MachineLearning-Lectures-ESA
Machine Learning Lectures at the European Space Agency (ESA) in 2018
Language: Jupyter Notebook - Size: 58.3 MB - Last synced at: 19 days ago - Pushed at: over 1 year ago - Stars: 358 - Forks: 147

maxent-ai/lda2vec 📦
Mixing Dirichlet Topic Models and Word Embeddings to Make lda2vec from this paper https://arxiv.org/abs/1605.02019
Language: Jupyter Notebook - Size: 89.8 KB - Last synced at: about 1 month ago - Pushed at: about 6 years ago - Stars: 30 - Forks: 3

terence-lim/financial-data-science
Support financial data science workflow, manage large structured and unstructured data sets, and apply financial econometrics and machine learning
Language: Python - Size: 33.4 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 40 - Forks: 14

terence-lim/financial-data-science-notebooks
Practical financial data science examples applying statistics, time series analysis, graph analytics, backtesting, machine learning, natural language processing, neural networks and LLMs
Language: Jupyter Notebook - Size: 122 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 32 - Forks: 9

csurfer/rake-nltk
Python implementation of the Rapid Automatic Keyword Extraction algorithm using NLTK.
Language: Python - Size: 477 KB - Last synced at: 27 days ago - Pushed at: over 2 years ago - Stars: 1,069 - Forks: 150

virgantara/sundanese-twitter-dataset
A Sundanese Twitter dataset for emotion analysis.
Language: Python - Size: 402 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 5 - Forks: 0

kk7nc/HDLTex
HDLTex: Hierarchical Deep Learning for Text Classification
Language: Python - Size: 32 MB - Last synced at: about 1 month ago - Pushed at: 7 months ago - Stars: 272 - Forks: 65

nluninja/text-mining-dataviz
Data Visualization and Text Mining course repository (UNICATT): notebook implementations of data analysis and machine learning applied to text content.
Language: Jupyter Notebook - Size: 127 MB - Last synced at: 20 days ago - Pushed at: 5 months ago - Stars: 6 - Forks: 0
