An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: text-mining

kk7nc/HDLTex

HDLTex: Hierarchical Deep Learning for Text Classification

Language: Python - Size: 32 MB - Last synced at: about 1 month ago - Pushed at: 7 months ago - Stars: 272 - Forks: 65

nluninja/text-mining-dataviz

Data Visualization and Text Mining course repository (UNICATT): notebook implementations of data analysis and machine learning applied to text content.

Language: Jupyter Notebook - Size: 127 MB - Last synced at: 21 days ago - Pushed at: 5 months ago - Stars: 6 - Forks: 0

lfoppiano/document-qa

Scientific Document Insight Q/A

Language: Python - Size: 635 KB - Last synced at: 3 days ago - Pushed at: about 1 month ago - Stars: 29 - Forks: 5

kongusen/Graphuison

A RAG-based framework for constructing scientific knowledge graphs.

Language: Python - Size: 172 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 4 - Forks: 0

lining0806/TextMining

Python text mining system (Research of Text Mining System)

Language: Python - Size: 3.79 MB - Last synced at: about 1 month ago - Pushed at: about 7 years ago - Stars: 341 - Forks: 154

klajosw/python

Python data analyst, integration, migration, quality

Language: Jupyter Notebook - Size: 42.9 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 2 - Forks: 2

huspacy/huspacy

HuSpaCy: industrial-strength Hungarian natural language processing

Language: Python - Size: 2.2 MB - Last synced at: 29 days ago - Pushed at: 7 months ago - Stars: 165 - Forks: 15

dhamodharanrk/dhamodharanrk.github.io

Welcome to my career portfolio

Language: HTML - Size: 1.61 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 1 - Forks: 0

navigating-stories/orange-story-navigator

Add-on to the Orange3 data mining toolkit with text processing widgets from the project Navigating Stories

Language: Python - Size: 14.7 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 4 - Forks: 1

carlosacchi/captiocrweb

This is the web interface for CaptiOCR, a real-time live captions screen text extraction tool. CaptiOCR allows you to capture, extract, and transform on-screen text instantly.

Language: HTML - Size: 729 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

notesjor/corpusexplorer2.0

Corpus linguistics has never been so easy...

Language: C# - Size: 32.5 MB - Last synced at: about 2 hours ago - Pushed at: 2 months ago - Stars: 23 - Forks: 3

Pipe199x/TheRaven

Literary text analysis of Edgar Allan Poe's poem "The Raven" ("El Cuervo"). A text mining project that extracts, cleans, and visualizes the content of the famous poem using Python, spaCy, visualizations, and linguistic processing in Spanish.

Language: Python - Size: 171 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

Parakh-4/r-course-exercise

🏋️‍♂️ Exercise for the Course "An Introduction to the R Programming Language"

Language: R - Size: 4.2 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

gsurma/password_cracker

Char-level RNN LSTM password cracker 🔑🔓.

Size: 1.02 MB - Last synced at: about 1 month ago - Pushed at: almost 4 years ago - Stars: 56 - Forks: 16

CaritoRamos/text-mining-project-in-python

This project applies Text Mining techniques using Python (NLTK, spaCy, TextBlob) to analyze a book. It includes text cleaning, tokenization, sentiment analysis, and keyword extraction to uncover insights.

Language: Jupyter Notebook - Size: 2.9 MB - Last synced at: about 1 month ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

Clement-LVD/codexplor

R package: assess & monitor programming projects with standardized metrics

Language: R - Size: 6.06 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

PasaOpasen/ContentDetector

Detect hard/soft skills from resumes in Russian

Language: Python - Size: 72.8 MB - Last synced at: 27 days ago - Pushed at: almost 2 years ago - Stars: 6 - Forks: 4

psychbruce/PsychWordVec

🔜 Integrative Toolbox of Word Embedding Research for Psychological Science.

Language: R - Size: 44.5 MB - Last synced at: 7 days ago - Pushed at: about 1 month ago - Stars: 22 - Forks: 1

san089/Big_Data_Project

Fake News Detection - Feature extraction using vectorization (Count Vectorizer, TFIDF Vectorizer, Hash Vectorizer), followed by an ensemble model that classifies whether the news is fake or not.

Language: Python - Size: 12.4 MB - Last synced at: about 1 month ago - Pushed at: about 5 years ago - Stars: 19 - Forks: 12

catlism/catlism.github.io

Companion website for "Corpus Approaches to Language in Social Media" - source and build versions

Language: HTML - Size: 45.1 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 1 - Forks: 0

juliasilge/learntidytext

Learn about text mining 📄 with tidy data principles

Language: CSS - Size: 6.22 MB - Last synced at: about 1 month ago - Pushed at: 3 months ago - Stars: 46 - Forks: 9

caimeng2/seesus

A Python package that identifies the 17 Sustainable Development Goals and their 169 Targets in text, and classifies them into social, environmental, and economic sustainability.

Language: Python - Size: 827 KB - Last synced at: 9 days ago - Pushed at: 10 months ago - Stars: 8 - Forks: 2

nalimilan/R.TeMiS

R.TeMiS: R Text Mining Solution

Language: C - Size: 9.87 MB - Last synced at: 5 days ago - Pushed at: about 1 month ago - Stars: 28 - Forks: 6

SCAI-BIO/cv-extraction

Web-Tool for LLM based CV extraction

Language: Python - Size: 70.3 MB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 1 - Forks: 0

Living-with-machines/T-Res

A Toponym Resolution Pipeline for Digitised Historical Newspapers

Language: Python - Size: 11.3 MB - Last synced at: 29 days ago - Pushed at: about 1 month ago - Stars: 8 - Forks: 1

DmitryRyumin/EMNLP-2023-Papers

EMNLP 2023 Papers: Explore cutting-edge research from EMNLP 2023, the premier conference for advancing empirical methods in natural language processing. Stay updated on the latest in machine learning, deep learning, and natural language processing with code included. :star: support NLP!

Language: Python - Size: 6.43 MB - Last synced at: 29 days ago - Pushed at: 12 months ago - Stars: 107 - Forks: 7

Makepad-fr/fbjs

Tooling that automates your Facebook interactions.

Language: TypeScript - Size: 588 KB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 62 - Forks: 25

graphbrain/graphbrain

Language, Knowledge, Cognition

Language: Python - Size: 103 MB - Last synced at: about 1 month ago - Pushed at: 2 months ago - Stars: 598 - Forks: 69

lisc-tools/lisc

Literature Scanner: Automated collection & analyses of the scientific literature.

Language: Python - Size: 6.85 MB - Last synced at: 9 days ago - Pushed at: 14 days ago - Stars: 106 - Forks: 12

kk7nc/RMDL

RMDL: Random Multimodel Deep Learning for Classification

Language: Python - Size: 223 MB - Last synced at: 29 days ago - Pushed at: almost 2 years ago - Stars: 430 - Forks: 122

luozhouyang/AutoPhraseX

Automated Phrase Mining from Massive Text Corpora in Python.

Language: Python - Size: 90.8 KB - Last synced at: about 22 hours ago - Pushed at: almost 4 years ago - Stars: 171 - Forks: 37

tax-8974/location-analyzer

The Location Data Analyzer is a Spring Boot application that offers insights on location data, such as counting locations by type, calculating average ratings, and identifying the most reviewed and incomplete entries. It features a simple frontend (HTML, CSS, JavaScript) and is deployed on Render.

Language: Java - Size: 0 Bytes - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

pbellot/ANF-TDM

Code, data, and documentation from the workshop "Machine Learning for Text Classification", organized as part of the CNRS-INRAE National Training Action (Action Nationale de Formation) "Document Exploration and Information Extraction" in 2020-21.

Language: Jupyter Notebook - Size: 57.3 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 3 - Forks: 1

LMU-Seminar-LLMs/TopicGPT

TopicGPT integrates the benefits of LLMs into topic modelling

Language: Python - Size: 14 MB - Last synced at: 4 days ago - Pushed at: 11 months ago - Stars: 25 - Forks: 3

arj1211/cluster-links

Pipeline that extracts, cleans, embeds, and clusters web links into topical groups using text extraction, semantic keyword extraction, and unsupervised clustering

Language: Python - Size: 34.2 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

CBravoR/AdvancedAnalyticsLabs

Analytics labs notebooks for Statistics and Business School students

Language: Jupyter Notebook - Size: 12.2 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 30 - Forks: 17

jphall663/GWU_data_mining

Materials for GWU DNSC 6279 and DNSC 6290.

Language: Jupyter Notebook - Size: 186 MB - Last synced at: about 1 month ago - Pushed at: 11 months ago - Stars: 238 - Forks: 173

sfu-discourse-lab/GenderGapTracker

Scrape news articles and analyze them using NLP to quantify the gender gap in Canadian mainstream media

Language: Python - Size: 9.34 MB - Last synced at: 7 days ago - Pushed at: about 1 year ago - Stars: 42 - Forks: 11

GGNoWayBack/cathodedataextractor

A document-level information extraction pipeline for layered cathode materials for sodium-ion batteries.

Language: Python - Size: 608 KB - Last synced at: 1 day ago - Pushed at: about 1 year ago - Stars: 8 - Forks: 2

stdlib-js/nlp-tokenize

Tokenize a string.

Language: JavaScript - Size: 834 KB - Last synced at: 29 days ago - Pushed at: 4 months ago - Stars: 3 - Forks: 0

BlueObelisk/oscar4

OSCAR (Open Source Chemistry Analysis Routines) is an open source extensible system for the automated annotation of chemistry in scientific articles.

Language: Java - Size: 125 MB - Last synced at: 28 days ago - Pushed at: 2 months ago - Stars: 31 - Forks: 4

fendouai/Awesome-Text-Classification

Awesome-Text-Classification: projects, papers, tutorials.

Size: 7.81 KB - Last synced at: about 16 hours ago - Pushed at: over 7 years ago - Stars: 171 - Forks: 32

trinker/lexicon

A data package containing lexicons and dictionaries for text analysis

Language: R - Size: 9.17 MB - Last synced at: about 1 month ago - Pushed at: over 3 years ago - Stars: 110 - Forks: 14

trinker/readability

Fast readability scores for text data

Language: R - Size: 175 KB - Last synced at: about 1 month ago - Pushed at: over 6 years ago - Stars: 22 - Forks: 4

trinker/textreadr

Tools to uniformly read in text data including semi-structured transcripts

Language: R - Size: 1.78 MB - Last synced at: about 1 month ago - Pushed at: about 2 years ago - Stars: 74 - Forks: 5

hrbrmstr/misinfo

📊 Tools to Perform ‘Misinformation’ Analysis on a Text Corpus (wrapper for methods in https://github.com/PDXBek/Misinformation)

Language: R - Size: 401 KB - Last synced at: about 1 month ago - Pushed at: about 7 years ago - Stars: 16 - Forks: 0

hrbrmstr/elpresidente

🇺🇸 Search and Extract Corpus Elements from 'The American Presidency Project'

Language: R - Size: 17.6 KB - Last synced at: about 1 month ago - Pushed at: about 7 years ago - Stars: 20 - Forks: 1

mkearney/textfeatures

👷‍♂️ A simple package for extracting useful features from character objects 👷‍♀️

Language: R - Size: 7.64 MB - Last synced at: about 1 month ago - Pushed at: over 4 years ago - Stars: 167 - Forks: 17

assafmo/xioc

Extract indicators of compromise from text, including "escaped" ones.

Language: Go - Size: 64.5 KB - Last synced at: about 1 month ago - Pushed at: about 5 years ago - Stars: 159 - Forks: 13

andrewtavis/kwx

BERT, LDA, and TFIDF based keyword extraction in Python

Language: Python - Size: 12.3 MB - Last synced at: 28 days ago - Pushed at: about 1 year ago - Stars: 72 - Forks: 10

narrnar/133FP

UCLA STATS 133 Final Project

Size: 5.44 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

gsurma/text_predictor

Char-level RNN LSTM text generator📄.

Language: Python - Size: 125 MB - Last synced at: about 1 month ago - Pushed at: almost 4 years ago - Stars: 115 - Forks: 35

bigartm/bigartm

Fast topic modeling platform

Language: C++ - Size: 16.8 MB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 668 - Forks: 120

RajnishProgrammer/NLTK-Textual-Analysis

NLP pipeline for text processing and feature extraction 🛠

Language: Python - Size: 350 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 1 - Forks: 0

mathsyouth/awesome-text-summarization (fork of lipiji/App-DL)

A curated list of resources dedicated to text summarization

Size: 243 KB - Last synced at: 18 days ago - Pushed at: over 2 years ago - Stars: 1,542 - Forks: 265

brandonleekramer/tidyorgs

A tidy package that detects and standardizes organizations in unstructured text data

Language: R - Size: 48.2 MB - Last synced at: about 1 month ago - Pushed at: over 3 years ago - Stars: 7 - Forks: 0

stewart-lab/fast_km

A Containerized KinderMiner / Serial KinderMiner Server

Language: Python - Size: 16.3 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 2

WLXie-Tony/Movie_Review_Analysis

A comprehensive pipeline for scraping, structuring, and analyzing IMDb movie reviews. This repository includes automated web scraping scripts, structured datasets, and advanced large language model (LLM)-based sentiment analysis to extract insights from user reviews.

Language: Python - Size: 120 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 1 - Forks: 0

luizsci42/Analise-de-sentimentos-pandemia-covid19

Repository used for the 2020-2021 PIBIC plan with Prof. Dr. Hendrik Macedo. Its purpose is to create a dataset for training machine learning models on Ekman's 5 emotions and to analyze the predominant sentiments during the first 12 months of the COVID-19 pandemic.

Language: Jupyter Notebook - Size: 9.66 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

mchesterkadwell/intro-to-text-mining-with-python

Cambridge Digital Humanities 'Introduction to Text-Mining with Python' (workshops 1 and 2)

Language: Jupyter Notebook - Size: 1.73 MB - Last synced at: 6 days ago - Pushed at: over 2 years ago - Stars: 22 - Forks: 8

ontoligent-design/polo2

A revised version of Polo

Language: Jupyter Notebook - Size: 442 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 1 - Forks: 2

rtrad89/authorship_clustering_code_repo

LAC: Latent Authorial Clustering of Shorter Texts

Language: Python - Size: 259 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 1 - Forks: 1

BlueBrain/Search 📦

Blue Brain text mining toolbox for semantic search and structured information extraction

Language: Python - Size: 4.62 MB - Last synced at: 29 days ago - Pushed at: about 2 years ago - Stars: 45 - Forks: 12

rosette-api/rosette-elasticsearch-plugin

Document Enrichment plugin for Elasticsearch

Language: Java - Size: 425 KB - Last synced at: about 1 month ago - Pushed at: 2 months ago - Stars: 27 - Forks: 13

SarthakJShetty/pyResearchInsights

End-to-end NLP tool to analyze research publications. Published in Ecology & Evolution 2021.

Language: Python - Size: 6.67 MB - Last synced at: 12 days ago - Pushed at: 11 months ago - Stars: 32 - Forks: 8

lfoppiano/material-parsers

Material parsers and other tools and scripts, initially developed for Grobid Superconductor

Language: Python - Size: 68.7 MB - Last synced at: 23 days ago - Pushed at: 3 months ago - Stars: 12 - Forks: 0

koheiw/marimo

Multilingual stopword lists

Language: R - Size: 69.3 KB - Last synced at: 5 days ago - Pushed at: 10 months ago - Stars: 17 - Forks: 7

bit2r/bitNLP

Tools that support "Natural Language Processing" for Korean text analytics.

Language: R - Size: 45.8 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 10 - Forks: 3

apelullo/twitter_covid_stream_processing_ops

An AWS-based data pipeline to collect, process, store, and monitor Twitter streaming data throughout the COVID-19 pandemic in support of local, regional, and national public health initiatives.

Language: Jupyter Notebook - Size: 117 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

sap218/jabberwocky

NLP toolkit for those nonsensical ontologies

Language: Python - Size: 936 KB - Last synced at: 9 days ago - Pushed at: about 2 months ago - Stars: 16 - Forks: 1

rosette-api/python

Babel Street Analytics Client Library for Python

Language: Python - Size: 1.63 MB - Last synced at: 6 days ago - Pushed at: 2 months ago - Stars: 38 - Forks: 37

biomedicalinformaticsgroup/cadmus

A full-text article retrieval pipeline for biomedical literature.

Language: Python - Size: 271 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 20 - Forks: 2

FinnishCancerRegistry/gleason_extraction_py

Extract Gleason scores from texts.

Language: Python - Size: 65.4 KB - Last synced at: about 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 1

jakelever/pubrunner

A framework for keeping biomedical text mining results up to date

Language: Python - Size: 565 KB - Last synced at: 28 days ago - Pushed at: almost 5 years ago - Stars: 42 - Forks: 6

venkat-0706/Social-Media-Analysis

Clean & analyze social media data with Python. Explore trends, sentiments, & user behavior. Includes data cleaning, visualization, & insights.

Language: Jupyter Notebook - Size: 249 KB - Last synced at: 29 days ago - Pushed at: 9 months ago - Stars: 12 - Forks: 0

TextDatasetCleaner/TextDatasetCleaner

🔬 Cleaning junk out of datasets (normalization, preprocessing)

Language: Python - Size: 72.3 KB - Last synced at: 12 days ago - Pushed at: about 4 years ago - Stars: 40 - Forks: 10

ropensci/tokenizers

Fast, Consistent Tokenization of Natural Language Text

Language: R - Size: 1.24 MB - Last synced at: 9 days ago - Pushed at: about 1 year ago - Stars: 186 - Forks: 25

panis-konstantinos/stress_detection_text_mining

Text Mining project for stress detection from social media articles.

Size: 1.06 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

kernel-loophole/KG-graph

Knowledge graph from unstructured text

Language: Python - Size: 4.27 MB - Last synced at: 22 days ago - Pushed at: 2 months ago - Stars: 8 - Forks: 1

finjahasi/clinical-text-mining_R_SCRIPT

A lightweight R script for text mining and harmonizing medical phenotype data. Cleans, standardizes, and maps diagnoses to ICD-10 codes, with clinical annotations for enhanced data usability.

Language: R - Size: 9.77 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

busebircan/Business-Analytics

A Jupyter Notebook stack for example business analytics cases + a simulation study

Language: Jupyter Notebook - Size: 8.49 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 4 - Forks: 1

jbampton/xml-schema-diff

:point_right: Visual XML Schema differencing tool :point_left: :arrow_right: Static site generator :arrow_left:

Language: HTML - Size: 1.25 MB - Last synced at: 27 days ago - Pushed at: over 7 years ago - Stars: 8 - Forks: 0

pemagrg1/Natural-Language-Processing-NLP-Roadmap

A simple RoadMap to Natural Language Processing(NLP)

Size: 48.8 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 134 - Forks: 17

nikolamilosevic86/TabInOut

Framework for information extraction from tables

Language: Python - Size: 5.21 MB - Last synced at: 30 days ago - Pushed at: about 6 years ago - Stars: 41 - Forks: 10

lfoppiano/SuperMat

Superconductors material dataset

Language: Jupyter Notebook - Size: 20.2 MB - Last synced at: 7 days ago - Pushed at: over 1 year ago - Stars: 26 - Forks: 3

the-abadie/text-mining-synthesis (fork of CederGroupHub/text-mined-synthesis_public)

Fork of the Ceder Group's Text-Mining Synthesis packages

Language: Python - Size: 109 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

Lightbridge-KS/radreportparser

Regex-based text parser for common radiology reports

Language: Jupyter Notebook - Size: 3.47 MB - Last synced at: about 2 months ago - Pushed at: 2 months ago - Stars: 1 - Forks: 0

Nolram567/HesseParlPy

A parser for generating an XML-TEI corpus of the plenary minutes of the 20th legislative period of the Hessian state parliament (Landtag) and for computing a topic model.

Language: Objective-C++ - Size: 78.7 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0

lasigeBioTM/BENT

Biomedical Term Annotator

Language: Python - Size: 6.48 MB - Last synced at: 24 days ago - Pushed at: 3 months ago - Stars: 9 - Forks: 1

bakrianoo/aravec

AraVec is a pre-trained distributed word representation (word embedding) open source project which aims to provide the Arabic NLP research community with free to use and powerful word embedding models.

Language: Jupyter Notebook - Size: 1.17 MB - Last synced at: about 2 months ago - Pushed at: about 4 years ago - Stars: 398 - Forks: 80

laugustyniak/awesome-sentiment-analysis

Repository with everything necessary for sentiment analysis and related areas

Size: 36.1 KB - Last synced at: 4 days ago - Pushed at: over 1 year ago - Stars: 539 - Forks: 110

airbnb/artificial-adversary

🗣️ Tool to generate adversarial text examples and test machine learning models against them

Language: Python - Size: 116 KB - Last synced at: 4 days ago - Pushed at: over 3 years ago - Stars: 402 - Forks: 57

jalajthanaki/NLPython

This repository contains Natural Language Processing code written in Python. All of the code accompanies my book "Python Natural Language Processing".

Language: Jupyter Notebook - Size: 131 MB - Last synced at: about 1 month ago - Pushed at: over 2 years ago - Stars: 322 - Forks: 207

pjhampton/woolly

The Text Mining Elixir

Language: Elixir - Size: 82 KB - Last synced at: 5 days ago - Pushed at: about 4 years ago - Stars: 54 - Forks: 8

copyleftdev/nword

nword is a command-line tool designed to scan directories, detect programming languages, and process source code files. It generates a JSON snapshot of the directory structure, including file metadata, language detection, and condensed content.

Language: Rust - Size: 12.7 KB - Last synced at: 2 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

stdlib-js/nlp-sentencize

Split a string into an array of sentences.

Language: JavaScript - Size: 688 KB - Last synced at: 27 days ago - Pushed at: 3 months ago - Stars: 5 - Forks: 2

lasigeBioTM/K-RET

K-RET: Knowledgeable Biomedical Relation Extraction System

Language: Python - Size: 4.17 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 9 - Forks: 3

Aptivi-Analytics/WordsList

All common and uncommon English words

Language: Shell - Size: 48.7 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 5 - Forks: 0

michabirklbauer/hgb_dse_text_mining_solutions 📦

Solutions for the practical part of the lecture Text Mining

Language: HTML - Size: 38.1 MB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

michabirklbauer/hgb_dse_text_mining 📦

Contents for the practical part of the lecture Text Mining

Language: Jupyter Notebook - Size: 60.1 MB - Last synced at: 22 days ago - Pushed at: 6 months ago - Stars: 0 - Forks: 3

Related Keywords
text-mining (1,932), nlp (446), python (380), machine-learning (336), natural-language-processing (297), text-classification (245), r (233), sentiment-analysis (228), text-analysis (185), data-science (166), topic-modeling (152), text-processing (110), data-mining (107), nlp-machine-learning (92), deep-learning (88), python3 (72), nltk (72), classification (69), data-visualization (63), tf-idf (57), information-retrieval (57), clustering (55), data-analysis (54), text (52), twitter (50), wordcloud (44), rstats (42), visualization (41), webscraping (40), word2vec (39), web-scraping (39), spacy (37), named-entity-recognition (37), lda (36), java (34), keyword-extraction (33), dataset (33), information-extraction (30), naive-bayes-classifier (30), artificial-intelligence (29), jupyter-notebook (29), logistic-regression (28), pandas (28), sentiment-classification (27), latent-dirichlet-allocation (27), twitter-api (26), scikit-learn (25), random-forest (24), ai (22), bag-of-words (22), text-analytics (22), neural-network (22), analysis (21), tensorflow (21), machine-learning-algorithms (21), word-embeddings (21), digital-humanities (21), network-analysis (20), regex (20), r-package (20), neural-networks (20), tokenization (20), gensim (20), data (20), ner (20), bioinformatics (19), tidytext (19), unsupervised-learning (19), news (19), javascript (18), search-engine (18), social-media (18), crawler (18), scraping (18), corpus (18), pubmed (18), sentiment (17), covid-19 (17), lemmatization (17), summarization (17), sklearn (17), statistics (16), corpus-linguistics (16), tweets (16), tokenizer (16), social-network-analysis (16), feature-extraction (16), pytorch (16), numpy (16), image-processing (15), keras (15), matplotlib (15), text-clustering (15), exploratory-data-analysis (15), flask (15), text-extraction (15), naive-bayes (15), embeddings (15), bert (15), shiny (15)