An open API service providing repository metadata for many open source software ecosystems.

Topic: "nlp-library"

huggingface/transformers

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.

Language: Python - Size: 287 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 144,208 - Forks: 28,911

explosion/spaCy

💫 Industrial-strength Natural Language Processing (NLP) in Python

Language: Python - Size: 193 MB - Last synced at: 2 days ago - Pushed at: 3 days ago - Stars: 31,541 - Forks: 4,502

bharathgs/Awesome-pytorch-list

A comprehensive list of PyTorch-related content on GitHub, such as different models, implementations, helper libraries, tutorials, etc.

Size: 867 KB - Last synced at: 7 days ago - Pushed at: over 1 year ago - Stars: 15,824 - Forks: 2,822

thunlp/OpenPrompt

An Open-Source Framework for Prompt-Learning.

Language: Python - Size: 14.4 MB - Last synced at: 1 day ago - Pushed at: 10 months ago - Stars: 4,583 - Forks: 466

fastnlp/fastNLP

fastNLP: A Modularized and Extensible NLP Framework. Currently still in incubation.

Language: Python - Size: 35.1 MB - Last synced at: about 5 hours ago - Pushed at: almost 2 years ago - Stars: 3,134 - Forks: 449

FudanNLP/fnlp

Toolkit for Chinese natural language processing

Language: Java - Size: 2.63 MB - Last synced at: 4 minutes ago - Pushed at: over 1 year ago - Stars: 2,670 - Forks: 723

xavier-zy/Awesome-pytorch-list-CNVersion

Awesome-pytorch-list: translation work in progress...

Language: Jupyter Notebook - Size: 1.16 MB - Last synced at: 1 day ago - Pushed at: almost 4 years ago - Stars: 1,767 - Forks: 402

deepset-ai/FARM 📦

🏡 Fast & easy transfer learning for NLP. Harvesting language models for the industry. Focus on Question Answering.

Language: Python - Size: 6.97 MB - Last synced at: 4 days ago - Pushed at: over 1 year ago - Stars: 1,750 - Forks: 249

chrismattmann/tika-python

Tika-Python is a Python binding to the Apache Tika™ REST services allowing Tika to be called natively in the Python community.

Language: Python - Size: 31.5 MB - Last synced at: 5 days ago - Pushed at: about 1 month ago - Stars: 1,585 - Forks: 240

undertheseanlp/underthesea

Underthesea - Vietnamese NLP Toolkit

Language: Python - Size: 166 MB - Last synced at: 1 day ago - Pushed at: 11 days ago - Stars: 1,530 - Forks: 282

MilaNLProc/contextualized-topic-models

A python package to run contextualized topic modeling. CTMs combine contextualized embeddings (e.g., BERT) with topic models to get coherent topics. Published at EACL and ACL 2021 (Bianchi et al.).

Language: Python - Size: 32 MB - Last synced at: 7 days ago - Pushed at: 3 months ago - Stars: 1,228 - Forks: 152

PyThaiNLP/pythainlp

Thai natural language processing in Python

Language: Python - Size: 65.6 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 1,032 - Forks: 279

thunlp/OpenDelta

A plug-and-play library for parameter-efficient-tuning (Delta Tuning)

Language: Python - Size: 42 MB - Last synced at: 5 days ago - Pushed at: 8 months ago - Stars: 1,027 - Forks: 83

datadreamer-dev/DataDreamer

DataDreamer: Prompt. Generate Synthetic Data. Train & Align Models. 🤖💤

Language: Python - Size: 895 KB - Last synced at: 25 days ago - Pushed at: 3 months ago - Stars: 1,010 - Forks: 53

ashishpatel26/Treasure-of-Transformers

💁 Awesome Treasure of Transformers Models for Natural Language processing contains papers, videos, blogs, official repo along with colab Notebooks. 🛫☑️

Language: Jupyter Notebook - Size: 370 KB - Last synced at: about 14 hours ago - Pushed at: 10 months ago - Stars: 990 - Forks: 210

atilika/kuromoji

Kuromoji is a self-contained and very easy to use Japanese morphological analyzer designed for search

Language: Java - Size: 5.5 MB - Last synced at: 11 months ago - Pushed at: over 2 years ago - Stars: 936 - Forks: 128

NorskRegnesentral/skweak

skweak: A software toolkit for weak supervision applied to NLP tasks

Language: Python - Size: 28 MB - Last synced at: 1 day ago - Pushed at: 8 months ago - Stars: 922 - Forks: 77

mocobeta/janome

Japanese morphological analysis engine written in pure Python

Language: Python - Size: 403 MB - Last synced at: 7 days ago - Pushed at: 3 months ago - Stars: 873 - Forks: 52

ikawaha/kagome

Self-contained Japanese Morphological Analyzer written in pure Go

Language: Go - Size: 711 MB - Last synced at: about 21 hours ago - Pushed at: 6 days ago - Stars: 869 - Forks: 55

mindspore-lab/mindnlp

Easy-to-use and high-performance NLP and LLM framework based on MindSpore, compatible with models and datasets of 🤗Huggingface.

Language: Jupyter Notebook - Size: 45.8 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 855 - Forks: 257

taishi-i/awesome-japanese-nlp-resources

A curated list of resources dedicated to Python libraries, LLMs, dictionaries, and corpora of NLP for Japanese

Size: 8.18 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 807 - Forks: 30

MIND-Lab/OCTIS

OCTIS: Comparing Topic Models is Simple! A python package to optimize and evaluate topic models (accepted at EACL2021 demo track)

Language: Python - Size: 168 MB - Last synced at: 18 days ago - Pushed at: 10 months ago - Stars: 758 - Forks: 111

WorksApplications/Sudachi

A Japanese Tokenizer for Business

Language: Java - Size: 1.57 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 741 - Forks: 69

Ailln/cn2an

📦 Quickly convert between Chinese numerals and Arabic numerals (latest features: conversion of fractions, dates, temperatures, and more)

Language: Python - Size: 685 KB - Last synced at: about 22 hours ago - Pushed at: 5 months ago - Stars: 726 - Forks: 79

cbaziotis/ekphrasis

Ekphrasis is a text processing tool, geared towards text from social networks, such as Twitter or Facebook. Ekphrasis performs tokenization, word normalization, word segmentation (for splitting hashtags) and spell correction, using word statistics from 2 big corpora (English Wikipedia, and 330 million English tweets from Twitter).

Language: Python - Size: 659 KB - Last synced at: 1 day ago - Pushed at: about 1 year ago - Stars: 671 - Forks: 92

pemistahl/lingua

The most accurate natural language detection library for Java and the JVM, suitable for long and short text alike

Language: Kotlin - Size: 424 MB - Last synced at: 12 months ago - Pushed at: about 1 year ago - Stars: 660 - Forks: 60

wyounas/homer

Homer, a text analyser in Python, can help make your text more clear, simple and useful for your readers.

Language: Python - Size: 4.66 MB - Last synced at: 8 days ago - Pushed at: about 1 month ago - Stars: 632 - Forks: 35

medspacy/medspacy

Library for clinical NLP with spaCy.

Language: Jupyter Notebook - Size: 2.87 MB - Last synced at: about 12 hours ago - Pushed at: about 1 month ago - Stars: 572 - Forks: 98

fhamborg/Giveme5W1H

Extraction of the journalistic five W and one H questions (5W1H) from news articles: who did what, when, where, why, and how?

Language: HTML - Size: 208 MB - Last synced at: 7 months ago - Pushed at: over 1 year ago - Stars: 507 - Forks: 88

proycon/pynlpl

PyNLPl, pronounced as 'pineapple', is a Python library for Natural Language Processing. It contains various modules useful for common, and less common, NLP tasks. PyNLPl can be used for basic tasks such as the extraction of n-grams and frequency lists, and to build simple language models. There are also more complex data types and algorithms. Moreover, there are parsers for file formats common in NLP (e.g. FoLiA/Giza/Moses/ARPA/Timbl/CQL). There are also clients to interface with various NLP-specific servers. PyNLPl most notably features a very extensive library for working with FoLiA XML (Format for Linguistic Annotation).

Language: Python - Size: 12.8 MB - Last synced at: 5 days ago - Pushed at: over 1 year ago - Stars: 477 - Forks: 67

linuxscout/pyarabic

pyarabic

Language: Python - Size: 1.23 MB - Last synced at: about 22 hours ago - Pushed at: over 1 year ago - Stars: 455 - Forks: 87

CAMeL-Lab/camel_tools

A suite of Arabic natural language processing tools developed by the CAMeL Lab at New York University Abu Dhabi.

Language: Python - Size: 11.5 MB - Last synced at: 5 days ago - Pushed at: 29 days ago - Stars: 449 - Forks: 75

WorksApplications/SudachiPy 📦

Python version of Sudachi, a Japanese tokenizer.

Language: Python - Size: 669 KB - Last synced at: about 1 month ago - Pushed at: over 2 years ago - Stars: 403 - Forks: 52

taishi-i/nagisa

A Japanese tokenizer based on recurrent neural networks

Language: Python - Size: 39.4 MB - Last synced at: 15 days ago - Pushed at: 11 months ago - Stars: 398 - Forks: 23

ElizaLo/NLP-Natural-Language-Processing

Projects and useful articles / links

Language: Jupyter Notebook - Size: 71.2 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 376 - Forks: 78

IBM/zshot

Zero and Few shot named entity & relationships recognition

Language: Python - Size: 1.48 MB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 367 - Forks: 24

hellohaptik/multi-task-NLP

multi_task_NLP is a utility toolkit enabling NLP developers to easily train and infer a single model for multiple tasks.

Language: Python - Size: 7.46 MB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 358 - Forks: 54

hellohaptik/chatbot_ner

chatbot_ner: Named Entity Recognition for chatbots.

Language: Python - Size: 15.5 MB - Last synced at: 23 days ago - Pushed at: 3 months ago - Stars: 328 - Forks: 133

urduhack/urduhack

An NLP library for the Urdu language. It comes with a lot of batteries-included features to help you process Urdu data in the easiest way possible.

Language: Python - Size: 475 KB - Last synced at: 8 days ago - Pushed at: over 1 year ago - Stars: 296 - Forks: 42

outcastofmusic/quick-nlp

PyTorch NLP library based on FastAI

Language: Python - Size: 56.5 MB - Last synced at: 8 days ago - Pushed at: almost 7 years ago - Stars: 282 - Forks: 50

gandersen101/spaczz

Fuzzy matching and more functionality for spaCy.

Language: Python - Size: 1.4 MB - Last synced at: 9 days ago - Pushed at: 10 months ago - Stars: 256 - Forks: 28

ikegami-yukino/mecab Fork of taku910/mecab 📦

This repository is archived! The maintained MeCab can be found at https://github.com/shogo82148/mecab

Language: C++ - Size: 84.2 MB - Last synced at: about 2 months ago - Pushed at: 7 months ago - Stars: 254 - Forks: 16

BobXWu/TopMost

A Topic Modeling System Toolkit (ACL 2024 Demo)

Language: Jupyter Notebook - Size: 254 MB - Last synced at: about 2 hours ago - Pushed at: about 1 month ago - Stars: 251 - Forks: 26

neomatrix369/nlp_profiler

A simple NLP library that allows profiling datasets with one or more text columns. When given a dataset and a column name containing text data, NLP Profiler will return either high-level insights or low-level/granular statistical information about the text in that column.

Language: Python - Size: 3.54 MB - Last synced at: 9 days ago - Pushed at: about 1 year ago - Stars: 242 - Forks: 37

BLLIP/bllip-parser Fork of dmcc/bllip-parser

BLLIP reranking parser (also known as Charniak-Johnson parser, Charniak parser, Brown reranking parser) See http://pypi.python.org/pypi/bllipparser/ for Python module.

Language: GAP - Size: 47.8 MB - Last synced at: 3 days ago - Pushed at: over 3 years ago - Stars: 227 - Forks: 53

alexandrainst/danlp 📦

DaNLP is a repository for Natural Language Processing resources for the Danish Language.

Language: Python - Size: 49.4 MB - Last synced at: 18 days ago - Pushed at: 3 months ago - Stars: 205 - Forks: 34

IBM/unitxt

🦄 Unitxt is a Python library for enterprise-grade evaluation of AI performance, offering the world's largest catalog of tools and data for end-to-end AI benchmarking

Language: Python - Size: 95.9 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 193 - Forks: 53

dccuchile/wefe

WEFE: The Word Embeddings Fairness Evaluation Framework. WEFE is a framework that standardizes the bias measurement and mitigation in Word Embeddings models. Please feel welcome to open an issue in case you have any questions or a pull request if you want to contribute to the project!

Language: Python - Size: 41.6 MB - Last synced at: 14 days ago - Pushed at: 11 months ago - Stars: 177 - Forks: 14

TakeLab/spacy-udpipe

spaCy + UDPipe

Language: Python - Size: 104 KB - Last synced at: 7 days ago - Pushed at: about 3 years ago - Stars: 161 - Forks: 10

chewxy/lingo

package lingo provides the data structures and algorithms required for natural language processing

Language: Go - Size: 465 KB - Last synced at: 3 days ago - Pushed at: about 2 years ago - Stars: 155 - Forks: 15

infinitylogesh/mutate

A library to synthesize text datasets using Large Language Models (LLM)

Language: Python - Size: 163 KB - Last synced at: about 2 months ago - Pushed at: over 2 years ago - Stars: 151 - Forks: 8

emres/turkish-deasciifier

Turkish deasciifier in Python based on Deniz Yüret's turkish-mode for Emacs

Language: Python - Size: 211 KB - Last synced at: 12 days ago - Pushed at: over 4 years ago - Stars: 147 - Forks: 23

nullnull/simstring

A Python implementation of SimString, a simple and efficient algorithm for approximate string matching.

Language: Python - Size: 1.2 MB - Last synced at: 1 day ago - Pushed at: over 1 year ago - Stars: 123 - Forks: 15

taishi-i/toiro

A comparison tool of Japanese tokenizers

Language: Python - Size: 1.04 MB - Last synced at: 10 months ago - Pushed at: 11 months ago - Stars: 115 - Forks: 8

aymara/lima

The Libre Multilingual Analyzer, a Natural Language Processing (NLP) C++ toolkit.

Language: C++ - Size: 276 MB - Last synced at: about 1 month ago - Pushed at: 11 months ago - Stars: 112 - Forks: 20

plkmo/NLP_Toolkit

Library of state-of-the-art models (PyTorch) for NLP tasks

Language: Python - Size: 4.3 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 108 - Forks: 27

tsterbak/promptmage

Simplifies the process of creating and managing LLM workflows.

Language: Python - Size: 4.08 MB - Last synced at: 26 days ago - Pushed at: 7 months ago - Stars: 100 - Forks: 8

lfcipriani/punkt-segmenter

Ruby port of the NLTK Punkt sentence segmentation algorithm

Language: Ruby - Size: 148 KB - Last synced at: 3 days ago - Pushed at: almost 7 years ago - Stars: 92 - Forks: 10

uma-pi1/minie

An open information extraction system that provides compact extractions

Language: Java - Size: 4.48 MB - Last synced at: 2 days ago - Pushed at: about 3 years ago - Stars: 91 - Forks: 27

doches/rwordnet

A pure Ruby interface to the WordNet database

Language: Ruby - Size: 8.06 MB - Last synced at: 3 days ago - Pushed at: over 5 years ago - Stars: 90 - Forks: 28

litus-ai/classy

classy is a simple-to-use library for building high-performance Machine Learning models in NLP.

Language: Python - Size: 2.99 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 86 - Forks: 3

mikeroyal/NLP-Guide

Natural Language Processing (NLP). Covering topics such as Tokenization, Part Of Speech tagging (POS), Machine translation, Named Entity Recognition (NER), Classification, and Sentiment analysis.

Language: Python - Size: 315 KB - Last synced at: 8 days ago - Pushed at: over 1 year ago - Stars: 86 - Forks: 15

legacyai/tf-transformers

State-of-the-art faster Transformer with TensorFlow 2.0 (NLP, Computer Vision, Audio).

Language: Jupyter Notebook - Size: 18.1 MB - Last synced at: 16 days ago - Pushed at: about 2 years ago - Stars: 85 - Forks: 2

mikahama/uralicNLP

An NLP library for Uralic languages such as Finnish, Skolt Sami, Moksha and so on. Also supporting some non-Uralic languages such as Spanish, French, Arabic, Swedish, Norwegian, Russian and English. LLMs, FSTs and More!

Language: Python - Size: 432 KB - Last synced at: about 23 hours ago - Pushed at: 6 months ago - Stars: 80 - Forks: 7

FareedKhan-dev/basiclingua-LLM-Based-NLP

LLM Based NLP Library.

Language: Python - Size: 4.31 MB - Last synced at: 10 months ago - Pushed at: about 1 year ago - Stars: 77 - Forks: 9

Ars-Linguistica/mlconjug3

A Python library to conjugate verbs in French, English, Spanish, Italian, Portuguese and Romanian (more soon) using Machine Learning techniques.

Language: Python - Size: 397 MB - Last synced at: about 1 month ago - Pushed at: 8 months ago - Stars: 75 - Forks: 11

Koziev/GrammarEngine

Grammatical Dictionary of the Russian Language (+ English, Japanese, etc.)

Language: C++ - Size: 401 MB - Last synced at: 21 days ago - Pushed at: almost 5 years ago - Stars: 75 - Forks: 20

VietHoang1512/khmer-nltk

Khmer language processing toolkit

Language: Python - Size: 10 MB - Last synced at: 7 days ago - Pushed at: over 1 year ago - Stars: 73 - Forks: 18

SekouD/mlconjug

A Python library to conjugate verbs in French, English, Spanish, Italian, Portuguese and Romanian (more soon) using Machine Learning techniques.

Language: Python - Size: 52.2 MB - Last synced at: 3 days ago - Pushed at: 5 months ago - Stars: 72 - Forks: 8

old-wang-95/easy-bert

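easy-bert is a Chinese NLP tool that provides calling and hyperparameter-tuning methods for many BERT variants, so you can get started quickly; its clear design and code comments also make it well suited for learning.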

Language: Python - Size: 9.05 MB - Last synced at: 9 months ago - Pushed at: over 2 years ago - Stars: 68 - Forks: 12

OpenPecha/Botok

🏷 བོད་ཏོག [pʰøtɔk̚] Tibetan word tokenizer in Python

Language: Python - Size: 30.8 MB - Last synced at: 7 days ago - Pushed at: 2 months ago - Stars: 67 - Forks: 16

mikahama/natas

Python 3 library for processing historical English

Language: Python - Size: 95.7 MB - Last synced at: about 1 month ago - Pushed at: 9 months ago - Stars: 67 - Forks: 11

SkBlaz/rakun2

RaKUn 2.0 - A fast keyword detection algorithm

Language: Python - Size: 2.62 MB - Last synced at: 26 days ago - Pushed at: 26 days ago - Stars: 66 - Forks: 9

StarCC0/starcc-py

Simplified-Traditional Chinese conversion: Python implementation of StarCC, the next-generation Simplified-Traditional Chinese conversion framework

Language: Python - Size: 21.5 KB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 58 - Forks: 3

ART-Group-it/KERMIT

🐸 KERMIT - A lightweight library to encode and interpret Universal Syntactic Embeddings

Language: JavaScript - Size: 14.6 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 56 - Forks: 8

mbejda/Node-OpenNLP

Apache OpenNLP wrapper for Nodejs

Language: JavaScript - Size: 19.7 MB - Last synced at: 20 days ago - Pushed at: about 6 years ago - Stars: 56 - Forks: 17

StatguyUser/TextFeatureSelection

Python library for feature selection for text features. It has a filter method, a genetic algorithm, and TextFeatureSelectionEnsemble for improving text classification models. Helps improve your machine learning models.

Language: Python - Size: 1.11 MB - Last synced at: 22 days ago - Pushed at: over 1 year ago - Stars: 52 - Forks: 5

wayfair-incubator/extra-model

Code to run the ExtRA algorithm for unsupervised topic/aspect extraction on English texts.

Language: HTML - Size: 80.6 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 51 - Forks: 11

yakivyusin/SimpleNetNlp

.NET NLP library

Language: C# - Size: 28.2 MB - Last synced at: 5 days ago - Pushed at: over 1 year ago - Stars: 48 - Forks: 10

UniversalDataTool/react-nlp-annotate

Interface for making NLP annotations.

Language: JavaScript - Size: 11.8 MB - Last synced at: 6 days ago - Pushed at: about 2 years ago - Stars: 47 - Forks: 21

golsun/NLP-tools

Useful python NLP tools (evaluation, GUI interface, tokenization)

Language: Python - Size: 207 KB - Last synced at: over 1 year ago - Pushed at: about 5 years ago - Stars: 44 - Forks: 9

OpenJarbas/simple_NER

Simple rule-based named entity recognition

Language: Python - Size: 2.1 MB - Last synced at: 20 days ago - Pushed at: about 3 years ago - Stars: 43 - Forks: 9

KylinC/PySeg

Python library for Chinese word segmentation and part-of-speech tagging

Language: Python - Size: 12.2 MB - Last synced at: over 1 year ago - Pushed at: over 5 years ago - Stars: 39 - Forks: 1

AnFreTh/STREAM

A versatile Python package engineered for seamless topic modeling, topic evaluation, and topic visualization. Ideal for text analysis, natural language processing (NLP), and research in the social sciences, STREAM simplifies the extraction, interpretation, and visualization of topics from large, complex datasets.

Language: Python - Size: 228 MB - Last synced at: 1 day ago - Pushed at: 3 months ago - Stars: 38 - Forks: 9

jfreddypuentes/spanlp

spanlp: NLP applied to Spanish vulgarity. A fast, robust Python library to check for profanity or offensive language in Spanish strings. It contains all the rude words of Spanish-speaking countries.

Language: Python - Size: 1.88 MB - Last synced at: about 6 hours ago - Pushed at: 11 months ago - Stars: 38 - Forks: 8

linonetwo/template-based-generator-template

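A template for template-based text generators: templates beget templates, phoenixes beget phoenixes, and a mouse's offspring knows how to dig holes. To start locally: npm i && npm run dev:demo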

Language: TypeScript - Size: 3.18 MB - Last synced at: 1 day ago - Pushed at: over 1 year ago - Stars: 37 - Forks: 22

lingualytics/py-lingualytics

A text analytics library with support for codemixed data

Language: Python - Size: 15.3 MB - Last synced at: 26 days ago - Pushed at: over 4 years ago - Stars: 37 - Forks: 4

Koziev/rulemma

Lemmatizer for Russian-language texts

Language: Python - Size: 251 MB - Last synced at: almost 2 years ago - Pushed at: almost 5 years ago - Stars: 37 - Forks: 6

syzer/sentiment-analyser

ML that can extract German and English sentiment

Language: JavaScript - Size: 101 KB - Last synced at: 8 days ago - Pushed at: over 4 years ago - Stars: 36 - Forks: 12

TrainingByPackt/Deep-Learning-for-Natural-Language-Processing

Solve your natural language processing problems with smart deep neural networks

Language: Jupyter Notebook - Size: 44 MB - Last synced at: about 1 month ago - Pushed at: almost 6 years ago - Stars: 36 - Forks: 53

ispras/atr4s

Toolkit with state-of-the-art Automatic Terms Recognition methods in Scala

Language: Scala - Size: 180 KB - Last synced at: about 1 month ago - Pushed at: almost 7 years ago - Stars: 35 - Forks: 5

andreihar/taibun

Taiwanese Hokkien Transliterator and Tokeniser

Language: Python - Size: 4.57 MB - Last synced at: 9 days ago - Pushed at: 9 months ago - Stars: 34 - Forks: 2

FareedKhan-dev/Most-powerful-NLP-library

Gemini, as capable as GPT-4, provides a free API with limited access. I tested it with the help of prompt engineering and found that it can solve almost any NLP task you want to tackle.

Language: Jupyter Notebook - Size: 107 KB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 34 - Forks: 9

DataScienceUIBK/HintEval

HintEval💡: A Comprehensive Framework for Hint Generation and Evaluation for Questions

Language: Python - Size: 2.94 MB - Last synced at: about 9 hours ago - Pushed at: about 1 month ago - Stars: 33 - Forks: 2

sentencizer/sentencizer

A sentence splitting (sentence boundary disambiguation) library for Go. It is rule-based and works out-of-the-box.

Language: Go - Size: 1.83 MB - Last synced at: 6 days ago - Pushed at: about 1 month ago - Stars: 33 - Forks: 6

pncnmnp/LuaNLP

Natural Language Processing Library for Lua

Language: Lua - Size: 3.16 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 32 - Forks: 0

wannaphong/LaoNLP

Lao language NLP

Language: Python - Size: 8.15 MB - Last synced at: about 1 month ago - Pushed at: 4 months ago - Stars: 31 - Forks: 5

HiveGuard-AI/taxonomy4good

Taxonomy4Good: a sustainability lexicon that provides the freedom to create custom taxonomies in addition to listed ESG and Sustainability Standards taxonomies.

Language: Python - Size: 3.48 MB - Last synced at: 3 days ago - Pushed at: over 1 year ago - Stars: 30 - Forks: 5

proycon/python-ucto

This is a Python binding to the tokenizer Ucto. Tokenisation is one of the first steps in almost any Natural Language Processing task, yet it is not always as trivial a task as it appears to be. This binding makes the power of the ucto tokeniser available to Python. Ucto itself is a regular-expression-based, extensible, and advanced tokeniser written in C++ (http://ilk.uvt.nl/ucto).

Language: Cython - Size: 87.9 KB - Last synced at: 5 days ago - Pushed at: 5 months ago - Stars: 29 - Forks: 5

jpmanson/llm_templates

Instruction/chat prompts creation library for text generation LLMs. It supports local and Hugging Face models.

Language: Python - Size: 302 KB - Last synced at: 4 days ago - Pushed at: 9 months ago - Stars: 29 - Forks: 1
