Topic: "nlp-library"
huggingface/transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
Language: Python - Size: 287 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 144,208 - Forks: 28,911

explosion/spaCy
💫 Industrial-strength Natural Language Processing (NLP) in Python
Language: Python - Size: 193 MB - Last synced at: 2 days ago - Pushed at: 3 days ago - Stars: 31,541 - Forks: 4,502

bharathgs/Awesome-pytorch-list
A comprehensive list of PyTorch-related content on GitHub, such as different models, implementations, helper libraries, tutorials, etc.
Size: 867 KB - Last synced at: 7 days ago - Pushed at: over 1 year ago - Stars: 15,824 - Forks: 2,822

thunlp/OpenPrompt
An Open-Source Framework for Prompt-Learning.
Language: Python - Size: 14.4 MB - Last synced at: 1 day ago - Pushed at: 10 months ago - Stars: 4,583 - Forks: 466

fastnlp/fastNLP
fastNLP: A Modularized and Extensible NLP Framework. Currently still in incubation.
Language: Python - Size: 35.1 MB - Last synced at: about 5 hours ago - Pushed at: almost 2 years ago - Stars: 3,134 - Forks: 449

FudanNLP/fnlp
Toolkit for Chinese natural language processing
Language: Java - Size: 2.63 MB - Last synced at: 4 minutes ago - Pushed at: over 1 year ago - Stars: 2,670 - Forks: 723

xavier-zy/Awesome-pytorch-list-CNVersion
Awesome-pytorch-list: translation work in progress...
Language: Jupyter Notebook - Size: 1.16 MB - Last synced at: 1 day ago - Pushed at: almost 4 years ago - Stars: 1,767 - Forks: 402

deepset-ai/FARM 📦
🏡 Fast & easy transfer learning for NLP. Harvesting language models for the industry. Focus on Question Answering.
Language: Python - Size: 6.97 MB - Last synced at: 4 days ago - Pushed at: over 1 year ago - Stars: 1,750 - Forks: 249

chrismattmann/tika-python
Tika-Python is a Python binding to the Apache Tika™ REST services allowing Tika to be called natively in the Python community.
Language: Python - Size: 31.5 MB - Last synced at: 5 days ago - Pushed at: about 1 month ago - Stars: 1,585 - Forks: 240

undertheseanlp/underthesea
Underthesea - Vietnamese NLP Toolkit
Language: Python - Size: 166 MB - Last synced at: 1 day ago - Pushed at: 11 days ago - Stars: 1,530 - Forks: 282

MilaNLProc/contextualized-topic-models
A python package to run contextualized topic modeling. CTMs combine contextualized embeddings (e.g., BERT) with topic models to get coherent topics. Published at EACL and ACL 2021 (Bianchi et al.).
Language: Python - Size: 32 MB - Last synced at: 7 days ago - Pushed at: 3 months ago - Stars: 1,228 - Forks: 152

PyThaiNLP/pythainlp
Thai natural language processing in Python
Language: Python - Size: 65.6 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 1,032 - Forks: 279

thunlp/OpenDelta
A plug-and-play library for parameter-efficient-tuning (Delta Tuning)
Language: Python - Size: 42 MB - Last synced at: 5 days ago - Pushed at: 8 months ago - Stars: 1,027 - Forks: 83

datadreamer-dev/DataDreamer
DataDreamer: Prompt. Generate Synthetic Data. Train & Align Models. 🤖💤
Language: Python - Size: 895 KB - Last synced at: 25 days ago - Pushed at: 3 months ago - Stars: 1,010 - Forks: 53

ashishpatel26/Treasure-of-Transformers
💁 Awesome Treasure of Transformers Models for Natural Language Processing: contains papers, videos, blogs, and official repos, along with Colab notebooks. 🛫☑️
Language: Jupyter Notebook - Size: 370 KB - Last synced at: about 14 hours ago - Pushed at: 10 months ago - Stars: 990 - Forks: 210

atilika/kuromoji
Kuromoji is a self-contained and very easy to use Japanese morphological analyzer designed for search
Language: Java - Size: 5.5 MB - Last synced at: 11 months ago - Pushed at: over 2 years ago - Stars: 936 - Forks: 128

NorskRegnesentral/skweak
skweak: A software toolkit for weak supervision applied to NLP tasks
Language: Python - Size: 28 MB - Last synced at: 1 day ago - Pushed at: 8 months ago - Stars: 922 - Forks: 77

mocobeta/janome
Japanese morphological analysis engine written in pure Python
Language: Python - Size: 403 MB - Last synced at: 7 days ago - Pushed at: 3 months ago - Stars: 873 - Forks: 52

ikawaha/kagome
Self-contained Japanese Morphological Analyzer written in pure Go
Language: Go - Size: 711 MB - Last synced at: about 21 hours ago - Pushed at: 6 days ago - Stars: 869 - Forks: 55

mindspore-lab/mindnlp
Easy-to-use and high-performance NLP and LLM framework based on MindSpore, compatible with models and datasets of 🤗Huggingface.
Language: Jupyter Notebook - Size: 45.8 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 855 - Forks: 257

taishi-i/awesome-japanese-nlp-resources
A curated list of resources dedicated to Python libraries, LLMs, dictionaries, and corpora of NLP for Japanese
Size: 8.18 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 807 - Forks: 30

MIND-Lab/OCTIS
OCTIS: Comparing Topic Models is Simple! A python package to optimize and evaluate topic models (accepted at EACL2021 demo track)
Language: Python - Size: 168 MB - Last synced at: 18 days ago - Pushed at: 10 months ago - Stars: 758 - Forks: 111

WorksApplications/Sudachi
A Japanese Tokenizer for Business
Language: Java - Size: 1.57 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 741 - Forks: 69

Ailln/cn2an
📦 Quickly convert between Chinese numerals and Arabic numerals (latest features: conversion of fractions, dates, temperatures, and more)
Language: Python - Size: 685 KB - Last synced at: about 22 hours ago - Pushed at: 5 months ago - Stars: 726 - Forks: 79

cbaziotis/ekphrasis
Ekphrasis is a text processing tool geared towards text from social networks, such as Twitter or Facebook. Ekphrasis performs tokenization, word normalization, word segmentation (for splitting hashtags), and spell correction, using word statistics from two big corpora (English Wikipedia and Twitter, 330 million English tweets).
Language: Python - Size: 659 KB - Last synced at: 1 day ago - Pushed at: about 1 year ago - Stars: 671 - Forks: 92

pemistahl/lingua
The most accurate natural language detection library for Java and the JVM, suitable for long and short text alike
Language: Kotlin - Size: 424 MB - Last synced at: 12 months ago - Pushed at: about 1 year ago - Stars: 660 - Forks: 60

wyounas/homer
Homer, a text analyser in Python, can help make your text more clear, simple and useful for your readers.
Language: Python - Size: 4.66 MB - Last synced at: 8 days ago - Pushed at: about 1 month ago - Stars: 632 - Forks: 35

medspacy/medspacy
Library for clinical NLP with spaCy.
Language: Jupyter Notebook - Size: 2.87 MB - Last synced at: about 12 hours ago - Pushed at: about 1 month ago - Stars: 572 - Forks: 98

fhamborg/Giveme5W1H
Extraction of the journalistic five W and one H questions (5W1H) from news articles: who did what, when, where, why, and how?
Language: HTML - Size: 208 MB - Last synced at: 7 months ago - Pushed at: over 1 year ago - Stars: 507 - Forks: 88

proycon/pynlpl
PyNLPl, pronounced as 'pineapple', is a Python library for Natural Language Processing. It contains various modules useful for common, and less common, NLP tasks. PyNLPl can be used for basic tasks such as the extraction of n-grams and frequency lists, and to build simple language models. There are also more complex data types and algorithms. Moreover, there are parsers for file formats common in NLP (e.g. FoLiA/Giza/Moses/ARPA/Timbl/CQL). There are also clients to interface with various NLP-specific servers. PyNLPl most notably features a very extensive library for working with FoLiA XML (Format for Linguistic Annotation).
Language: Python - Size: 12.8 MB - Last synced at: 5 days ago - Pushed at: over 1 year ago - Stars: 477 - Forks: 67

linuxscout/pyarabic
pyarabic
Language: Python - Size: 1.23 MB - Last synced at: about 22 hours ago - Pushed at: over 1 year ago - Stars: 455 - Forks: 87

CAMeL-Lab/camel_tools
A suite of Arabic natural language processing tools developed by the CAMeL Lab at New York University Abu Dhabi.
Language: Python - Size: 11.5 MB - Last synced at: 5 days ago - Pushed at: 29 days ago - Stars: 449 - Forks: 75

WorksApplications/SudachiPy 📦
Python version of Sudachi, a Japanese tokenizer.
Language: Python - Size: 669 KB - Last synced at: about 1 month ago - Pushed at: over 2 years ago - Stars: 403 - Forks: 52

taishi-i/nagisa
A Japanese tokenizer based on recurrent neural networks
Language: Python - Size: 39.4 MB - Last synced at: 15 days ago - Pushed at: 11 months ago - Stars: 398 - Forks: 23

ElizaLo/NLP-Natural-Language-Processing
Projects and useful articles / links
Language: Jupyter Notebook - Size: 71.2 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 376 - Forks: 78

IBM/zshot
Zero and Few shot named entity & relationships recognition
Language: Python - Size: 1.48 MB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 367 - Forks: 24

hellohaptik/multi-task-NLP
multi_task_NLP is a utility toolkit enabling NLP developers to easily train and infer a single model for multiple tasks.
Language: Python - Size: 7.46 MB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 358 - Forks: 54

hellohaptik/chatbot_ner
chatbot_ner: Named Entity Recognition for chatbots.
Language: Python - Size: 15.5 MB - Last synced at: 23 days ago - Pushed at: 3 months ago - Stars: 328 - Forks: 133

urduhack/urduhack
An NLP library for the Urdu language. It comes with a lot of batteries-included features to help you process Urdu data in the easiest way possible.
Language: Python - Size: 475 KB - Last synced at: 8 days ago - Pushed at: over 1 year ago - Stars: 296 - Forks: 42

outcastofmusic/quick-nlp
Pytorch NLP library based on FastAI
Language: Python - Size: 56.5 MB - Last synced at: 8 days ago - Pushed at: almost 7 years ago - Stars: 282 - Forks: 50

gandersen101/spaczz
Fuzzy matching and more functionality for spaCy.
Language: Python - Size: 1.4 MB - Last synced at: 9 days ago - Pushed at: 10 months ago - Stars: 256 - Forks: 28

ikegami-yukino/mecab Fork of taku910/mecab 📦
This repository is archived! The maintained MeCab can be found at https://github.com/shogo82148/mecab
Language: C++ - Size: 84.2 MB - Last synced at: about 2 months ago - Pushed at: 7 months ago - Stars: 254 - Forks: 16

BobXWu/TopMost
A Topic Modeling System Toolkit (ACL 2024 Demo)
Language: Jupyter Notebook - Size: 254 MB - Last synced at: about 2 hours ago - Pushed at: about 1 month ago - Stars: 251 - Forks: 26

neomatrix369/nlp_profiler
A simple NLP library that allows profiling datasets with one or more text columns. When given a dataset and a column name containing text data, NLP Profiler will return either high-level insights or low-level/granular statistical information about the text in that column.
Language: Python - Size: 3.54 MB - Last synced at: 9 days ago - Pushed at: about 1 year ago - Stars: 242 - Forks: 37

BLLIP/bllip-parser Fork of dmcc/bllip-parser
BLLIP reranking parser (also known as Charniak-Johnson parser, Charniak parser, Brown reranking parser). See http://pypi.python.org/pypi/bllipparser/ for the Python module.
Language: GAP - Size: 47.8 MB - Last synced at: 3 days ago - Pushed at: over 3 years ago - Stars: 227 - Forks: 53

alexandrainst/danlp 📦
DaNLP is a repository for Natural Language Processing resources for the Danish Language.
Language: Python - Size: 49.4 MB - Last synced at: 18 days ago - Pushed at: 3 months ago - Stars: 205 - Forks: 34

IBM/unitxt
🦄 Unitxt is a Python library for enterprise-grade evaluation of AI performance, offering the world's largest catalog of tools and data for end-to-end AI benchmarking
Language: Python - Size: 95.9 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 193 - Forks: 53

dccuchile/wefe
WEFE: The Word Embeddings Fairness Evaluation Framework. WEFE is a framework that standardizes the bias measurement and mitigation in Word Embeddings models. Please feel welcome to open an issue in case you have any questions or a pull request if you want to contribute to the project!
Language: Python - Size: 41.6 MB - Last synced at: 14 days ago - Pushed at: 11 months ago - Stars: 177 - Forks: 14

TakeLab/spacy-udpipe
spaCy + UDPipe
Language: Python - Size: 104 KB - Last synced at: 7 days ago - Pushed at: about 3 years ago - Stars: 161 - Forks: 10

chewxy/lingo
package lingo provides the data structures and algorithms required for natural language processing
Language: Go - Size: 465 KB - Last synced at: 3 days ago - Pushed at: about 2 years ago - Stars: 155 - Forks: 15

infinitylogesh/mutate
A library to synthesize text datasets using Large Language Models (LLM)
Language: Python - Size: 163 KB - Last synced at: about 2 months ago - Pushed at: over 2 years ago - Stars: 151 - Forks: 8

emres/turkish-deasciifier
Turkish deasciifier in Python based on Deniz Yüret's turkish-mode for Emacs
Language: Python - Size: 211 KB - Last synced at: 12 days ago - Pushed at: over 4 years ago - Stars: 147 - Forks: 23

nullnull/simstring
A Python implementation of the SimString, a simple and efficient algorithm for approximate string matching.
Language: Python - Size: 1.2 MB - Last synced at: 1 day ago - Pushed at: over 1 year ago - Stars: 123 - Forks: 15

taishi-i/toiro
A comparison tool of Japanese tokenizers
Language: Python - Size: 1.04 MB - Last synced at: 10 months ago - Pushed at: 11 months ago - Stars: 115 - Forks: 8

aymara/lima
The Libre Multilingual Analyzer, a Natural Language Processing (NLP) C++ toolkit.
Language: C++ - Size: 276 MB - Last synced at: about 1 month ago - Pushed at: 11 months ago - Stars: 112 - Forks: 20

plkmo/NLP_Toolkit
Library of state-of-the-art models (PyTorch) for NLP tasks
Language: Python - Size: 4.3 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 108 - Forks: 27

tsterbak/promptmage
Simplifies the process of creating and managing LLM workflows.
Language: Python - Size: 4.08 MB - Last synced at: 26 days ago - Pushed at: 7 months ago - Stars: 100 - Forks: 8

lfcipriani/punkt-segmenter
Ruby port of the NLTK Punkt sentence segmentation algorithm
Language: Ruby - Size: 148 KB - Last synced at: 3 days ago - Pushed at: almost 7 years ago - Stars: 92 - Forks: 10

uma-pi1/minie
An open information extraction system that provides compact extractions
Language: Java - Size: 4.48 MB - Last synced at: 2 days ago - Pushed at: about 3 years ago - Stars: 91 - Forks: 27

doches/rwordnet
A pure Ruby interface to the WordNet database
Language: Ruby - Size: 8.06 MB - Last synced at: 3 days ago - Pushed at: over 5 years ago - Stars: 90 - Forks: 28

litus-ai/classy
classy is a simple-to-use library for building high-performance Machine Learning models in NLP.
Language: Python - Size: 2.99 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 86 - Forks: 3

mikeroyal/NLP-Guide
Natural Language Processing (NLP). Covering topics such as Tokenization, Part Of Speech tagging (POS), Machine translation, Named Entity Recognition (NER), Classification, and Sentiment analysis.
Language: Python - Size: 315 KB - Last synced at: 8 days ago - Pushed at: over 1 year ago - Stars: 86 - Forks: 15

legacyai/tf-transformers
State-of-the-art, faster Transformer with TensorFlow 2.0 (NLP, Computer Vision, Audio).
Language: Jupyter Notebook - Size: 18.1 MB - Last synced at: 16 days ago - Pushed at: about 2 years ago - Stars: 85 - Forks: 2

mikahama/uralicNLP
An NLP library for Uralic languages such as Finnish, Skolt Sami, Moksha and so on. Also supporting some non-Uralic languages such as Spanish, French, Arabic, Swedish, Norwegian, Russian and English. LLMs, FSTs and More!
Language: Python - Size: 432 KB - Last synced at: about 23 hours ago - Pushed at: 6 months ago - Stars: 80 - Forks: 7

FareedKhan-dev/basiclingua-LLM-Based-NLP
LLM Based NLP Library.
Language: Python - Size: 4.31 MB - Last synced at: 10 months ago - Pushed at: about 1 year ago - Stars: 77 - Forks: 9

Ars-Linguistica/mlconjug3
A Python library to conjugate verbs in French, English, Spanish, Italian, Portuguese and Romanian (more soon) using Machine Learning techniques.
Language: Python - Size: 397 MB - Last synced at: about 1 month ago - Pushed at: 8 months ago - Stars: 75 - Forks: 11

Koziev/GrammarEngine
Grammatical Dictionary of the Russian Language (+ English, Japanese, etc.)
Language: C++ - Size: 401 MB - Last synced at: 21 days ago - Pushed at: almost 5 years ago - Stars: 75 - Forks: 20

VietHoang1512/khmer-nltk
Khmer language processing toolkit
Language: Python - Size: 10 MB - Last synced at: 7 days ago - Pushed at: over 1 year ago - Stars: 73 - Forks: 18

SekouD/mlconjug
A Python library to conjugate verbs in French, English, Spanish, Italian, Portuguese and Romanian (more soon) using Machine Learning techniques.
Language: Python - Size: 52.2 MB - Last synced at: 3 days ago - Pushed at: 5 months ago - Stars: 72 - Forks: 8

old-wang-95/easy-bert
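easy-bert is a Chinese NLP tool that provides calling and parameter-tuning methods for many BERT variants, so you can get started quickly; its clear design and code comments also make it well suited for learning.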
Language: Python - Size: 9.05 MB - Last synced at: 9 months ago - Pushed at: over 2 years ago - Stars: 68 - Forks: 12

OpenPecha/Botok
🏷 བོད་ཏོག [pʰøtɔk̚] Tibetan word tokenizer in Python
Language: Python - Size: 30.8 MB - Last synced at: 7 days ago - Pushed at: 2 months ago - Stars: 67 - Forks: 16

mikahama/natas
Python 3 library for processing historical English
Language: Python - Size: 95.7 MB - Last synced at: about 1 month ago - Pushed at: 9 months ago - Stars: 67 - Forks: 11

SkBlaz/rakun2
RaKUn 2.0 - A fast keyword detection algorithm
Language: Python - Size: 2.62 MB - Last synced at: 26 days ago - Pushed at: 26 days ago - Stars: 66 - Forks: 9

StarCC0/starcc-py
Simplified-Traditional Chinese conversion. Python implementation of StarCC, the next-generation Simplified-Traditional Chinese conversion framework
Language: Python - Size: 21.5 KB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 58 - Forks: 3

ART-Group-it/KERMIT
🐸 KERMIT - A lightweight library to encode and interpret Universal Syntactic Embeddings
Language: JavaScript - Size: 14.6 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 56 - Forks: 8

mbejda/Node-OpenNLP
Apache OpenNLP wrapper for Nodejs
Language: JavaScript - Size: 19.7 MB - Last synced at: 20 days ago - Pushed at: about 6 years ago - Stars: 56 - Forks: 17

StatguyUser/TextFeatureSelection
Python library for feature selection for text features. It provides a filter method, a genetic algorithm, and TextFeatureSelectionEnsemble for improving text classification models. Helps improve your machine learning models.
Language: Python - Size: 1.11 MB - Last synced at: 22 days ago - Pushed at: over 1 year ago - Stars: 52 - Forks: 5

wayfair-incubator/extra-model
Code to run the ExtRA algorithm for unsupervised topic/aspect extraction on English texts.
Language: HTML - Size: 80.6 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 51 - Forks: 11

yakivyusin/SimpleNetNlp
.NET NLP library
Language: C# - Size: 28.2 MB - Last synced at: 5 days ago - Pushed at: over 1 year ago - Stars: 48 - Forks: 10

UniversalDataTool/react-nlp-annotate
Interface for making NLP annotations.
Language: JavaScript - Size: 11.8 MB - Last synced at: 6 days ago - Pushed at: about 2 years ago - Stars: 47 - Forks: 21

golsun/NLP-tools
Useful python NLP tools (evaluation, GUI interface, tokenization)
Language: Python - Size: 207 KB - Last synced at: over 1 year ago - Pushed at: about 5 years ago - Stars: 44 - Forks: 9

OpenJarbas/simple_NER
Simple rule-based named entity recognition
Language: Python - Size: 2.1 MB - Last synced at: 20 days ago - Pushed at: about 3 years ago - Stars: 43 - Forks: 9

KylinC/PySeg
Python library for Chinese word segmentation and part-of-speech tagging
Language: Python - Size: 12.2 MB - Last synced at: over 1 year ago - Pushed at: over 5 years ago - Stars: 39 - Forks: 1

AnFreTh/STREAM
A versatile Python package engineered for seamless topic modeling, topic evaluation, and topic visualization. Ideal for text analysis, natural language processing (NLP), and research in the social sciences, STREAM simplifies the extraction, interpretation, and visualization of topics from large, complex datasets.
Language: Python - Size: 228 MB - Last synced at: 1 day ago - Pushed at: 3 months ago - Stars: 38 - Forks: 9

jfreddypuentes/spanlp
spanlp: NLP applied to Spanish vulgarity. A fast, robust Python library to check for profanity or offensive language in Spanish strings. It contains all the rude words of Spanish-speaking countries.
Language: Python - Size: 1.88 MB - Last synced at: about 6 hours ago - Pushed at: 11 months ago - Stars: 38 - Forks: 8

linonetwo/template-based-generator-template
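A template for template-based text generators: templates beget templates, phoenixes beget phoenixes, and a mouse's son knows how to dig holes. To start locally: npm i && npm run dev:demo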
Language: TypeScript - Size: 3.18 MB - Last synced at: 1 day ago - Pushed at: over 1 year ago - Stars: 37 - Forks: 22

lingualytics/py-lingualytics
A text analytics library with support for codemixed data
Language: Python - Size: 15.3 MB - Last synced at: 26 days ago - Pushed at: over 4 years ago - Stars: 37 - Forks: 4

Koziev/rulemma
Lemmatizer for Russian-language texts
Language: Python - Size: 251 MB - Last synced at: almost 2 years ago - Pushed at: almost 5 years ago - Stars: 37 - Forks: 6

syzer/sentiment-analyser
ML that can extract German and English sentiment
Language: JavaScript - Size: 101 KB - Last synced at: 8 days ago - Pushed at: over 4 years ago - Stars: 36 - Forks: 12

TrainingByPackt/Deep-Learning-for-Natural-Language-Processing
Solve your natural language processing problems with smart deep neural networks
Language: Jupyter Notebook - Size: 44 MB - Last synced at: about 1 month ago - Pushed at: almost 6 years ago - Stars: 36 - Forks: 53

ispras/atr4s
Toolkit with state-of-the-art Automatic Terms Recognition methods in Scala
Language: Scala - Size: 180 KB - Last synced at: about 1 month ago - Pushed at: almost 7 years ago - Stars: 35 - Forks: 5

andreihar/taibun
Taiwanese Hokkien Transliterator and Tokeniser
Language: Python - Size: 4.57 MB - Last synced at: 9 days ago - Pushed at: 9 months ago - Stars: 34 - Forks: 2

FareedKhan-dev/Most-powerful-NLP-library
Gemini, as capable as GPT-4, provides a free API with limited access. I tested it with the help of prompt engineering and found that it can solve almost any NLP task you want to tackle.
Language: Jupyter Notebook - Size: 107 KB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 34 - Forks: 9

DataScienceUIBK/HintEval
HintEval💡: A Comprehensive Framework for Hint Generation and Evaluation for Questions
Language: Python - Size: 2.94 MB - Last synced at: about 9 hours ago - Pushed at: about 1 month ago - Stars: 33 - Forks: 2

sentencizer/sentencizer
A sentence splitting (sentence boundary disambiguation) library for Go. It is rule-based and works out-of-the-box.
Language: Go - Size: 1.83 MB - Last synced at: 6 days ago - Pushed at: about 1 month ago - Stars: 33 - Forks: 6

pncnmnp/LuaNLP
Natural Language Processing Library for Lua
Language: Lua - Size: 3.16 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 32 - Forks: 0

wannaphong/LaoNLP
Lao language NLP
Language: Python - Size: 8.15 MB - Last synced at: about 1 month ago - Pushed at: 4 months ago - Stars: 31 - Forks: 5

HiveGuard-AI/taxonomy4good
Taxonomy4Good: a sustainability lexicon that provides the freedom to create custom taxonomies in addition to listed ESG and Sustainability Standards taxonomies.
Language: Python - Size: 3.48 MB - Last synced at: 3 days ago - Pushed at: over 1 year ago - Stars: 30 - Forks: 5

proycon/python-ucto
This is a Python binding to the tokenizer Ucto. Tokenisation is one of the first steps in almost any Natural Language Processing task, yet it is not always as trivial a task as it appears to be. This binding makes the power of the ucto tokeniser available to Python. Ucto itself is a regular-expression-based, extensible, and advanced tokeniser written in C++ (http://ilk.uvt.nl/ucto).
Language: Cython - Size: 87.9 KB - Last synced at: 5 days ago - Pushed at: 5 months ago - Stars: 29 - Forks: 5

jpmanson/llm_templates
Instruction/chat prompts creation library for text generation LLMs. It supports local and Hugging Face models.
Language: Python - Size: 302 KB - Last synced at: 4 days ago - Pushed at: 9 months ago - Stars: 29 - Forks: 1
