An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: sentence-splitting

sentencizer/sentencizer

A sentence splitting (sentence boundary disambiguation) library for Go. It is rule-based and works out-of-the-box.

Language: Go - Size: 1.83 MB - Last synced at: about 3 hours ago - Pushed at: about 1 month ago - Stars: 33 - Forks: 6

adobe/NLP-Cube

Natural Language Processing Pipeline - Sentence Splitting, Tokenization, Lemmatization, Part-of-speech Tagging and Dependency Parsing

Language: HTML - Size: 11.1 MB - Last synced at: 19 days ago - Pushed at: 6 months ago - Stars: 558 - Forks: 94

mediacloud/sentence-splitter

Text-to-sentence splitter using a heuristic algorithm by Philipp Koehn and Josh Schroeder.

Language: Python - Size: 45.9 KB - Last synced at: 29 days ago - Pushed at: over 2 years ago - Stars: 244 - Forks: 30

Prismadic/magnet

the small distributed language model toolkit; fine-tune state-of-the-art LLMs anywhere, rapidly

Language: Python - Size: 11.8 MB - Last synced at: 2 days ago - Pushed at: 7 months ago - Stars: 31 - Forks: 3

erre-quadro/spikex

SpikeX - SpaCy Pipes for Knowledge Extraction

Language: Python - Size: 3.43 MB - Last synced at: 11 days ago - Pushed at: almost 4 years ago - Stars: 398 - Forks: 28

zaemyung/sentsplit

A flexible sentence segmentation library using a CRF model and regex rules

Language: Python - Size: 2.48 MB - Last synced at: 2 days ago - Pushed at: about 1 year ago - Stars: 29 - Forks: 7

vngrs-ai/vnlp

State-of-the-art, lightweight NLP tools for the Turkish language. Developed by VNGRS.

Language: Python - Size: 392 MB - Last synced at: 3 months ago - Pushed at: 6 months ago - Stars: 259 - Forks: 17

jparkerweb/splitter-vs-splitter

🪓 simple app to pit two sentence splitters against one another to understand their differences

Language: JavaScript - Size: 271 KB - Last synced at: about 1 month ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

KorAP/Datok

High-Performance Finite State Tokenizer

Language: Go - Size: 124 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 4 - Forks: 0

astariul/Sentencize.jl

Smallish library for sentence splitting in Julia

Language: Julia - Size: 256 KB - Last synced at: about 16 hours ago - Pushed at: 6 months ago - Stars: 4 - Forks: 3

ZJaume/splitters

A CLI for Rust SRX sentence segmentation rules as a Python package.

Language: Rust - Size: 68.4 KB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

mbanon/benchmarks

Several benchmarks on sentence splitting and language identification

Language: Mathematica - Size: 35.7 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 2 - Forks: 0

kimryan/Lingua-EN-Sentence

split text into sentences (a Perl module)

Language: Perl - Size: 29.3 KB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 1 - Forks: 3

M4t1ss/chunker

A sentence chunker PHP class + visualizer for Berkeley Parser parse trees

Language: PHP - Size: 23.5 MB - Last synced at: about 2 years ago - Pushed at: almost 8 years ago - Stars: 2 - Forks: 0
