Topic: "segmenter"
jesperorb/intl-explorer
Intl Explorer is an interactive tool for experimenting with the ECMAScript Internationalization API.
Language: Svelte - Size: 1.68 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 136 - Forks: 3

jordicenzano/transport-stream-online-segmenter
Transport stream web based HLS segmenter.
Language: JavaScript - Size: 517 KB - Last synced at: 17 days ago - Pushed at: over 2 years ago - Stars: 37 - Forks: 6

Hemisphere-Project/HLS-segmenter
Linux HLS Server including uploader, segmenter, chunks dealer and media manager
Language: Python - Size: 27.3 KB - Last synced at: about 1 year ago - Pushed at: over 7 years ago - Stars: 37 - Forks: 13

datquocnguyen/BioPosDep
Tokenization, sentence segmentation, POS tagging and dependency parsing for biomedical texts (BMC Bioinformatics 2019)
Language: Python - Size: 65 MB - Last synced at: about 2 years ago - Pushed at: about 6 years ago - Stars: 32 - Forks: 5

chatopera/chop
Chinese Tokenizer module for Python
Language: Python - Size: 9.32 MB - Last synced at: 3 days ago - Pushed at: almost 7 years ago - Stars: 15 - Forks: 7

jonschlinkert/intl-segmenter
A high-performance wrapper around Intl.Segmenter for efficient text segmentation. This class resolves memory handling issues seen with large strings and "maximum call stack exceeded" exceptions that occur when strings exceed 40-50k characters. Enhances performance by 50-500x. Only ~70 loc (with comments) and no dependencies.
Language: JavaScript - Size: 43.9 KB - Last synced at: 14 days ago - Pushed at: 3 months ago - Stars: 10 - Forks: 1

zeeshansayyed/ArabicSOS
Segmenter and Orthography Standardizer (SOS) for Classical Arabic (CA)
Language: Python - Size: 286 KB - Last synced at: about 2 years ago - Pushed at: almost 5 years ago - Stars: 7 - Forks: 2

kuhumcst/rtfreader
Text segmenter and tokeniser for Danish, English and other languages. Reads an RTF or flat text file and outputs the text, one line per sentence & optionally tokenized.
Language: C++ - Size: 375 KB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 6 - Forks: 4

jordicenzano/fmp4-stream-online-segmenter
A tool that creates a DASH manifest from any fmp4 stream file. Online and CLI versions are available, each with file and stream variants.
Language: JavaScript - Size: 998 KB - Last synced at: 17 days ago - Pushed at: over 2 years ago - Stars: 5 - Forks: 1

Heresta/OCR17plus
Data for layout analysis and HTR.
Language: Python - Size: 4.85 GB - Last synced at: 17 days ago - Pushed at: over 3 years ago - Stars: 4 - Forks: 3

xamgore/segtok
A rule-based sentence segmenter (splitter) and a word tokenizer using orthographic features
Language: Rust - Size: 101 KB - Last synced at: 16 days ago - Pushed at: 2 months ago - Stars: 2 - Forks: 0

IMAGO-Catalogues-Jjanes/cataloguesSegmentationOCR
Dataset and models for catalogue layout analysis and HTR
Language: Python - Size: 966 MB - Last synced at: 16 days ago - Pushed at: over 3 years ago - Stars: 2 - Forks: 1

caopengfei/jieba.NET (fork of anderscui/jieba.NET)
.NET Standard version of jieba.NET, the jieba Chinese word segmentation library
Language: C# - Size: 11.7 MB - Last synced at: 17 days ago - Pushed at: over 5 years ago - Stars: 2 - Forks: 0

rimo02/Deep-Learning-Notebooks
Repo of all deep learning models
Language: Jupyter Notebook - Size: 38.7 MB - Last synced at: about 2 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 0

lz1998/cgo_icu4x_segmenter
Segment strings by word and sentences. See icu4x for more details.
Language: Rust - Size: 10.7 KB - Last synced at: about 1 month ago - Pushed at: about 2 years ago - Stars: 1 - Forks: 0

vidraj/segmentace
Tools for segmenting natural language words into morphs / morphemes.
Language: Python - Size: 3.09 MB - Last synced at: almost 2 years ago - Pushed at: about 6 years ago - Stars: 1 - Forks: 2

LauJames/Tagging
Sequence tagging (word segmenter), POS tagging and NER with TensorFlow
Language: Python - Size: 5.96 MB - Last synced at: about 1 year ago - Pushed at: about 7 years ago - Stars: 1 - Forks: 1

MINED-MATKIT/Segmenter
Segmentation module. Deals with the transition from raw imaging data to microstructure representations.
Language: Matlab - Size: 7 MB - Last synced at: almost 2 years ago - Pushed at: over 7 years ago - Stars: 0 - Forks: 2
