An open API service providing repository metadata for many open source software ecosystems.

Topic: "segmenter"

jesperorb/intl-explorer

Intl Explorer is an interactive tool for experimenting with the ECMAScript Internationalization API.

Language: Svelte - Size: 1.68 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 136 - Forks: 3

jordicenzano/transport-stream-online-segmenter

Web-based HLS segmenter for transport streams.

Language: JavaScript - Size: 517 KB - Last synced at: 17 days ago - Pushed at: over 2 years ago - Stars: 37 - Forks: 6

Hemisphere-Project/HLS-segmenter

Linux HLS Server including uploader, segmenter, chunks dealer and media manager

Language: Python - Size: 27.3 KB - Last synced at: about 1 year ago - Pushed at: over 7 years ago - Stars: 37 - Forks: 13

datquocnguyen/BioPosDep

Tokenization, sentence segmentation, POS tagging and dependency parsing for biomedical texts (BMC Bioinformatics 2019)

Language: Python - Size: 65 MB - Last synced at: about 2 years ago - Pushed at: about 6 years ago - Stars: 32 - Forks: 5

chatopera/chop

Chinese Tokenizer module for Python

Language: Python - Size: 9.32 MB - Last synced at: 3 days ago - Pushed at: almost 7 years ago - Stars: 15 - Forks: 7

jonschlinkert/intl-segmenter

A high-performance wrapper around Intl.Segmenter for efficient text segmentation. This class resolves memory handling issues seen with large strings and "maximum call stack exceeded" exceptions that occur when strings exceed 40-50k characters. Enhances performance by 50-500x. Only ~70 loc (with comments) and no dependencies.

Language: JavaScript - Size: 43.9 KB - Last synced at: 14 days ago - Pushed at: 3 months ago - Stars: 10 - Forks: 1

zeeshansayyed/ArabicSOS

Segmenter and Orthography Standardizer (SOS) for Classical Arabic (CA)

Language: Python - Size: 286 KB - Last synced at: about 2 years ago - Pushed at: almost 5 years ago - Stars: 7 - Forks: 2

kuhumcst/rtfreader

Text segmenter and tokeniser for Danish, English and other languages. Reads an RTF or flat text file and outputs the text, one line per sentence & optionally tokenized.

Language: C++ - Size: 375 KB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 6 - Forks: 4

jordicenzano/fmp4-stream-online-segmenter

This tool creates a DASH manifest from any fmp4 stream file. It is available in online and CLI versions, and file and stream versions are also provided.

Language: JavaScript - Size: 998 KB - Last synced at: 17 days ago - Pushed at: over 2 years ago - Stars: 5 - Forks: 1

Heresta/OCR17plus

Data for layout analysis and HTR.

Language: Python - Size: 4.85 GB - Last synced at: 17 days ago - Pushed at: over 3 years ago - Stars: 4 - Forks: 3

xamgore/segtok

A rule-based sentence segmenter (splitter) and a word tokenizer using orthographic features

Language: Rust - Size: 101 KB - Last synced at: 16 days ago - Pushed at: 2 months ago - Stars: 2 - Forks: 0

IMAGO-Catalogues-Jjanes/cataloguesSegmentationOCR

Dataset and models for catalogs' layout analysis and HTR

Language: Python - Size: 966 MB - Last synced at: 16 days ago - Pushed at: over 3 years ago - Stars: 2 - Forks: 1

caopengfei/jieba.NET (fork of anderscui/jieba.NET)

.NET Standard version of jieba.NET, the .NET port of the jieba Chinese word segmentation library

Language: C# - Size: 11.7 MB - Last synced at: 17 days ago - Pushed at: over 5 years ago - Stars: 2 - Forks: 0

rimo02/Deep-Learning-Notebooks

Repo of all deep learning models

Language: Jupyter Notebook - Size: 38.7 MB - Last synced at: about 2 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 0

lz1998/cgo_icu4x_segmenter

Segment strings by word and sentences. See icu4x for more details.

Language: Rust - Size: 10.7 KB - Last synced at: about 1 month ago - Pushed at: about 2 years ago - Stars: 1 - Forks: 0

vidraj/segmentace

Tools for segmenting natural language words into morphs / morphemes.

Language: Python - Size: 3.09 MB - Last synced at: almost 2 years ago - Pushed at: about 6 years ago - Stars: 1 - Forks: 2

LauJames/Tagging

Sequence tagging (word segmenter), POS tagging and NER with TensorFlow

Language: Python - Size: 5.96 MB - Last synced at: about 1 year ago - Pushed at: about 7 years ago - Stars: 1 - Forks: 1

MINED-MATKIT/Segmenter

Segmentation module. Deals with the transition from raw imaging data to microstructure representations.

Language: Matlab - Size: 7 MB - Last synced at: almost 2 years ago - Pushed at: over 7 years ago - Stars: 0 - Forks: 2