Topic: "word-segmentation"
google/sentencepiece
Unsupervised text tokenizer for Neural Network-based text generation.
Language: C++ - Size: 23.9 MB - Last synced at: about 17 hours ago - Pushed at: about 1 month ago - Stars: 10,838 - Forks: 1,220

baidu/lac
Baidu NLP: word segmentation, part-of-speech tagging, named entity recognition, word importance
Language: C++ - Size: 63.6 MB - Last synced at: 25 days ago - Pushed at: almost 4 years ago - Stars: 3,921 - Forks: 596

wolfgarbe/SymSpell
SymSpell: 1 million times faster spelling correction & fuzzy search through Symmetric Delete spelling correction algorithm
Language: C# - Size: 12 MB - Last synced at: 17 days ago - Pushed at: about 1 month ago - Stars: 3,230 - Forks: 303

PyThaiNLP/pythainlp
Thai natural language processing in Python
Language: Python - Size: 65.5 MB - Last synced at: about 21 hours ago - Pushed at: 6 days ago - Stars: 1,029 - Forks: 277

VKCOM/YouTokenToMe 📦
Unsupervised text tokenizer focused on computational efficiency
Language: C++ - Size: 192 KB - Last synced at: 2 days ago - Pushed at: about 1 year ago - Stars: 966 - Forks: 103

mammothb/symspellpy
Python port of SymSpell: 1 million times faster spelling correction & fuzzy search through Symmetric Delete spelling correction algorithm
Language: Python - Size: 5.76 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 824 - Forks: 124

ckiplab/ckip-transformers
CKIP Transformers
Language: Python - Size: 232 KB - Last synced at: 28 days ago - Pushed at: about 2 years ago - Stars: 723 - Forks: 76

cbaziotis/ekphrasis
Ekphrasis is a text processing tool geared towards text from social networks, such as Twitter or Facebook. Ekphrasis performs tokenization, word normalization, word segmentation (for splitting hashtags) and spell correction, using word statistics from 2 big corpora (English Wikipedia and Twitter, 330 million English tweets).
Language: Python - Size: 659 KB - Last synced at: about 1 month ago - Pushed at: about 1 year ago - Stars: 667 - Forks: 92

bab2min/Kiwi
Kiwi (an intelligent Korean morphological analyzer)
Language: C++ - Size: 396 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 576 - Forks: 51

vncorenlp/VnCoreNLP
A Vietnamese natural language processing toolkit (NAACL 2018)
Language: Java - Size: 232 MB - Last synced at: 9 months ago - Pushed at: about 2 years ago - Stars: 570 - Forks: 141

JayYip/m3tl
BERT for Multitask Learning
Language: Jupyter Notebook - Size: 29.1 MB - Last synced at: 24 days ago - Pushed at: about 2 years ago - Stars: 547 - Forks: 125

modelscope/AdaSeq
AdaSeq: An All-in-One Library for Developing State-of-the-Art Sequence Understanding Models
Language: Python - Size: 5.03 MB - Last synced at: 24 days ago - Pushed at: over 1 year ago - Stars: 434 - Forks: 41

taishi-i/nagisa
A Japanese tokenizer based on recurrent neural networks
Language: Python - Size: 39.4 MB - Last synced at: 6 days ago - Pushed at: 11 months ago - Stars: 398 - Forks: 23

ku-nlp/jumanpp
Juman++ (a Morphological Analyzer Toolkit)
Language: C++ - Size: 3.78 MB - Last synced at: 23 days ago - Pushed at: over 1 year ago - Stars: 387 - Forks: 44

yongzhuo/Pytorch-NLU
Pytorch-NLU, a Chinese text classification and sequence labeling toolkit. Supports multi-class and multi-label classification of long and short Chinese texts, and sequence labeling tasks such as Chinese named entity recognition, part-of-speech tagging, word segmentation, and extractive text summarization.
Language: Python - Size: 379 KB - Last synced at: 10 days ago - Pushed at: 10 months ago - Stars: 345 - Forks: 50

jacksonllee/pycantonese
Cantonese Linguistics and NLP
Language: Python - Size: 15.1 MB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 335 - Forks: 38

bab2min/kiwipiepy
Python API for Kiwi
Language: Python - Size: 163 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 312 - Forks: 32

ikegami-yukino/mecab Fork of taku910/mecab 📦
This repository is archived! The maintained MeCab can be found at https://github.com/shogo82148/mecab
Language: C++ - Size: 84.2 MB - Last synced at: about 2 months ago - Pushed at: 7 months ago - Stars: 254 - Forks: 16

monpa-team/monpa
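MONPA (罔拍) is a multi-task model that provides Traditional Chinese word segmentation, part-of-speech tagging, and named entity recognition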
Language: Python - Size: 8.25 MB - Last synced at: 15 days ago - Pushed at: 3 months ago - Stars: 246 - Forks: 25

jidasheng/bi-lstm-crf
A PyTorch implementation of the BI-LSTM-CRF model.
Language: Python - Size: 12.7 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 246 - Forks: 48

fastcws/fastcws
A lightweight, high-performance Chinese word segmentation project
Language: C++ - Size: 524 KB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 190 - Forks: 8

ckiplab/ckipnlp
CKIP CoreNLP Toolkits
Language: Python - Size: 573 KB - Last synced at: about 1 month ago - Pushed at: about 2 years ago - Stars: 119 - Forks: 15

taishi-i/toiro
A comparison tool of Japanese tokenizers
Language: Python - Size: 1.04 MB - Last synced at: 10 months ago - Pushed at: 11 months ago - Stars: 115 - Forks: 8

peterolson/hanzi-tools
Converts from Chinese characters to pinyin, between simplified and traditional, and does word segmentation.
Language: JavaScript - Size: 2.51 MB - Last synced at: 29 days ago - Pushed at: almost 2 years ago - Stars: 111 - Forks: 19

Ailln/nlp-roadmap
🗺️ A learning roadmap for natural language processing
Size: 135 KB - Last synced at: about 2 months ago - Pushed at: about 2 years ago - Stars: 109 - Forks: 12

fudannlp16/CWS_Dict
Source codes for paper "Neural Networks Incorporating Dictionaries for Chinese Word Segmentation", AAAI 2018
Language: Python - Size: 39.3 MB - Last synced at: 13 days ago - Pushed at: over 7 years ago - Stars: 90 - Forks: 32

jcyk/CWS
Source code for an ACL2016 paper of Chinese word segmentation
Language: Python - Size: 44.9 MB - Last synced at: over 1 year ago - Pushed at: over 6 years ago - Stars: 80 - Forks: 26

ruanchaves/hashformers
Hashformers is a framework for hashtag segmentation with Transformers and Large Language Models (LLMs).
Language: Python - Size: 23.6 MB - Last synced at: 27 days ago - Pushed at: 9 months ago - Stars: 70 - Forks: 5

datquocnguyen/RDRsegmenter
A Fast and Accurate Vietnamese Word Segmenter (LREC 2018)
Language: Java - Size: 420 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 69 - Forks: 9

wolfgarbe/WordSegmentationTM
Fast Word Segmentation with Triangular Matrix
Language: C# - Size: 1.22 MB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 67 - Forks: 12

MighTguY/customized-symspell
Java port of SymSpell: 1 million times faster through Symmetric Delete spelling correction algorithm
Language: Java - Size: 8.6 MB - Last synced at: about 1 month ago - Pushed at: over 4 years ago - Stars: 67 - Forks: 18

phongnt570/UETsegmenter
A toolkit for Vietnamese word segmentation
Language: Java - Size: 31.5 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 62 - Forks: 12

dnanhkhoa/python-vncorenlp
A Python wrapper for VnCoreNLP using a bidirectional communication channel.
Language: Python - Size: 40 KB - Last synced at: 22 days ago - Pushed at: over 6 years ago - Stars: 56 - Forks: 18

ye-kyaw-thu/sylbreak
Syllable segmentation tool for Myanmar language (Burmese) by Ye.
Language: HTML - Size: 2.97 MB - Last synced at: 9 months ago - Pushed at: over 1 year ago - Stars: 55 - Forks: 19

giganticode/codeprep
A toolkit for pre-processing large source code corpora
Language: Python - Size: 1.56 MB - Last synced at: 26 days ago - Pushed at: over 2 years ago - Stars: 47 - Forks: 11

undertheseanlp/word_tokenize 📦
Vietnamese Word Tokenize
Language: Python - Size: 28.5 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 45 - Forks: 24

viig99/SymSpellCppPy
Fast SymSpell written in C++ and exposed to Python via pybind11
Language: C++ - Size: 8.31 MB - Last synced at: 27 days ago - Pushed at: 2 months ago - Stars: 42 - Forks: 7

KrakenAI/SynThai
Thai Word Segmentation and Part-of-Speech Tagging with Deep Learning
Language: Python - Size: 35.2 KB - Last synced at: 13 days ago - Pushed at: almost 8 years ago - Stars: 40 - Forks: 16

dalinvip/pytorch_Joint-Word-Segmentation-and-POS-Tagging
Paper: A Simple and Effective Neural Model for Joint Word Segmentation and POS Tagging
Language: Python - Size: 293 KB - Last synced at: 14 days ago - Pushed at: about 6 years ago - Stars: 35 - Forks: 11

wchan757/Cantonese_Word_Segmentation
Dictionary for Cantonese word segmentation
Size: 826 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 33 - Forks: 5

stevenay/myan-word-breaker
Myanmar Word Segmentation Tool
Language: Python - Size: 859 KB - Last synced at: 9 months ago - Pushed at: over 6 years ago - Stars: 29 - Forks: 9

levyfan/sentencepiece-jni
Java JNI wrapper for SentencePiece: unsupervised text tokenizer for Neural Network-based text generation.
Language: C++ - Size: 240 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 28 - Forks: 10

eskriett/spell
Spelling correction and string segmentation written in Go
Language: Go - Size: 50.8 KB - Last synced at: 26 days ago - Pushed at: 9 months ago - Stars: 27 - Forks: 5

chengchingwen/BytePairEncoding.jl
Julia implementation of Byte Pair Encoding for NLP
Language: Julia - Size: 2.28 MB - Last synced at: 24 days ago - Pushed at: 11 months ago - Stars: 27 - Forks: 3

JayYip/cws-tensorflow
A TensorFlow-based Chinese word segmentation model
Language: Python - Size: 2.47 MB - Last synced at: 7 days ago - Pushed at: over 6 years ago - Stars: 26 - Forks: 3

rust-han/han-segment
A Chinese word segmentation system based on hidden Markov models and forward maximum matching
Language: Rust - Size: 1.97 MB - Last synced at: about 2 years ago - Pushed at: over 6 years ago - Stars: 26 - Forks: 3

hellonlp/hellonlp
NLP tools: word segmentation, sentence segmentation, new-word discovery
Language: Python - Size: 43.9 MB - Last synced at: 17 days ago - Pushed at: about 1 year ago - Stars: 25 - Forks: 8

bnosac/sentencepiece
R package for Byte Pair Encoding / Unigram modelling based on Sentencepiece
Language: C++ - Size: 4.56 MB - Last synced at: 6 days ago - Pushed at: over 2 years ago - Stars: 25 - Forks: 6

qiaofei32/dnn-lstm-word-segment
Chinese Word Segmentation Based on Deep Learning and LSTM Neural Networks
Language: Python - Size: 12.7 KB - Last synced at: over 1 year ago - Pushed at: over 8 years ago - Stars: 23 - Forks: 15

crackcell/gonlpir
Golang wrapper for NLPIR/ICTCLAS2015.
Language: Go - Size: 79.8 MB - Last synced at: 11 months ago - Pushed at: over 8 years ago - Stars: 23 - Forks: 6

cvikasreddy/skt
Sanskrit compound segmentation using seq2seq model
Language: Python - Size: 24.6 MB - Last synced at: about 2 years ago - Pushed at: over 6 years ago - Stars: 22 - Forks: 11

ikegami-yukino/rakutenma-python
Rakuten MA (Python version)
Language: Python - Size: 24.1 MB - Last synced at: 27 days ago - Pushed at: almost 8 years ago - Stars: 22 - Forks: 1

ankane/youtokentome-ruby 📦
High performance unsupervised text tokenization for Ruby
Language: Ruby - Size: 31.3 KB - Last synced at: 4 months ago - Pushed at: over 1 year ago - Stars: 21 - Forks: 1

yuanhao-chen-nyoeghau/shanghainese-tts
Shanghainese TTS
Language: Jupyter Notebook - Size: 1.98 GB - Last synced at: about 2 months ago - Pushed at: almost 2 years ago - Stars: 21 - Forks: 5

apdullahyayik/TrTokenizer
🧩 A simple sentence tokenizer.
Language: Python - Size: 480 KB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 20 - Forks: 1

wolfgarbe/WordSegmentationDP
Word Segmentation with Dynamic Programming
Language: C# - Size: 1.21 MB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 20 - Forks: 5

Systemcluster/kitoken
Fast and versatile tokenizer for language models, compatible with SentencePiece, Tokenizers, Tiktoken and more. Supports BPE, Unigram and WordPiece tokenization in JavaScript, Python and Rust.
Language: Rust - Size: 27.3 MB - Last synced at: 19 days ago - Pushed at: about 2 months ago - Stars: 19 - Forks: 0

harshavkumar/word_segmentation
Word segmentation for handwritten text recognition
Language: Python - Size: 1.99 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 19 - Forks: 5

messense/cjieba-py
Python cffi binding to CppJieba
Language: Python - Size: 4.06 MB - Last synced at: 21 days ago - Pushed at: over 4 years ago - Stars: 15 - Forks: 0

salsowelim/tawseem
NLP crowdsourcing platform for word-level annotations
Language: Go - Size: 716 KB - Last synced at: about 2 years ago - Pushed at: almost 6 years ago - Stars: 13 - Forks: 1

hrishikeshrt/vaiyyakarana
Vaiyyākaraṇaḥ is a telegram bot that offers various tools for a Sanskrit learner including stem (प्रातिपदिकम्) finder, root (धातुः) finder, declension (सुबन्ताः) generator, conjugation (तिङन्ताः) generator, and compound word (सन्धिसमासौ) splitter.
Language: Python - Size: 9.63 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 12 - Forks: 1

crusnic-corp/BN-DRISHTI
Line and Word Segmentation for Bangla Handwritten Text Recognition
Language: Jupyter Notebook - Size: 169 MB - Last synced at: 11 months ago - Pushed at: over 1 year ago - Stars: 12 - Forks: 2

mathsyouth/awesome-word-segmentation
A curated list of resources dedicated to word segmentation
Size: 9.77 KB - Last synced at: 7 days ago - Pushed at: over 6 years ago - Stars: 12 - Forks: 1

jason2506/esapp
An unsupervised Chinese word segmentation tool.
Language: C++ - Size: 254 KB - Last synced at: about 2 years ago - Pushed at: almost 8 years ago - Stars: 12 - Forks: 2

Socret360/joint-khmer-word-segmentation-and-pos-tagging
A Keras implementation of a deep learning network to simultaneously perform Word Segmentation and Part-of-Speech (POS) Tagging introduced by Bouy et al. in the paper Joint Khmer Word Segmentation and Part-of-Speech Tagging Using Deep Learning.
Language: Python - Size: 10.3 MB - Last synced at: 11 months ago - Pushed at: about 3 years ago - Stars: 11 - Forks: 1

hankcs/iparser
Yet another dependency parser, integrated with tokenizer, tagger and visualization tool.
Language: Python - Size: 69.3 KB - Last synced at: 25 days ago - Pushed at: about 7 years ago - Stars: 11 - Forks: 2

dogterbox/thai-word-segmentation
Thai word segmentation using deep learning
Language: Jupyter Notebook - Size: 19.4 MB - Last synced at: almost 2 years ago - Pushed at: almost 6 years ago - Stars: 10 - Forks: 1

Waino/morfessor-emprune
Morfessor EM+Prune
Language: Python - Size: 523 KB - Last synced at: about 2 years ago - Pushed at: almost 5 years ago - Stars: 9 - Forks: 2

NoerNova/ShanNLP
ShanNLP: an experimental project inspired by PyThaiNLP
Language: Python - Size: 630 KB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 8 - Forks: 1

ThuraAung1601/mmCRFseg
mmCRFseg: Word Segmentation for Myanmar Language using Conditional Random Fields
Language: Jupyter Notebook - Size: 611 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 8 - Forks: 1

ckiplab/ckip-classic
CKIP Classic Word Segmentation and Sentence Parsing Tools
Language: Python - Size: 356 KB - Last synced at: 11 days ago - Pushed at: about 2 years ago - Stars: 8 - Forks: 1

wannaphong/NokCut
Thai Word Segmentation using TCC + Bidirectional RNNs
Language: Python - Size: 5.56 MB - Last synced at: 13 days ago - Pushed at: over 6 years ago - Stars: 8 - Forks: 1

yaoguangluo/ChromosomeDNA
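"DNA Base Catalysis and Peptide Computing": in evolutionary computation, the PDE metabolic optimization approach, in which software function files are encoded with DNA semantic base indices, is an effective evolutionary method.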
Language: Java - Size: 676 MB - Last synced at: 26 days ago - Pushed at: 26 days ago - Stars: 7 - Forks: 2

fann1993814/HanLPerceptron
Native Python HanLP Perceptron Model: HanLPerceptron, for Chinese word segmentation, part-of-speech tagging, and named entity recognition
Language: Python - Size: 875 KB - Last synced at: 2 months ago - Pushed at: over 3 years ago - Stars: 7 - Forks: 2

akhvorov/vgram
Feature extraction from sequential data
Language: C++ - Size: 545 KB - Last synced at: about 18 hours ago - Pushed at: almost 6 years ago - Stars: 7 - Forks: 0

dongjinleekr/beanpiece
A Java binding to Google SentencePiece
Language: C++ - Size: 235 KB - Last synced at: almost 2 years ago - Pushed at: almost 7 years ago - Stars: 6 - Forks: 0

Jyutt/jieba-hs
Haskell implementation of the Jieba Chinese word segmentation algorithm
Language: Haskell - Size: 969 KB - Last synced at: almost 2 years ago - Pushed at: almost 4 years ago - Stars: 5 - Forks: 0

khanhtran2000/OCR-Dev
A small computer vision project in the making. Partners: Minh Quan Huynh and Duc Minh Hoang.
Language: Jupyter Notebook - Size: 21 MB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 5 - Forks: 0

pku-nlp-forfun/CWS_POS_NER
Chinese word segmentation, Part-of-speech tagging and Medical named entity recognition From scratch.
Language: Jupyter Notebook - Size: 2.83 MB - Last synced at: about 2 years ago - Pushed at: almost 6 years ago - Stars: 5 - Forks: 1

Ailln/simple-jieba
✂️ A simple version of jieba word segmentation implemented in 100 lines
Language: Python - Size: 1.95 MB - Last synced at: 20 days ago - Pushed at: almost 3 years ago - Stars: 4 - Forks: 1

naetherm/NLP
Some NLP projects I've worked on to deepen my experience in the NLP research field.
Language: Python - Size: 45.2 MB - Last synced at: about 1 month ago - Pushed at: over 3 years ago - Stars: 4 - Forks: 0

Sara-HY/Mini_Search
A Mini Search Engine.
Language: C++ - Size: 40.6 MB - Last synced at: about 2 years ago - Pushed at: over 6 years ago - Stars: 4 - Forks: 0

jacksonllee/wordseg
Word segmentation models
Language: Python - Size: 28.3 KB - Last synced at: 7 months ago - Pushed at: about 2 years ago - Stars: 3 - Forks: 1

fastcws/tagged-wiki2019zh
The 2019 Chinese Wikipedia corpus labeled with the 4-tag scheme, annotated using hanlp
Language: Python - Size: 1.95 KB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 3 - Forks: 0

ljdyer/Space-Punct-Cap-Restoration
A portal to GitHub repositories associated with the paper "Comparison of Token- and Character-Level Approaches to Restoration of Spaces, Punctuation, and Capitalization in Various Languages"
Language: SCSS - Size: 30.3 KB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 3 - Forks: 0

emanuelegiona/NLP2019_HW1
Chinese word segmentation, replicating "State-of-the-art Chinese Word Segmentation with Bi-LSTMs", Ji Ma, Kuzman Ganchev and David Weiss, EMNLP 2018
Language: Python - Size: 183 KB - Last synced at: about 2 years ago - Pushed at: almost 6 years ago - Stars: 3 - Forks: 1

Cater5009/Chinese-word-segmentation
Chinese word segmentation implemented with MM, RMM, BM, and CRF
Language: Roff - Size: 61.9 MB - Last synced at: over 1 year ago - Pushed at: almost 7 years ago - Stars: 3 - Forks: 3

Nguyendat-bit/VieTokenizer
Vietnamese Tokenizer package based on deep learning methods
Language: Python - Size: 13.7 KB - Last synced at: 11 days ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 1

ashishpoudel995/NLP-for-Nepali-Language
The project is all about Natural Language Processing for the Nepali Language. "Text Summarization" and "Word Segmentation" are implemented in this project.
Language: Jupyter Notebook - Size: 446 KB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 2 - Forks: 2

jp-myk/lm-decoder
Language Model Decoder is a transducer from a sentence to a word/reading sequence.
Language: C++ - Size: 763 KB - Last synced at: over 1 year ago - Pushed at: about 4 years ago - Stars: 2 - Forks: 3

khanhtran2000/FPT.AI_2020
My work during internship at FPT.AI 2020
Language: Jupyter Notebook - Size: 778 KB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 2 - Forks: 0

hongquan/ViStickedWord
Library to split Vietnamese words that are stuck together
Language: Python - Size: 43 KB - Last synced at: 6 days ago - Pushed at: over 4 years ago - Stars: 2 - Forks: 0

PyThaiNLP/pylexto 📦
LexTo with Python 2 & 3 Wrapper
Language: Java - Size: 189 KB - Last synced at: about 1 year ago - Pushed at: almost 5 years ago - Stars: 2 - Forks: 3

lixxin2/uninlp-phd 📦
No longer maintained! Java code for basic natural language processing tasks, including Pinyin-to-Character Conversion, Chinese word segmentation, Part-of-Speech tagging, English chunking, dependency parsing
Language: Java - Size: 3.42 MB - Last synced at: about 2 years ago - Pushed at: over 5 years ago - Stars: 2 - Forks: 1

patorn/thaitokenizer
Thai Word Segmentation + Sentiment Analysis with Keras
Language: Jupyter Notebook - Size: 227 KB - Last synced at: almost 2 years ago - Pushed at: about 7 years ago - Stars: 2 - Forks: 1

cvikasreddy/Sanskrit-Segmentation
Sanskrit Segmentation using Beam Search and Seq2Seq model
Language: Jupyter Notebook - Size: 1.88 MB - Last synced at: about 2 years ago - Pushed at: over 8 years ago - Stars: 2 - Forks: 1

ndthuan/vi-word-segmenter
HTTP wrapper of the VnCoreNLP library - A Vietnamese natural language processing toolkit
Language: Java - Size: 82 KB - Last synced at: about 2 months ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

KOLANICH-libs/WordSplitAbs.py
An abstraction layer around word splitters for python
Language: Python - Size: 13.7 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 1

maris205/dnasearchengine
Segmenting DNA sequences into 'words': https://arxiv.org/pdf/1202.2518.pdf
Language: C++ - Size: 20.7 MB - Last synced at: 2 months ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 1

ljdyer/Feature-Restoration-Evaluator
Quantitative and qualitative evaluation of restorations of textual features using machine learning models
Language: Python - Size: 295 KB - Last synced at: about 1 year ago - Pushed at: about 2 years ago - Stars: 1 - Forks: 0
