Topic: "subword-segmentation"
aalto-speech/morfessor
Morfessor is a tool for unsupervised and semi-supervised morphological segmentation
Language: Python - Size: 409 KB - Last synced at: about 9 hours ago - Pushed at: over 4 years ago - Stars: 192 - Forks: 29

aalto-speech/flatcat
Morfessor FlatCat
Language: Python - Size: 1.1 MB - Last synced at: 29 days ago - Pushed at: over 5 years ago - Stars: 13 - Forks: 6

Waino/morfessor-emprune
Morfessor EM+Prune
Language: Python - Size: 523 KB - Last synced at: about 2 years ago - Pushed at: almost 5 years ago - Stars: 9 - Forks: 2

BassaniRiccardo/ICEBERT (fork of huggingface/transformers)
ICEBERT: Interlingual-Clusters Enhanced BERT. A BERT-like model trained on clusters of monolingual subwords.
Language: Python - Size: 184 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 2 - Forks: 0

Waino/morfessor-cognates
Cognate-aware morphological segmentation
Language: Python - Size: 461 KB - Last synced at: about 2 years ago - Pushed at: over 5 years ago - Stars: 2 - Forks: 1

aalto-speech/morfessor-emprune (fork of Waino/morfessor-emprune)
Morfessor EM+Prune
Size: 497 KB - Last synced at: about 1 year ago - Pushed at: about 5 years ago - Stars: 1 - Forks: 0

majeek/vml-hd
Parsing and subword segmentation code for the VML-HD Dataset
Language: Python - Size: 1.7 MB - Last synced at: about 2 years ago - Pushed at: over 6 years ago - Stars: 1 - Forks: 0

dlsucelt/Cellar
Central repository with pretrained models for transfer learning, BPE subword-tokenization, mono/multilingual embeddings, and everything in between.
Language: Python - Size: 379 KB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 1

TiMauzi/dawg
The concept of DAWGs is based on: Blumer, A. et al. (1985). The smallest automaton recognizing the subwords of a text. Theoretical Computer Science, 40, 31–55.
Language: Java - Size: 88.9 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

JoyeBright/FT-IWSLT2014-BPEVocab
Repository for the experiments in my paper: "A Systematic Analysis of Vocabulary and BPE Settings for Optimal Fine-tuning of NMT: A Case Study of In-domain Translation"
Size: 0 Bytes - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

aalto-speech/morfessor-demo
Morfessor demonstration
Language: Python - Size: 426 KB - Last synced at: about 1 year ago - Pushed at: about 10 years ago - Stars: 0 - Forks: 0
