Topic: "machine-translation-data-processing"
facebookresearch/stopes
A library for preparing data for machine translation research (monolingual preprocessing, bitext mining, etc.) built by the FAIR NLLB team.
Language: Python - Size: 4.31 MB - Last synced at: 1 day ago - Pushed at: 4 months ago - Stars: 276 - Forks: 40

lt3/nfr
Neural Fuzzy Repair (NFR) is a data augmentation pipeline, which integrates fuzzy matches (i.e. similar translations) into neural machine translation.
Language: Python - Size: 34 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 11 - Forks: 2

alphadl/corpus_filter
Scripts for filtering machine translation parallel corpora
Language: Python - Size: 313 KB - Last synced at: almost 2 years ago - Pushed at: almost 6 years ago - Stars: 8 - Forks: 2

ELDAELRA/elda_cmtk
ELDA Crawled Data Management Toolkit
Language: OCaml - Size: 166 KB - Last synced at: over 1 year ago - Pushed at: almost 7 years ago - Stars: 3 - Forks: 1

geovedi/nmt-playground
Personal NMT Playground
Language: Python - Size: 95.7 MB - Last synced at: about 1 hour ago - Pushed at: almost 8 years ago - Stars: 2 - Forks: 1

ShristiK/Cross-Lingual-Document-Translator
Translator developed and trained on a provided corpus using an IBM model
Language: Jupyter Notebook - Size: 56.8 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0

deepak2233/Nueral_Machine_Translation_Eng_to_Hin
Using a Seq2Seq LSTM-based model along with attention
Language: Jupyter Notebook - Size: 579 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

vzhomeexperiments/cloud_translate
Repository for automatic file translation using the Google Translate API and R Statistical Software
Language: R - Size: 40 KB - Last synced at: about 2 years ago - Pushed at: about 5 years ago - Stars: 1 - Forks: 6

MarsPanther/machine-translation-research
Just trying to translate from Amharic to English
Language: Shell - Size: 888 KB - Last synced at: over 1 year ago - Pushed at: almost 7 years ago - Stars: 1 - Forks: 2

mrsumitbd/SOParallelCorpusReplication
Replication package for SO processing for bitext
Language: Python - Size: 434 KB - Last synced at: almost 2 years ago - Pushed at: about 6 years ago - Stars: 0 - Forks: 1

moodser/splitter-transliteration
Python script to split the text generated by 'wikipedia parallel title extractor' into separate text files (one file per language)
Language: Python - Size: 10.7 KB - Last synced at: almost 2 years ago - Pushed at: almost 7 years ago - Stars: 0 - Forks: 0

erayyildiz/parallel-sentence-quality-filter
Parallel sentence quality filter based on text classification methods
Language: Perl - Size: 1.21 MB - Last synced at: about 2 years ago - Pushed at: over 7 years ago - Stars: 0 - Forks: 0
