Ecosyste.ms: Repos

An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: parallel-corpora

techiaith/alinio

Code to simplify aligning texts with hunalign, and documentation on how to use LFAligner

Language: Python - Size: 28.3 KB - Last synced: about 2 months ago - Pushed: about 8 years ago - Stars: 0 - Forks: 0

rggdmonk/hadal

A simple and efficient tool for mining and aligning sentences with pre-trained models.

Language: Python - Size: 680 KB - Last synced: 18 days ago - Pushed: 19 days ago - Stars: 2 - Forks: 0

csebuetnlp/banglanmt

This repository contains the code and data of the paper titled "Not Low-Resource Anymore: Aligner Ensembling, Batch Filtering, and New Datasets for Bengali-English Machine Translation" published in Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP 2020), November 16 - November 20, 2020.

Language: Python - Size: 2.05 MB - Last synced: about 2 months ago - Pushed: over 1 year ago - Stars: 144 - Forks: 45

timarkh/tsakorpus

Yet another search platform for linguistic corpora.

Language: Python - Size: 3.28 MB - Last synced: 4 months ago - Pushed: 4 months ago - Stars: 16 - Forks: 12

shashwatup9k/bho-resources

Size: 4.21 MB - Last synced: about 1 month ago - Pushed: 6 months ago - Stars: 2 - Forks: 0

bitextor/bitextor

Bitextor generates translation memories from multilingual websites

Language: Python - Size: 177 MB - Last synced: 7 months ago - Pushed: 9 months ago - Stars: 265 - Forks: 45

Sohyo/Using-Confidential-Data-for-NMT

Size: 7.59 MB - Last synced: 10 months ago - Pushed: almost 3 years ago - Stars: 1 - Forks: 1

korenyoni/opus-api

OPUS (opus.nlpl.eu) Python3 API

Language: Python - Size: 118 KB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 14 - Forks: 5

tsuruoka-lab/BSD

The Business Scene Dialogue corpus

Size: 2.91 MB - Last synced: over 1 year ago - Pushed: over 2 years ago - Stars: 55 - Forks: 6

Kartikaggarwal98/Indian_ParallelCorpus

Curated list of publicly available parallel corpora for Indian languages

Size: 8.79 KB - Last synced: about 1 year ago - Pushed: almost 3 years ago - Stars: 28 - Forks: 1

gederajeg/constructional-equivalence

Repository of supplementary materials and RStudio project for the paper on a corpus-based approach to measuring constructional equivalence.

Language: TeX - Size: 2.53 MB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 1 - Forks: 0

czcorpus/ictools

A program for calculating corpora alignments using a pivot language

Language: Go - Size: 242 KB - Last synced: about 2 months ago - Pushed: 3 months ago - Stars: 1 - Forks: 1

npedrazzini/parallelbibles

Word-alignment models for Bible translations in 100+ historical and contemporary languages

Language: R - Size: 936 MB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 0 - Forks: 0

gederajeg/rob-steal-parallel-corpora

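Repository of R code and data for the analyses in the study titled MODEL KAJIAN TERJEMAHAN BERBASIS BANK DATA TERJEMAHAN DIGITAL INGGRIS-INDONESIA DAN IMPLIKASI PEDAGOGISNYA (A translation studies model based on an English-Indonesian digital translation databank and its pedagogical implications)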

Language: R - Size: 8.51 MB - Last synced: about 1 year ago - Pushed: over 2 years ago - Stars: 0 - Forks: 0

Related Keywords
parallel-corpora (14), parallel-corpus (6), machine-translation (5), corpus (5), nlp (3), corpus-linguistics (3), neural-machine-translation (3), corpora (2), construction-grammar (2), english-indonesian-translation (2), annotated-corpora (2), udayana-university (2), linguistics (2), corpus-tools (2), low-resource-machine-translation (2), low-resource-languages (2), alignment (2), open-science (1), open-data (1), open-code (1), rob-steal-synonyms (1), constructionist-approach (1), subtitle-corpora (1), multilingual-translation (1), machinetranslation (1), indian-languages (1), japanese (1), english (1), document-aligned (1), python (1), opus (1), machine-learning (1), language-model (1), monolingual-corpora (1), translation (1), bible-translations (1), kriging (1), multidimensional-scaling (1), word-alignment (1), manatee-open (1), cmd (1), verbal-near-synonyms (1), universitas-udayana (1), constructional-equivalence (1), english-indonesian-parallel-corpora (1), opensubtitle (1), translation-studies (1), translation-equivalence (1), r-programming-projects (1), r-programming (1), quantitative-linguistics (1), open-subtitle (1), english-bhojpuri (1), bhojpuri-textcorpus (1), bhojpuri (1), media-aligned-corpora (1), linguistic-corpora (1), language-documentation (1), flask (1), elasticsearch (1), low-resource-nlp (1), emnlp-2020 (1), bangla-nlp (1), bangla-machine-translation (1), bangla-dataset-machine-translation (1), sentence-alignment (1), parallel-sentence-mining (1), nlp-library (1), welsh (1), cymraeg (1), corporate (1), api (1), datasets (1), wget (1), warc (1), tokenizer (1), tmx (1), statistical-machine-translation (1), sentence-segmentation (1), hunalign (1), document-aligner (1), dictionaries (1), crawler (1), corpus-processing (1), corpus-generator (1), bleualign (1), bitextor (1), bicleaner (1), apertium (1)