GitHub topics: parallel-data
PartitionedArrays/PartitionedArrays.jl
Large-scale, distributed, sparse linear algebra in Julia.
Language: Julia - Size: 5.61 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 127 - Forks: 21

VinAIResearch/PhoMT
PhoMT: A High-Quality and Large-Scale Benchmark Dataset for Vietnamese-English Machine Translation (EMNLP 2021)
Size: 11.7 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 43 - Forks: 4

lormaechea/wivico
Wikipedia-Vikidia Corpus (WiViCo) - A general-purpose parallel sentence simplification dataset for French
Size: 21.7 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 2 - Forks: 1

thammegowda/mtdata
A tool that locates, downloads, and extracts machine translation corpora
Language: Python - Size: 6.36 MB - Last synced at: about 1 month ago - Pushed at: 2 months ago - Stars: 154 - Forks: 23

Datsede04/Amharic-corps-collector-bot
A Telegram Bot for Amharic Speech Data Collection
Language: JavaScript - Size: 43.9 KB - Last synced at: about 1 year ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 3

Elbria/xling-SemDiv
Code and data for the EMNLP 2020 paper: "Detecting Fine-Grained Cross-Lingual Semantic Divergences without Supervision by Learning to Rank"
Language: Python - Size: 9.36 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 6 - Forks: 3
