Ecosyste.ms: Repos
An open API service providing repository metadata for many open source software ecosystems.
GitHub topics: linguistic-corpora
korpling/ANNIS
ANNIS is an open source, versatile web browser-based search and visualization architecture for complex multilevel linguistic corpora with diverse types of annotation.
Language: Java - Size: 181 MB - Last synced: 10 days ago - Pushed: 10 days ago - Stars: 68 - Forks: 25
korpling/graphANNIS
This is a new backend implementation of the ANNIS linguistic search and visualization system.
Language: Rust - Size: 15.6 MB - Last synced: 16 days ago - Pushed: 16 days ago - Stars: 17 - Forks: 1
jcrippen/tlingit-corpus
Text corpus of the Tlingit language for linguistic research.
Language: Shell - Size: 1.37 MB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 6 - Forks: 2
frankier/STIFF
Sense Tagged Instances For Finnish
Language: Python - Size: 732 KB - Last synced: about 2 months ago - Pushed: over 1 year ago - Stars: 2 - Forks: 1
sorinmarti/textanalyzer
Java software to analyze text files.
Language: Java - Size: 268 KB - Last synced: about 2 months ago - Pushed: over 6 years ago - Stars: 0 - Forks: 0
nevmenandr/thai-language
Computer tools for the Thai language
Language: Python - Size: 27.1 MB - Last synced: about 2 months ago - Pushed: over 6 years ago - Stars: 21 - Forks: 8
timarkh/tsakorpus
Yet another search platform for linguistic corpora.
Language: Python - Size: 3.28 MB - Last synced: 4 months ago - Pushed: 5 months ago - Stars: 16 - Forks: 12
Frobeniusnorm/AcademicTextEstimator
Language: Scala - Size: 12.3 MB - Last synced: 7 months ago - Pushed: almost 2 years ago - Stars: 1 - Forks: 0
wzkariampuzha/EvolutionaryLinguistics
Investigations into Evolutionary Linguistics using the Google Ngrams corpus. Sub-projects include Birth and Death of English Lexemes in Closed Lexical Classes | Lexicon Size
Language: Jupyter Notebook - Size: 617 KB - Last synced: 9 months ago - Pushed: 9 months ago - Stars: 1 - Forks: 0
emeinhardt/switchboard-lm
Notebooks for processing various versions of the Switchboard corpus.
Language: Jupyter Notebook - Size: 3.49 MB - Last synced: 10 months ago - Pushed: over 4 years ago - Stars: 0 - Forks: 0
emeinhardt/fisher-lm-srilm
A repository describing the construction of a unigram language model from the Fisher corpus
Language: Jupyter Notebook - Size: 8.49 MB - Last synced: 10 months ago - Pushed: over 4 years ago - Stars: 0 - Forks: 0
emeinhardt/fisher-lm
A notebook that converts the Fisher Corpus to a relational format and processes it for a language model.
Language: Jupyter Notebook - Size: 1.21 MB - Last synced: 10 months ago - Pushed: almost 5 years ago - Stars: 0 - Forks: 0
emeinhardt/buckeye-lm
Notebooks for working with / processing the Buckeye corpus.
Language: Jupyter Notebook - Size: 3.03 MB - Last synced: 10 months ago - Pushed: over 4 years ago - Stars: 0 - Forks: 0
agricolamz/lingcorpora.R
API for linguistic corpora
Language: R - Size: 2.27 MB - Last synced: 10 months ago - Pushed: over 7 years ago - Stars: 3 - Forks: 1
unrealtecellp/life
Linguistic Field Data Management and Analysis System [LiFE]
Language: JavaScript - Size: 261 MB - Last synced: 9 months ago - Pushed: 9 months ago - Stars: 3 - Forks: 0
AxelleDomingues/Memoire-2
Language: Jupyter Notebook - Size: 5.45 MB - Last synced: 12 months ago - Pushed: 12 months ago - Stars: 0 - Forks: 0
GiovanniMerici/Big-Data-in-Linguistics
Supporting code for big-data analysis in linguistics
Language: Jupyter Notebook - Size: 1.99 MB - Last synced: about 1 year ago - Pushed: about 1 year ago - Stars: 3 - Forks: 0
habecker/Orthographie-Archiv 📦
Custom search engine for a small corpus
Language: Vue - Size: 1.81 MB - Last synced: over 1 year ago - Pushed: over 3 years ago - Stars: 1 - Forks: 0
jklu-jaipur/Political-Biasness-Detection
Our ML model calculates the bias of a political article based on linguistic features and classifies the article as biased towards the ruling government, biased towards the opposition, or neutral.
Language: Jupyter Notebook - Size: 1.8 MB - Last synced: over 1 year ago - Pushed: over 3 years ago - Stars: 8 - Forks: 3
RoxaneSegers/CEO-Ontology
This repository contains the CEO ontology, the evaluation corpus and the CEO vocabulary.
Size: 4.01 MB - Last synced: over 1 year ago - Pushed: about 6 years ago - Stars: 6 - Forks: 5
LAAC-LSCP/datasets
DataLad superdataset including all the datasets currently managed by the LAAC/LSCP team
Language: Python - Size: 5.86 KB - Last synced: over 1 year ago - Pushed: over 3 years ago - Stars: 2 - Forks: 0
Deeptiman/php-dom-parser-translation-tool
A simple DOM parser and translation tool using PHP, HTML, and MySQL. The translation model supports English-to-Odia translation, backed by a built-in dictionary.
Language: PHP - Size: 4.62 MB - Last synced: about 1 year ago - Pushed: about 3 years ago - Stars: 3 - Forks: 1
andcarnivorous/NLD
Natural Language Decorators - A collection of decorators to implement NLTK preprocessing steps.
Language: Python - Size: 37.1 KB - Last synced: 25 days ago - Pushed: over 3 years ago - Stars: 1 - Forks: 0
npedrazzini/averageReducedFrequency
R script to calculate the Average Reduced Frequency (ARF) of all words in a corpus
Language: R - Size: 30.3 KB - Last synced: about 1 year ago - Pushed: about 4 years ago - Stars: 1 - Forks: 0
avery-radmacher/Wemyss
A web scraper for the student newspaper of Covenant College.
Language: Ruby - Size: 16.6 KB - Last synced: over 1 year ago - Pushed: over 3 years ago - Stars: 0 - Forks: 0
vgel/asl-prosody
Prosodic analysis on NCSLGR corpus data
Language: Python - Size: 967 KB - Last synced: about 1 year ago - Pushed: about 7 years ago - Stars: 1 - Forks: 0
miweru/vrt_spacy
Language: Python - Size: 20.5 KB - Last synced: about 1 month ago - Pushed: over 4 years ago - Stars: 0 - Forks: 0
miweru/vrt_generator
Python class for creating vrt-annotated corpora
Language: Python - Size: 13.7 KB - Last synced: 1 day ago - Pushed: over 4 years ago - Stars: 0 - Forks: 0
ipante/visualisation-diatopique
Cartographic visualization workshop held as part of the Summer School "Phonologie de corpus", UNIL (22-26.07.2019)
Language: JavaScript - Size: 1.84 MB - Last synced: about 1 year ago - Pushed: almost 5 years ago - Stars: 0 - Forks: 0
clemsciences/old_swedish_texts
Language: HTML - Size: 279 KB - Last synced: over 1 year ago - Pushed: almost 6 years ago - Stars: 0 - Forks: 0
rahonalab/cl-cagliari2017
PDF, TeX, and data files for corpus linguistics lessons delivered in Cagliari, December 2017
Language: TeX - Size: 1.3 MB - Last synced: over 1 year ago - Pushed: over 6 years ago - Stars: 0 - Forks: 0
levindoneto/Programmierkurs
Exercises for the programming course (Programmierkurs) - Universität Stuttgart - winter semester
Language: Python - Size: 669 KB - Last synced: over 1 year ago - Pushed: over 6 years ago - Stars: 0 - Forks: 0