Ecosyste.ms: Repos

An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: linguistic-corpora

korpling/ANNIS

ANNIS is an open source, versatile web browser-based search and visualization architecture for complex multilevel linguistic corpora with diverse types of annotation.

Language: Java - Size: 181 MB - Last synced: 10 days ago - Pushed: 10 days ago - Stars: 68 - Forks: 25

korpling/graphANNIS

This is a new backend implementation of the ANNIS linguistic search and visualization system.

Language: Rust - Size: 15.6 MB - Last synced: 16 days ago - Pushed: 16 days ago - Stars: 17 - Forks: 1

jcrippen/tlingit-corpus

Text corpus of the Tlingit language for linguistic research.

Language: Shell - Size: 1.37 MB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 6 - Forks: 2

frankier/STIFF

Sense Tagged Instances For Finnish

Language: Python - Size: 732 KB - Last synced: about 2 months ago - Pushed: over 1 year ago - Stars: 2 - Forks: 1

sorinmarti/textanalyzer

Java software to analyze text files.

Language: Java - Size: 268 KB - Last synced: about 2 months ago - Pushed: over 6 years ago - Stars: 0 - Forks: 0

nevmenandr/thai-language

Computer tools for the Thai language

Language: Python - Size: 27.1 MB - Last synced: about 2 months ago - Pushed: over 6 years ago - Stars: 21 - Forks: 8

timarkh/tsakorpus

Yet another search platform for linguistic corpora.

Language: Python - Size: 3.28 MB - Last synced: 4 months ago - Pushed: 5 months ago - Stars: 16 - Forks: 12

Frobeniusnorm/AcademicTextEstimator

Language: Scala - Size: 12.3 MB - Last synced: 7 months ago - Pushed: almost 2 years ago - Stars: 1 - Forks: 0

wzkariampuzha/EvolutionaryLinguistics

Investigations into Evolutionary Linguistics using the Google Ngrams corpus. Sub-projects include Birth and Death of English Lexemes in Closed Lexical Classes | Lexicon Size

Language: Jupyter Notebook - Size: 617 KB - Last synced: 9 months ago - Pushed: 9 months ago - Stars: 1 - Forks: 0

emeinhardt/switchboard-lm

Notebooks for processing various versions of the Switchboard corpus.

Language: Jupyter Notebook - Size: 3.49 MB - Last synced: 10 months ago - Pushed: over 4 years ago - Stars: 0 - Forks: 0

emeinhardt/fisher-lm-srilm

A repository describing the construction of a unigram language model from the Fisher corpus

Language: Jupyter Notebook - Size: 8.49 MB - Last synced: 10 months ago - Pushed: over 4 years ago - Stars: 0 - Forks: 0

emeinhardt/fisher-lm

A notebook that converts the Fisher corpus to a relational format and processes it for a language model.

Language: Jupyter Notebook - Size: 1.21 MB - Last synced: 10 months ago - Pushed: almost 5 years ago - Stars: 0 - Forks: 0

emeinhardt/buckeye-lm

Notebooks for working with / processing the Buckeye corpus.

Language: Jupyter Notebook - Size: 3.03 MB - Last synced: 10 months ago - Pushed: over 4 years ago - Stars: 0 - Forks: 0

agricolamz/lingcorpora.R

API for linguistic corpora

Language: R - Size: 2.27 MB - Last synced: 10 months ago - Pushed: over 7 years ago - Stars: 3 - Forks: 1

unrealtecellp/life

Linguistic Field Data Management and Analysis System [LiFE]

Language: JavaScript - Size: 261 MB - Last synced: 9 months ago - Pushed: 9 months ago - Stars: 3 - Forks: 0

AxelleDomingues/Memoire-2

Language: Jupyter Notebook - Size: 5.45 MB - Last synced: 12 months ago - Pushed: 12 months ago - Stars: 0 - Forks: 0

GiovanniMerici/Big-Data-in-Linguistics

Supporting code for big-data analysis in linguistics

Language: Jupyter Notebook - Size: 1.99 MB - Last synced: about 1 year ago - Pushed: about 1 year ago - Stars: 3 - Forks: 0

habecker/Orthographie-Archiv 📦

Custom search engine for a small corpus

Language: Vue - Size: 1.81 MB - Last synced: over 1 year ago - Pushed: over 3 years ago - Stars: 1 - Forks: 0

jklu-jaipur/Political-Biasness-Detection

Our ML model calculates the bias of a political article based on linguistic features and classifies it as biased towards the ruling government, biased towards the opposition, or neutral.

Language: Jupyter Notebook - Size: 1.8 MB - Last synced: over 1 year ago - Pushed: over 3 years ago - Stars: 8 - Forks: 3

RoxaneSegers/CEO-Ontology

This repository contains the CEO ontology, the evaluation corpus and the CEO vocabulary.

Size: 4.01 MB - Last synced: over 1 year ago - Pushed: about 6 years ago - Stars: 6 - Forks: 5

LAAC-LSCP/datasets

DataLad superdataset including all the datasets currently managed by the LAAC/LSCP team

Language: Python - Size: 5.86 KB - Last synced: over 1 year ago - Pushed: over 3 years ago - Stars: 2 - Forks: 0

Deeptiman/php-dom-parser-translation-tool

A simple DOM parser and translation tool using PHP, HTML, and MySQL. The translation model supports English-to-Odia translation, backed by a built-in dictionary.

Language: PHP - Size: 4.62 MB - Last synced: about 1 year ago - Pushed: about 3 years ago - Stars: 3 - Forks: 1

andcarnivorous/NLD

Natural Language Decorators - A collection of decorators to implement NLTK preprocessing steps.

Language: Python - Size: 37.1 KB - Last synced: 25 days ago - Pushed: over 3 years ago - Stars: 1 - Forks: 0

npedrazzini/averageReducedFrequency

R script to calculate the Average Reduced Frequency (ARF) of all words in a corpus

Language: R - Size: 30.3 KB - Last synced: about 1 year ago - Pushed: about 4 years ago - Stars: 1 - Forks: 0

avery-radmacher/Wemyss

A web scraper for the student newspaper of Covenant College.

Language: Ruby - Size: 16.6 KB - Last synced: over 1 year ago - Pushed: over 3 years ago - Stars: 0 - Forks: 0

vgel/asl-prosody

Prosodic analysis on NCSLGR corpus data

Language: Python - Size: 967 KB - Last synced: about 1 year ago - Pushed: about 7 years ago - Stars: 1 - Forks: 0

miweru/vrt_spacy

Language: Python - Size: 20.5 KB - Last synced: about 1 month ago - Pushed: over 4 years ago - Stars: 0 - Forks: 0

miweru/vrt_generator

Python class for creating vrt-annotated corpora

Language: Python - Size: 13.7 KB - Last synced: 1 day ago - Pushed: over 4 years ago - Stars: 0 - Forks: 0

ipante/visualisation-diatopique

Cartographic visualization workshop held as part of the Summer School "Phonologie de corpus", UNIL (22-26.07.2019)

Language: JavaScript - Size: 1.84 MB - Last synced: about 1 year ago - Pushed: almost 5 years ago - Stars: 0 - Forks: 0

clemsciences/old_swedish_texts

Language: HTML - Size: 279 KB - Last synced: over 1 year ago - Pushed: almost 6 years ago - Stars: 0 - Forks: 0

rahonalab/cl-cagliari2017

Pdf, tex and data of corpus linguistics lessons delivered in Cagliari, December 2017

Language: TeX - Size: 1.3 MB - Last synced: over 1 year ago - Pushed: over 6 years ago - Stars: 0 - Forks: 0

levindoneto/Programmierkurs

Exercises for the programming course (Programmierkurs) - Universität Stuttgart - winter semester

Language: Python - Size: 669 KB - Last synced: over 1 year ago - Pushed: over 6 years ago - Stars: 0 - Forks: 0

Related Keywords
linguistic-corpora (32), linguistics (15), linguistic-analysis (7), python (6), linguistics-databases (4), flask (3), corpus (3), vrt (2), wrapper (2), corpora (2), linguistics-field (2), nlp (2), french-language (2), computational-linguistics (2), parser-generator (1), parallel-corpus (1), odia-language (1), parser-library (1), php (1), phpmyadmin (1), statistical-machine-translation (1), tomcat-server (1), translation-service (1), mysql (1), moses-machine-translation (1), linguist (1), dom-parser (1), corpus-tool (1), apache (1), linguistic-dataset (1), vocabulary (1), sumo (1), owl (1), ontology (1), framenet (1), events (1), causal-models (1), programming (1), assignments (1), morphological-analysis (1), latex-document (1), beamer-presentation (1), philology (1), old-swedish (1), bible-translations (1), visualization (1), unil (1), sli (1), data (1), d3 (1), spacy (1), signstream (1), webscraping (1), newspaper (1), r (1), keyword-analysis (1), frequency-analysis (1), preprocessing-steps (1), nltk (1), natural-language-decorators (1), translation-tool (1), natural-language-processing (1), annotation-tool (1), api (1), google-ngram (1), evolution (1), pragmatics (1), genre-classification (1), parallel-corpora (1), media-aligned-corpora (1), language-documentation (1), elasticsearch (1), corpus-tools (1), corpus-linguistics (1), thai-language (1), opennlp (1), dialects (1), wsd (1), word-sense-disambiguation (1), corpus-processing (1), text-corpus (1), native-american (1), indigenous-languages (1), search-engine (1), hacktoberfest (1), search-interface (1), corpus-search (1), machine-learning (1), bias-detection (1), vuejs (1), duden (1), seaborn (1), pandas (1), matplotlib (1), jupyter-lab (1), big-data-visualization (1), big-data-projects (1), big-data-analytics (1), big-data (1), typology (1)