An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: language-classification

nitotm/efficient-language-detector

Fast and accurate natural language detection. Detector written in PHP. Nito-ELD, ELD.

Language: PHP - Size: 44.6 MB - Last synced at: 5 days ago - Pushed at: 6 days ago - Stars: 50 - Forks: 6

pemistahl/lingua-rs

The most accurate natural language detection library for Rust, suitable for short text and mixed-language text

Language: Rust - Size: 241 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 975 - Forks: 47

pemistahl/lingua-py

The most accurate natural language detection library for Python, suitable for short text and mixed-language text

Language: Python - Size: 287 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 1,385 - Forks: 45

cisnlp/GlotLID

💬 Language Identification with Support for More Than 2000 Labels -- EMNLP 2023

Language: Python - Size: 438 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 136 - Forks: 8

MainHieu/LanguageDetection

This project showcases real-time language detection using Azure AI Language Services. With a simple interface built on ASP.NET Core, you can easily identify the language of any text input. 🐙✨

Language: CSS - Size: 1.8 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

pemistahl/lingua-go

The most accurate natural language detection library for Go, suitable for short text and mixed-language text

Language: Go - Size: 226 MB - Last synced at: about 2 months ago - Pushed at: 5 months ago - Stars: 1,245 - Forks: 68

pemistahl/lingua

The most accurate natural language detection library for Java and the JVM, suitable for long and short text alike

Language: Kotlin - Size: 425 MB - Last synced at: 2 months ago - Pushed at: 4 months ago - Stars: 753 - Forks: 68

azagniotov/language-detection

This is a refined and re-implemented version of the archived plugin for ElasticSearch elasticsearch-langdetect, which itself builds upon the original work by Nakatani Shuyo, found at https://github.com/shuyo/language-detection. The aforementioned implementation by Nakatani Shuyo serves as the default language detection component within Apache Solr.

Language: Java - Size: 18.2 MB - Last synced at: 5 days ago - Pushed at: 6 days ago - Stars: 2 - Forks: 0

sileixinhua/Python_sklearn_svm_language

Basic data analysis methods for a language identification dataset, including the SVM algorithm.

Language: Python - Size: 296 KB - Last synced at: 5 days ago - Pushed at: about 8 years ago - Stars: 5 - Forks: 1

omarmhaimdat/whatlang-pyo3

Python Binding for Rust WhatLang, a language detection library

Language: Rust - Size: 3.4 MB - Last synced at: 23 days ago - Pushed at: over 1 year ago - Stars: 14 - Forks: 0

oscar-project/ungoliant

🕷️ The pipeline for the OSCAR corpus

Language: Rust - Size: 4.72 MB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 167 - Forks: 15

Katashynskyi/Voice_assistant_UA_EN

No API keys | local | llama3.1. For language study and live translation

Language: Python - Size: 1.09 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 1 - Forks: 1

Williamhmo/languagedetection

Streamlit web app

Language: Python - Size: 25.7 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

thunderpoot/isogloss

ISO 639 and IETF Language Code Lookup Tool

Language: Python - Size: 1.78 MB - Last synced at: 7 days ago - Pushed at: 9 months ago - Stars: 7 - Forks: 1

somenath203/Language-Identifier-using-Tensorflow

Click below to check out the website

Language: Jupyter Notebook - Size: 290 KB - Last synced at: 4 months ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

oscar-project/goclassy 📦

An asynchronous concurrent pipeline for classifying Common Crawl based on fastText's pipeline.

Language: Go - Size: 377 KB - Last synced at: about 1 year ago - Pushed at: about 4 years ago - Stars: 85 - Forks: 6

elisiojsj/NLP-language-classification-generation-translation

NLP projects.

Language: Jupyter Notebook - Size: 1.02 MB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 2 - Forks: 1

AidaLog/Plain-Swahili-Dataset

Plain Swahili dataset, sourced from public repositories.

Size: 5.03 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 1

gokadin/hyperdimensional-computing

Hyperdimensional computing explained and demonstrated

Language: Go - Size: 5.33 MB - Last synced at: about 1 year ago - Pushed at: almost 4 years ago - Stars: 24 - Forks: 1

hb20007/greek-dialect-classifier

Classifier that identifies Greek text as Cypriot Greek or Standard Modern Greek

Language: Jupyter Notebook - Size: 1.05 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 7 - Forks: 3

AidaLog/Common-Swahili-stopwords

This curated collection brings together a dataset of common Swahili stopwords gathered from various sources on the internet. Stopwords are words that are frequently used in a language but typically don't contribute significant meaning to a text.

Size: 3.91 KB - Last synced at: about 2 months ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

golecalicja/language-recognition-neural-network

A single-layer neural network written from scratch that predicts the language of the text.

Language: Python - Size: 1.82 MB - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 2 - Forks: 0

bhavuksagar/Language-Identifier

This program identifies the language of the input text.

Language: Jupyter Notebook - Size: 5.97 MB - Last synced at: over 1 year ago - Pushed at: about 5 years ago - Stars: 0 - Forks: 0

vishnukanduri/Language-Classification-using-Naive-Bayes-in-Python

Classifies sentences as Slovak, Czech, or English. Implements the relevant preprocessing steps, addresses class imbalance in the training set by applying Naive Bayes theory, and uses subword units.

Language: Smalltalk - Size: 1.08 MB - Last synced at: over 1 year ago - Pushed at: about 5 years ago - Stars: 0 - Forks: 0

darshanbagul/Textual_Language_Identification

Implementing a Naive Bayes Classifier for multiclass classification to identify language of a given text

Language: Scala - Size: 512 KB - Last synced at: almost 2 years ago - Pushed at: almost 8 years ago - Stars: 0 - Forks: 0

LouisVanLangendonck/UPC-MUD-LanguageDetection

Built and analyzed a language classification pipeline to gain insight into and familiarity with typical Natural Language Processing (NLP) tools and strategies.

Language: Python - Size: 5.68 MB - Last synced at: almost 2 years ago - Pushed at: almost 3 years ago - Stars: 1 - Forks: 0

lelouisdevn/language-classification

Language: Python - Size: 5.55 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

imSuvankar/language_classification

Language Identification Techniques and Analysis (5 encoding X 17 models)

Language: Jupyter Notebook - Size: 6.7 MB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

jfernsler/ASRS_Classifier

Language classifier based on BERT to classify Aviation Safety Reporting System (ASRS) narratives into categories that were clustered by using language embedding similarity.

Language: Python - Size: 28.5 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

mc-cat-tty/Language-Classification

Suite of Python modules to recognise the language of a file

Language: Python - Size: 12.9 MB - Last synced at: 3 months ago - Pushed at: almost 3 years ago - Stars: 2 - Forks: 0

JasonFengGit/RNN-Language-Classifier

A Language Classifier powered by Recurrent Neural Network implemented in Python without AI libraries. AI from scratch.

Language: Python - Size: 960 KB - Last synced at: over 2 years ago - Pushed at: almost 4 years ago - Stars: 52 - Forks: 13

jtonglet/language-identifier

Language identifier model: takes a sentence as input and predicts its language.

Language: Python - Size: 5.15 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

AsianZeus/Personal-Voice-Assistant

Personalized voice assistant 'Alice' with Language classification, Speech Recognition, Machine Translation, Restoring Punctuation, Conversational and Speech Synthesis.

Language: Python - Size: 10.7 KB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 0

prabormukherjee/Language_classifier

Classifying English, Slovak, Czech language using Naive Bayes

Language: Smalltalk - Size: 1.06 MB - Last synced at: about 2 years ago - Pushed at: almost 5 years ago - Stars: 1 - Forks: 0

sbdzdz/hate-tweet

Detecting hate speech in tweets using bag-of-tricks models and bi-LSTM networks.

Language: Python - Size: 149 KB - Last synced at: about 1 year ago - Pushed at: almost 8 years ago - Stars: 1 - Forks: 1

Related Keywords
language-classification (35), nlp (19), language-detection (11), natural-language-processing (9), language-identification (8), language-recognition (8), machine-learning (7), python (7), nlp-machine-learning (5), language (4), language-detection-library (3), fasttext (3), lstm (3), rnn (3), language-processing (3), natural-language (3), language-detector (3), corpus-linguistics (2), common-crawl (2), python3 (2), naive-bayes (2), language-model (2), speech-recognition (2), language-classifier (2), subwords (2), voice-assistant (2), linguistics (2), cnn-text-classification (2), burmese-nlp (2), artificial-intelligence (2), dataset (2), language-detection-lib (2), open-data (2), keras (2), swahili (2), swahili-sentences (2), nltk (1), nltk-data (1), african-languages (1), nltk-library (1), nltk3 (1), notebook (1), afican-language (1), translation-models (1), aidalog (1), pytorch (1), plain-swahili (1), language-translation (1), language-generation (1), n-grams (1), jupyter-notebook (1), jupyter (1), greek (1), dialects (1), dialect-identification (1), dialect (1), cypriot (1), swahili-speaking (1), classifier (1), classification (1), vector (1), hyperdimensional (1), embeddings (1), csv (1), files (1), flask (1), frequency-table (1), itis-fermi-modena (1), language-analyzer (1), twitter (1), numpy (1), recurrent-neural-network (1), word-classifier (1), multilingual (1), conversational-ai (1), restoring-punctuation (1), speech-synthesis (1), stt (1), translation (1), tts (1), vectorizing (1), bi-lstm (1), swahili-slangs (1), swahili-stopwords (1), tanzania (1), algorithm-from-scratch (1), classification-algorithm (1), from-scratch (1), neural-network (1), neural-network-from-scratch (1), svm-classifier (1), data-cleaning (1), exploratory-data-analysis (1), naive-bayes-implementation (1), subword-units (1), vectorization (1), visualization (1), computational-linguistics (1), scala (1), bayes-classifier (1)