An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: code-switching

IrinaKipyatkova/KarRusCoS

KarRusCoS – Speech Database with Karelian-Russian Code-Switching

Size: 284 MB - Last synced at: 17 days ago - Pushed at: 17 days ago - Stars: 0 - Forks: 0

selinah66/NeurotechUSC-Bilingual-Code-Switching

This project aims to use existing open-source eye-tracking data on code-switching in bilingual Chinese-English individuals to train a machine learning model that predicts bilingual code-switching.

Language: Python - Size: 13.5 MB - Last synced at: 23 days ago - Pushed at: 23 days ago - Stars: 0 - Forks: 0

microsoft/LID-tool

This tool provides word-level language identification for code-mixed text, e.g., text that mixes words from two languages, such as Hindi written in Roman script and English.

Language: Python - Size: 2.16 MB - Last synced at: about 6 hours ago - Pushed at: almost 5 years ago - Stars: 54 - Forks: 10

sagorbrur/codeswitch

CodeSwitch is an NLP tool that can be used for language identification, POS tagging, named entity recognition, and sentiment analysis of code-mixed data.

Language: Jupyter Notebook - Size: 23.4 KB - Last synced at: 22 days ago - Pushed at: over 4 years ago - Stars: 35 - Forks: 6

microsoft/CodeMixed-Text-Generator

This tool helps automatically generate grammatically valid synthetic code-mixed data using linguistic theories such as Equivalence Constraint Theory and Matrix Language Theory.

Language: Jupyter Notebook - Size: 3.79 MB - Last synced at: about 6 hours ago - Pushed at: 10 months ago - Stars: 54 - Forks: 12

javadr/PyTorch-Detect-Code-Switching

Implementation of a deep learning model (BiLSTM) to detect code-switching

Language: Python - Size: 9.6 MB - Last synced at: about 1 month ago - Pushed at: about 2 months ago - Stars: 6 - Forks: 0

Tomiinek/Multilingual_Text_to_Speech 📦

An implementation of Tacotron 2 that supports multilingual experiments with parameter-sharing, code-switching, and voice cloning.

Language: Python - Size: 42.5 MB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 835 - Forks: 158

feyzaakyurek/newsframing

Code repository for the ACL 2020 paper "Multi-label and Multilingual News Framing Analysis"

Language: Jupyter Notebook - Size: 2.7 MB - Last synced at: 26 days ago - Pushed at: over 4 years ago - Stars: 7 - Forks: 1

RaiBP/incidental-bilingualism

Python program for detecting unintentional bilingual and translation instances in NLP datasets.

Language: Python - Size: 266 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

gentaiscool/code-switching-papers

A curated list of research papers and resources on code-switching

Size: 178 KB - Last synced at: 3 months ago - Pushed at: 5 months ago - Stars: 304 - Forks: 38

jonathandunn/pacific_CodeSwitch

Code-switching detection for Pacific languages

Language: Python - Size: 219 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 2

lykoerber/cs-vojna-i-mir

A (computational) linguistic analysis of Russian-French code-switching in Lev Tolstoi's Война и мир (War and Peace).

Language: Jupyter Notebook - Size: 3.26 MB - Last synced at: 25 days ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

andrianllmm/tagLID

A word-level language identification (LID) tool for Tagalog-English (Taglish) text.

Language: Python - Size: 610 KB - Last synced at: 3 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

AddisonDP/OTMTextDataAnalysis

Data Analysis Toolkit for On the Margins, LLC

Language: Python - Size: 35.2 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

TaghreedT/BAEC

Bangor Arabic–English Code-switching corpus

Size: 366 KB - Last synced at: 9 months ago - Pushed at: about 4 years ago - Stars: 0 - Forks: 0

Wei-RongRong2/RojakLanguageSentimentAnalysis

This is a machine learning project focused on analysing and classifying sentiments in code-switched and code-mixed text, specifically targeting the unique linguistic characteristics found in Malaysian conversations.

Language: Jupyter Notebook - Size: 20.6 MB - Last synced at: about 1 month ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

Nexdata-AI/207-Hours-Japanese-Speaking-English-Speech-Data-by-Mobile-Phone

Japanese Speaking English Speech Dataset

Size: 349 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

Nexdata-AI/300-Hours-Mixed-Speech-with-Korean-and-English-Data-by-Mobile-Phone

Mixed Speech with Korean and English Dataset

Size: 444 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 2 - Forks: 0

yihao001/singlish-polarity-detection

Modelling code-switching in Singlish for polarity detection

Language: Python - Size: 6.32 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

b-ashford/MIAMI-Corpus

An English-Spanish code-switching dataset adapted from the Miami-Corpus

Size: 2.25 MB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

cisnlp/MaskLID

MaskLID: Code-Switching Language Identification through Iterative Masking -- ACL 2024

Language: Python - Size: 12.7 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 2 - Forks: 0

vcyrot/Frenglish-Benchmark

A Centralized Frenglish Benchmark from Naturally Occurring Code-Switching and Code-Mixing

Size: 105 KB - Last synced at: 12 months ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

Lidan0241/language-detection

Language detection in code-switching for es/en/zh speakers

Language: Jupyter Notebook - Size: 4.6 MB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 1 - Forks: 0

andi611/CS-Tacotron-Pytorch

PyTorch implementation of CS-Tacotron, an end-to-end generative TTS model for code-switching speech synthesis.

Language: Python - Size: 155 MB - Last synced at: about 1 month ago - Pushed at: about 6 years ago - Stars: 23 - Forks: 6

PPPI/POSIT

POSIT aims to segment and tag mixed text that contains English and C-like code, so that the user knows both what each token is and what role, such as an AST tag or PoS tag, it serves within the language it's used in.

Language: Python - Size: 51.5 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 6 - Forks: 2

97arushisharma/Hindi-English-Code-Switching

A simple UI that translates text written in romanised Hindi into a fully English sentence

Language: Lex - Size: 10.2 MB - Last synced at: about 1 year ago - Pushed at: over 3 years ago - Stars: 2 - Forks: 0

ctarnold/jpLLM

Working on LLM research

Language: Python - Size: 4.41 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

gentaiscool/meta-emb

Multilingual Meta-Embeddings for Named Entity Recognition (RepL4NLP & EMNLP 2019)

Language: Python - Size: 3.02 MB - Last synced at: 30 days ago - Pushed at: over 2 years ago - Stars: 32 - Forks: 3

audioku/meta-transfer-learning

Implementation of meta-transfer-learning for ASR and LM (ACL 2020)

Language: Python - Size: 6 MB - Last synced at: over 1 year ago - Pushed at: almost 5 years ago - Stars: 47 - Forks: 10

ishan00/translation-for-code-switching-acl

Official repository for the paper titled "From Machine Translation to Code-Switching: Generating High-Quality Code-Switched Text" accepted at ACL 2021

Language: Python - Size: 8.59 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 3 - Forks: 2

Anwarvic/truel_bilingual_nmt

The official code for the "True Bilingual NMT" paper

Language: Python - Size: 3.59 MB - Last synced at: about 1 year ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

Nativeatom/NaturalLanguageProcessing

Natural Language Processing

Size: 165 KB - Last synced at: over 1 year ago - Pushed at: about 4 years ago - Stars: 34 - Forks: 9

umar1997/propaganda-codeswitched-text

[EMNLP 2023] Official repository of paper titled "Detecting Propaganda Techniques in Code-Switched Social Media Text"

Language: Jupyter Notebook - Size: 47 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 1

mvidaldp/cs_catalan_spanish

Catalan-Spanish code-switching web-based online experiment including a Bilingual Language Profile building questionnaire.

Language: JavaScript - Size: 1.78 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

dieuthu/sequencetagging

A sequence tagging model with active learning

Language: Python - Size: 123 MB - Last synced at: almost 2 years ago - Pushed at: almost 7 years ago - Stars: 7 - Forks: 0

carexl8/code-mixed-tweets

Tweet ids for code-mixed Russian-German and Russian-Hebrew tweets

Size: 20.5 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

kjgpta/NSC-Code-Switch-Analysis

Code-switching analysis based on categories like age, gender, and part of speech

Language: Jupyter Notebook - Size: 57.6 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

kjgpta/Code-Switch-Language-Modeling-for-English-and-Malay

Code-switched data generation based on part of speech, and language modeling of the generated text.

Language: Jupyter Notebook - Size: 153 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

ash-shar/Code-Switching-and-Swearing-Patterns-on-Twitter

Repository containing code for abusive tweet detection, location detection, and gender detection

Language: Python - Size: 1.97 MB - Last synced at: over 1 year ago - Pushed at: over 7 years ago - Stars: 6 - Forks: 2

mmaguero/josa-corpus

Jopara (Guarani-dominant mixed with Spanish) sentiment analysis corpus

Size: 8.79 KB - Last synced at: about 2 years ago - Pushed at: about 3 years ago - Stars: 6 - Forks: 0

kolloqe/react-kbi-si-en

Kolloqe Input Component with code-switching support between Sinhala and English

Size: 694 KB - Last synced at: 7 months ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

kolloqe/react-kbi-si-en-html

Kolloqe Input Component with code-switching support between Sinhala and English attachable via <script> tags

Language: JavaScript - Size: 849 KB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

pika-online/Foreign_Pronunciation_Generator_for_Code-Switch_ASR

A socket script to obtain a Chinese phone sequence for any English word

Language: Python - Size: 33.2 KB - Last synced at: 7 months ago - Pushed at: over 3 years ago - Stars: 5 - Forks: 0

gentaiscool/multi-task-cs-lm

Code-Switching Language Modeling using Syntax-Aware Multi-Task Learning (CALCS 2018, ACL)

Language: Python - Size: 978 KB - Last synced at: 30 days ago - Pushed at: almost 6 years ago - Stars: 9 - Forks: 3

vincenthuang75025/chinglish

Chrome extension for translating highlighted English text into Chinglish (a Chinese + English hybrid)

Language: Python - Size: 128 KB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 2 - Forks: 0

sedflix/unsacmt

Unsupervised Sentiment Analysis for Code-mixed Data

Language: Jupyter Notebook - Size: 2.81 MB - Last synced at: about 1 month ago - Pushed at: over 5 years ago - Stars: 8 - Forks: 4

amsuhane/ACL20-Code-switching-patterns

Code-switching patterns can be an effective route to improve performance of downstream NLP applications: A case study of humour, sarcasm and hate speech detection

Language: Python - Size: 4.72 MB - Last synced at: about 2 years ago - Pushed at: about 4 years ago - Stars: 10 - Forks: 1

sachinsingh1997/code_switch_approach

Language: Jupyter Notebook - Size: 941 KB - Last synced at: 9 months ago - Pushed at: almost 5 years ago - Stars: 0 - Forks: 0

vsoto/crowdsourced_bangor

This repository contains crowdsourced universal part-of-speech tags for the Miami Bangor corpus.

Size: 2.33 MB - Last synced at: about 2 years ago - Pushed at: almost 6 years ago - Stars: 1 - Forks: 0

kmi-linguistics/Code-mixing

Size: 3.91 KB - Last synced at: almost 2 years ago - Pushed at: about 7 years ago - Stars: 0 - Forks: 0

Related Keywords
code-switching (50), nlp (16), code-mixing (13), natural-language-processing (11), deep-learning (9), language-identification (6), machine-learning (5), sentiment-analysis (5), dataset (4), asr (4), speech (4), code-switch (4), multilingual (4), code-mixed (4), linguistics (4), python3 (4), speech-recognition (4), transformer (3), language-detection (3), bilstm (3), language-modeling (3), bilingual (3), speech-synthesis (2), text-to-speech (2), tts (2), input (2), bert (2), nlp-resources (2), language (2), part-of-speech (2), english-malay (2), twitter (2), english (2), low-resource-languages (2), audio (2), nlp-machine-learning (2), speech-to-text (2), machine-translation (2), translation (2), multi-lingual (2), language-model (2), bilstm-crf-model (2), spanish (2), pytorch (2), react-components (2), ner (2), data-generation (2), serverless (1), tweet-text (1), multilingual-translations (1), neural-machine-translation (1), pretrained-models (1), artificial-intelligence (1), embedding (1), universal (1), chrome-extension (1), social-media (1), aws-lambda (1), indian-languages (1), paper (1), multi-task-learning (1), propaganda-detection (1), automatically-generated (1), catalan (1), catalan-language (1), karruscos (1), lid (1), mixtral-8x7b (1), mixtral-8x7b-instruct (1), prompting (1), spanglish (1), embeddings (1), unsupervised (1), meta-embeddings (1), representation-learning (1), zero-shot-learning (1), acl (1), cross-lingual (1), meta-learning (1), meta-transfer-learning (1), mixed-language (1), acl2020 (1), neural-network (1), acl2021 (1), gender-detection (1), location-detection (1), kaldi (1), kolloqe (1), social-network-analysis (1), swearing (1), indian-language (1), baselines (1), bert-fine-tuning (1), corpus-linguistics (1), sentiment-classification (1), text-categorization (1), text-classification (1), traditional-machine-learning (1), sinhala (1), material-ui (1)