An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: code-mixing

microsoft/LID-tool

A word-level language identification tool that identifies the language of individual words in code-mixed text, e.g. text that mixes English with Hindi written in Roman script.

language: Python - size: 2.16 MB - last synced: about 17 hours ago - pushed: almost 5 years ago - stars: 55 - forks: 10
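
As a rough illustration of what word-level language identification on code-mixed text means, the sketch below tags each token by dictionary lookup. It is not the method used by microsoft/LID-tool; the two tiny lexicons, the tag names, and the `tag_words` function are all hypothetical and exist only for this example.

```python
# Toy lexicons (hypothetical, for illustration only; real tools use
# trained models and far larger vocabularies).
HINDI_ROMAN = {"mujhe", "nahi", "hai", "kya", "bahut", "accha"}
ENGLISH = {"this", "movie", "is", "good", "the", "was"}


def tag_words(text):
    """Label each whitespace-separated token as 'hi', 'en', or 'unk'."""
    tags = []
    for token in text.split():
        # Normalize case and strip simple punctuation before lookup.
        word = token.lower().strip(".,!?")
        if word in HINDI_ROMAN:
            tags.append((token, "hi"))
        elif word in ENGLISH:
            tags.append((token, "en"))
        else:
            tags.append((token, "unk"))
    return tags


print(tag_words("movie bahut accha hai"))
```

A lookup like this cannot resolve words that are valid in both languages, which is one reason practical systems use context-aware models instead.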

andrianllmm/tagLID

A word-level Language Identification (LID) tool for Tagalog-English (Taglish) text

language: Python - size: 613 KB - last synced: about 1 month ago - pushed: about 1 month ago - stars: 2 - forks: 0

gentaiscool/code-switching-papers

A curated list of research papers and resources on code-switching

size: 178 KB - last synced: about 2 months ago - pushed: 7 months ago - stars: 315 - forks: 39

microsoft/CodeMixed-Text-Generator

This tool automatically generates grammatically valid synthetic code-mixed data using linguistic theories such as Equivalence Constraint Theory and Matrix Language Theory.

language: Jupyter Notebook - size: 3.79 MB - last synced: about 17 hours ago - pushed: 12 months ago - stars: 55 - forks: 11

praatibhsurana/Hinglish_Hindi_WSD

A pipeline for transliteration, spell correction, POS tagging, and word sense disambiguation of Hinglish code-mixed data into Hindi in Devanagari script.

language: Python - size: 895 KB - last synced: 2 days ago - pushed: over 1 year ago - stars: 36 - forks: 8

jessicasaikia/hidden-markov-model-HMM

This repository implements a Hidden Markov Model (HMM) for performing Parts of Speech (POS) Tagging on Assamese-English code-mixed texts.

language: Python - size: 358 KB - last synced: about 2 months ago - pushed: 8 months ago - stars: 0 - forks: 0
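
For readers unfamiliar with HMM-based POS tagging, the following is a minimal Viterbi decoding sketch. It is not taken from the repository above: the two-tag tag set, the probability tables, the English toy vocabulary, and the smoothing constant are all made-up assumptions chosen to keep the example small.

```python
import math

# Hypothetical toy model; a real tagger estimates these from annotated data.
TAGS = ["NOUN", "VERB"]
START = {"NOUN": 0.7, "VERB": 0.3}
TRANS = {
    "NOUN": {"NOUN": 0.4, "VERB": 0.6},
    "VERB": {"NOUN": 0.8, "VERB": 0.2},
}
EMIT = {
    "NOUN": {"students": 0.5, "books": 0.4, "read": 0.1},
    "VERB": {"students": 0.05, "books": 0.15, "read": 0.8},
}
SMOOTH = 1e-6  # assumed fallback probability for unseen words


def viterbi(words):
    """Return the most probable tag sequence under the toy HMM."""
    if not words:
        return []
    # Log-probabilities avoid underflow on longer sentences.
    scores = [{t: math.log(START[t]) + math.log(EMIT[t].get(words[0], SMOOTH))
               for t in TAGS}]
    back = []
    for word in words[1:]:
        row, ptr = {}, {}
        for t in TAGS:
            # Best previous tag for reaching tag t at this position.
            prev = max(TAGS, key=lambda p: scores[-1][p] + math.log(TRANS[p][t]))
            row[t] = (scores[-1][prev] + math.log(TRANS[prev][t])
                      + math.log(EMIT[t].get(word, SMOOTH)))
            ptr[t] = prev
        scores.append(row)
        back.append(ptr)
    # Follow back-pointers from the best final tag.
    path = [max(TAGS, key=lambda t: scores[-1][t])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]


print(viterbi(["students", "read", "books"]))
```

With these toy tables the decoder tags "read" as a verb because the NOUN-to-VERB transition and the emission probability together outweigh the alternative paths.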

jessicasaikia/conditional-random-field-CRF

This repository implements a Conditional Random Field (CRF) for performing Parts-of-Speech (POS) Tagging on Assamese-English code-mixed texts.

language: Python - size: 10.7 KB - last synced: 4 months ago - pushed: 8 months ago - stars: 0 - forks: 0

jessicasaikia/long-short-term-memory-LSTM

This repository implements a Long Short Term Memory (LSTM) for performing Parts-of-Speech (POS) Tagging on Assamese-English code-mixed texts.

language: Python - size: 16.6 KB - last synced: 4 days ago - pushed: 8 months ago - stars: 0 - forks: 0

jessicasaikia/bidirectional-long-short-term-memory-BiLSTM

This repository implements a Bidirectional Long Short Term Memory (BiLSTM) for performing Parts-of-Speech (POS) Tagging on Assamese-English code-mixed texts.

language: Python - size: 11.7 KB - last synced: 4 months ago - pushed: 8 months ago - stars: 0 - forks: 0

jessicasaikia/multilingual-BERT-mBERT

This repository implements a Multilingual BERT (mBERT) model for performing Parts-of-Speech (POS) Tagging on Assamese-English code-mixed texts.

language: Python - size: 11.7 KB - last synced: 4 months ago - pushed: 8 months ago - stars: 0 - forks: 0

jessicasaikia/rule-based

This repository contains a simple Rule-Based Model for Parts-of-Speech tagging in Assamese-English code mixed texts.

language: Python - size: 352 KB - last synced: 8 days ago - pushed: 8 months ago - stars: 0 - forks: 0

salesforce/adversarial-polyglots

Code for the paper "Code-Mixing on Sesame Street: Dawn of the Adversarial Polyglots" (NAACL-HLT 2021)

language: Python - size: 45.9 KB - last synced: 3 months ago - pushed: over 3 years ago - stars: 10 - forks: 7

Wei-RongRong2/RojakLanguageSentimentAnalysis

This is a machine learning project focused on analysing and classifying sentiments in code-switched and code-mixed text, specifically targeting the unique linguistic characteristics found in Malaysian conversations.

language: Jupyter Notebook - size: 20.6 MB - last synced: 4 months ago - pushed: 12 months ago - stars: 0 - forks: 0

Nexdata-AI/300-Person-Mandarin-Chinese-and-English-Bilingual-Spontaneous-Monologue-smartphone

300-Person-Mandarin-Chinese-and-English-Bilingual-Spontaneous-Monologue-smartphone

size: 2.93 KB - last synced: 12 months ago - pushed: 12 months ago - stars: 0 - forks: 0

cisnlp/MaskLID

MaskLID: Code-Switching Language Identification through Iterative Masking -- ACL 2024

language: Python - size: 12.7 KB - last synced: about 1 year ago - pushed: about 1 year ago - stars: 2 - forks: 0

vcyrot/Frenglish-Benchmark

A Centralized Frenglish Benchmark from Naturally Occurring Code-Switching and Code-Mixing

size: 105 KB - last synced: about 1 year ago - pushed: over 2 years ago - stars: 0 - forks: 0

Lidan0241/language-detection

language detection in code-switching for es/en/zh speakers

language: Jupyter Notebook - size: 4.6 MB - last synced: about 1 year ago - pushed: about 1 year ago - stars: 1 - forks: 0

Bernardbyy/BahasaRojakSentimentAnalysis

Handling Bahasa Rojak (Malaysian Code Mixing Language) OOV and performing Sentiment Analysis using downstreamed XLM-R

language: Jupyter Notebook - size: 2.88 MB - last synced: about 1 year ago - pushed: about 1 year ago - stars: 0 - forks: 1

gulabpatel/Code-Mixing

A discussion of the evolution of code-mixing algorithms

language: Jupyter Notebook - size: 204 KB - last synced: about 2 months ago - pushed: almost 3 years ago - stars: 2 - forks: 0

ir-nlp-csui/id-en-code-mixed

Indonesian-English code-mixed Twitter dataset

size: 288 KB - last synced: over 1 year ago - pushed: almost 3 years ago - stars: 1 - forks: 0

Anwarvic/truel_bilingual_nmt

The official code for the "True Bilingual NMT" paper

language: Python - size: 3.59 MB - last synced: about 1 year ago - pushed: over 3 years ago - stars: 0 - forks: 0

LCS2-IIITD/HIT-ACL2021-Codemixed-Representation

This repo contains the source code of HIT: A Hierarchically Fused Deep Attention Network for Robust Code-mixed Language Representation (accepted at ACL 2021)

language: Python - size: 29.2 MB - last synced: about 1 year ago - pushed: over 3 years ago - stars: 6 - forks: 5

carexl8/code-mixed-tweets

Tweet ids for code-mixed Russian-German and Russian-Hebrew tweets

size: 20.5 KB - last synced: about 2 years ago - pushed: about 2 years ago - stars: 0 - forks: 0

ash-shar/Code-Switching-and-Swearing-Patterns-on-Twitter

Repository containing Abusive Tweet Detection, Location Detection and Gender Detection codes

language: Python - size: 1.97 MB - last synced: almost 2 years ago - pushed: over 7 years ago - stars: 6 - forks: 2

mmaguero/josa-corpus

Jopara (Guarani-dominant mixed with Spanish) sentiment analysis corpus

size: 8.79 KB - last synced: over 2 years ago - pushed: about 3 years ago - stars: 6 - forks: 0

aparnadutta/code-mixed-lid

Word-level language identification for Bangla-English code-mixed social media data, using a BiLSTM with subword embeddings.

language: Python - size: 190 MB - last synced: over 2 years ago - pushed: about 3 years ago - stars: 5 - forks: 0

sumanbanerjee1/Code-Mixed-Dialog

language: Python - size: 13.1 MB - last synced: about 1 year ago - pushed: about 7 years ago - stars: 33 - forks: 7

ayanc18/PsycholinguisticCodeMixing

Psycholinguistic Analysis of Code Mixing - Speech and Natural Language Processing Term Project: CS60057. Department of Computer science and Engineering, Indian Institute of Technology Kharagpur

language: Python - size: 2.88 MB - last synced: about 2 years ago - pushed: over 7 years ago - stars: 1 - forks: 1

poornagurram/code_mixing_sentiment

language: Python - size: 2.83 MB - last synced: about 2 years ago - pushed: about 7 years ago - stars: 1 - forks: 1

kmi-linguistics/Code-mixing

size: 3.91 KB - last synced: about 2 years ago - pushed: over 7 years ago - stars: 0 - forks: 0

Related Keywords
code-mixing (30), nlp (17), code-switching (13), english (8), code-mixed (8), parts-of-speech (7), nlp-machine-learning (7), english-language (7), assamese (7), pos-tagging (7), pos-tagger (6), assamese-text (6), language-identification (6), parts-of-speech-tagging (5), natural-language-processing (5), sentiment-analysis (4), twitter (4), sentiment-classification (3), linguistics (3), python3 (3), machine-learning (2), deep-learning (2), named-entity-recognition (2), multilingual (2), bilstm (2), lstm (2), machine-translation (2), transformer (2), bilingual (2), code-switch (2), word-level-language-model (1), transfer-learning (1), hred (1), seq2seq (1), out-of-vocabulary (1), fine-tuning (1), domain-adaptation (1), chinese-simplified (1), psycholinguistics (1), bahasa-melayu (1), identification-language (1), french-english (1), language-identifier (1), language-identification-toolkit (1), spontaneous-speech-recognition (1), speech-to-text (1), asr (1), support-vector-machine (1), computational-linguistics (1), render-deployment (1), hindi (1), indian-language (1), multinomial-naive-bayes (1), multilingual-nlp (1), malaysian-language (1), baselines (1), swearing (1), social-network-analysis (1), location-detection (1), indian-languages (1), gender-detection (1), tweets (1), russian (1), hebrew (1), german (1), attention-model (1), bert-fine-tuning (1), corpus-linguistics (1), low-resource-languages (1), language-detection (1), pretrained-models (1), neural-machine-translation (1), multilingual-translations (1), text-categorization (1), text-classification (1), social-media (1), traditional-machine-learning (1), bangla-nlp (1), lexical-normalization (1), indonesian-language (1), lid (1), xlmroberta (1), hidden-markov-model (1), wsd-dataset (1), wsd (1), word-sense-disambiguation (1), spello (1), python-package (1), python-library (1), python-3 (1), lesk-algorithm (1), lesk (1), indowordnet (1), indic-transliteration (1), indic-nlp (1), indic-languages (1), hinglish-to-hindi-transliteration (1), hinglish (1), hindi-spell-correction (1), hindi-pos-tag (1)