Topic: "multilingual-nlp"
embeddings-benchmark/mteb
MTEB: Massive Text Embedding Benchmark
Language: Python - Size: 34.3 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 2,433 - Forks: 380

bigscience-workshop/xmtf
Crosslingual Generalization through Multitask Finetuning
Language: Jupyter Notebook - Size: 28.6 MB - Last synced at: 19 days ago - Pushed at: 7 months ago - Stars: 530 - Forks: 37

DmitryRyumin/EMNLP-2023-Papers
EMNLP 2023 Papers: Explore cutting-edge research from EMNLP 2023, the premier conference for advancing empirical methods in natural language processing. Stay updated on the latest in machine learning, deep learning, and natural language processing with code included. ⭐ Support NLP!
Language: Python - Size: 6.43 MB - Last synced at: 15 days ago - Pushed at: 11 months ago - Stars: 107 - Forks: 7

cisnlp/Glot500
Glot500: Scaling Multilingual Corpora and Language Models to 500 Languages -- ACL 2023
Language: Python - Size: 151 KB - Last synced at: 3 days ago - Pushed at: about 1 year ago - Stars: 100 - Forks: 4

FSoft-AI4Code/TheVault
[EMNLP 2023] The Vault: A Comprehensive Multilingual Dataset for Advancing Code Understanding and Generation
Language: Jupyter Notebook - Size: 9.44 MB - Last synced at: 20 days ago - Pushed at: 8 months ago - Stars: 92 - Forks: 9

shijie-wu/crosslingual-nlp
This repo supports various cross-lingual transfer learning & multilingual NLP models.
Language: Python - Size: 125 KB - Last synced at: 12 months ago - Pushed at: over 1 year ago - Stars: 89 - Forks: 7

epfl-dlab/llm-latent-language
Repo accompanying our paper "Do Llamas Work in English? On the Latent Language of Multilingual Transformers".
Language: Jupyter Notebook - Size: 2.54 MB - Last synced at: 8 days ago - Pushed at: about 1 year ago - Stars: 74 - Forks: 16

csebuetnlp/CrossSum
This repository contains the code, data, and models of the paper titled "CrossSum: Beyond English-Centric Cross-Lingual Summarization for 1,500+ Language Pairs" published in Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL’23), July 9-14, 2023.
Language: Python - Size: 5.72 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 47 - Forks: 7

ceferisbarov/TUMLU
TUMLU: A Unified and Native Language Understanding Benchmark for Turkic Languages
Language: Python - Size: 38.3 MB - Last synced at: 7 days ago - Pushed at: 2 months ago - Stars: 19 - Forks: 1

BatsResearch/cross-lingual-detox
Code for "Preference Tuning For Toxicity Mitigation Generalizes Across Languages"
Language: Jupyter Notebook - Size: 309 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 14 - Forks: 0

cisnlp/MEXA
🔍 Multilingual Evaluation of English-Centric LLMs via Cross-Lingual Alignment
Language: Python - Size: 26.4 KB - Last synced at: 21 days ago - Pushed at: 21 days ago - Stars: 10 - Forks: 0

cambridgeltl/prompt4bli
On Bilingual Lexicon Induction with Large Language Models (EMNLP 2023). Keywords: Bilingual Lexicon Induction, Word Translation, Large Language Models, LLMs.
Language: Python - Size: 86.9 KB - Last synced at: 8 days ago - Pushed at: 3 months ago - Stars: 10 - Forks: 2

negar-foroutan/multiLMs-lang-neutral-subnets
[EMNLP 2022] Discovering Language-neutral Sub-networks in Multilingual Language Models.
Language: Python - Size: 831 KB - Last synced at: 11 months ago - Pushed at: about 1 year ago - Stars: 8 - Forks: 1

ramisa2108/Bangla-Complex-Named-Entity-Recognition-Challenge
Winning Solution for the Bangla Complex Named Entity Recognition Challenge - BDOSN NLP Hackathon 2023
Language: Jupyter Notebook - Size: 2.9 MB - Last synced at: about 1 year ago - Pushed at: about 2 years ago - Stars: 7 - Forks: 0

mobassir94/Multilingual-NLP-for-Islamic-Theology
Cross-lingual language models for building search engines for the Holy Quran and Sahih Hadiths
Language: Jupyter Notebook - Size: 151 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 7 - Forks: 0

MaLA-LM/mala-500
MaLA-500: Massive Language Adaptation of Large Language Models
Language: Python - Size: 97.7 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 4 - Forks: 0

aditi184/MultilingualQA
Chaii (Challenge in AI for India) Multilingual QnA - Google Research India
Language: Jupyter Notebook - Size: 26.4 KB - Last synced at: 3 months ago - Pushed at: over 3 years ago - Stars: 4 - Forks: 0

cambridgeltl/sail-bli
Self-Augmented In-Context Learning for Unsupervised Word Translation (ACL 2024). Keywords: Bilingual Lexicon Induction, Word Translation, Large Language Models, LLMs.
Language: Python - Size: 445 KB - Last synced at: 8 days ago - Pushed at: 9 months ago - Stars: 3 - Forks: 1

longxudou/multispider
MultiSpider: Towards Benchmarking Multilingual Text-to-SQL Semantic Parsing
Language: Python - Size: 194 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 3 - Forks: 0

negar-foroutan/multilingual-code-switched-reasoning
[EMNLP 2023 - Findings] Breaking the Language Barrier: Improving Cross-Lingual Reasoning with Structured Self-Attention
Language: Python - Size: 41 KB - Last synced at: 11 months ago - Pushed at: over 1 year ago - Stars: 3 - Forks: 2

thesofakillers/CLAfICLe
Official repository for the paper "CLAfICLe: Cross-Lingual Adaptation for In-Context Learning". Not Published.
Language: TeX - Size: 13.9 MB - Last synced at: about 2 months ago - Pushed at: almost 2 years ago - Stars: 3 - Forks: 0

harmonydata/harmony_r
R library for Harmony. R package - open source tool using AI for psychology and mental health. Actively recruiting contributors.
Language: HTML - Size: 1.19 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 2 - Forks: 3

swaggy66/M-ABSA
M-ABSA: A Multilingual Dataset for Aspect-Based Sentiment Analysis
Language: Python - Size: 30.5 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 2 - Forks: 1

deokhk/CBP
Official Repository for Cross-lingual Back-Parsing: Utterance Synthesis from Meaning Representation for Zero-Resource Semantic Parsing (EMNLP 2024)
Language: Python - Size: 7.32 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 2 - Forks: 0

Helsinki-NLP/lm-vs-mt
A Comparison of Language Modeling and Translation as Multilingual Pretraining Objectives
Language: Python - Size: 1.15 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 1 - Forks: 0

sristhilamichhane/multilingo
A language learning app built with React, Material UI, and Node.js.
Language: JavaScript - Size: 257 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

harmonydata/harmony_original
The Harmony project
Language: Jupyter Notebook - Size: 2.77 MB - Last synced at: 2 months ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 1

dkalpakchi/quinductor
A multilingual data-driven method for generating reading comprehension questions
Language: Jupyter Notebook - Size: 7.21 MB - Last synced at: 27 days ago - Pushed at: almost 3 years ago - Stars: 1 - Forks: 0

AlokTheDataGuy/internship_projects
Multiple chatbots and NLP-based projects completed during my internship. Each project demonstrates different aspects of AI application development, from text summarization to multilingual chatbots.
Language: Python - Size: 4.53 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 0 - Forks: 0

GALA-MDS/Gala-External-Resources
This repository compiles the data sources created for the CHIST-ERA 2025 proposal GALA.
Language: Jupyter Notebook - Size: 70.9 MB - Last synced at: 18 days ago - Pushed at: 18 days ago - Stars: 0 - Forks: 0

BeaEsparcia/Spanish_Text_Classification_BERT
Spanish text classifier using BERT to detect user intent (information request, complaint, or recommendation). Includes synthetic training data and custom ambiguous examples to test robustness. Portfolio project focused on intent recognition and conversational design in Spanish.
Language: Jupyter Notebook - Size: 434 KB - Last synced at: 20 days ago - Pushed at: 20 days ago - Stars: 0 - Forks: 0

swaggy66/MSMO
Multi-Scale and Multi-Objective Optimization for Cross-Lingual Aspect-Based Sentiment Analysis
Language: Python - Size: 1.84 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 1

joyou159/SWIZT (fork of MohamedAlaaAli/SWIZT)
Exploring the use of multilingual transformers, specifically mBERT and XLM-RoBERTa, for named entity recognition (NER) in the context of Switzerland's multilingual environment.
Language: Jupyter Notebook - Size: 1.35 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

Lucas-Granucci/MULTI-NER-OLD 📦
Repository for a research project exploring the benefits of cross-lingual transfer learning and pseudo-labeling for multilingual named entity recognition.
Language: Python - Size: 312 KB - Last synced at: 5 days ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

pintamonas4575/NLP-text-detox-MAADM-UPM
NLP for detoxing language phrases in several languages.
Language: Jupyter Notebook - Size: 8.82 MB - Last synced at: 19 days ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

koushik16/Naive-Bayes-on-Multi-Language-Text
Implementation of Naive Bayes for text classification across multiple languages, focusing on natural language processing and multilingual text analysis.
Language: Python - Size: 2.93 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

Wei-RongRong2/RojakLanguageSentimentAnalysis
This is a machine learning project focused on analysing and classifying sentiments in code-switched and code-mixed text, specifically targeting the unique linguistic characteristics found in Malaysian conversations.
Language: Jupyter Notebook - Size: 20.6 MB - Last synced at: 21 days ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

CristianBudala/Multilingual-Sentiment-Analysis-and-Intent-Classification
Multilingual sentiment analysis and intent classification in Romanian (bachelor's thesis)
Language: Jupyter Notebook - Size: 837 KB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

Rajarshi1001/IITK-SemEval-2024-Task-1
SemEval task 1: Semantic Textual Relatedness for the course CS779A
Language: Jupyter Notebook - Size: 9.92 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

muhammadravi251001/multilingual-qas-with-nli
Code Repository for Paper: Multilingual Question Answering System Utilizing Natural Language Inference.
Language: Jupyter Notebook - Size: 1.35 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

AnanthaRajuC/AIML_NLP
AIML Natural Language Processing - Speech, Audio
Language: Java - Size: 4.4 MB - Last synced at: 11 months ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

Judy-Choi/NMT_Series
A collection of code from an NMT series written for Geultto 8th
Language: Jupyter Notebook - Size: 5.53 MB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

e-hossam96/CMU-CS11-737
Solutions for the CMU Multilingual Natural Language Processing course
Language: Shell - Size: 120 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 1
