GitHub topics: multilingual-models
Astrotomic/laravel-translatable
A Laravel package for multilingual models
Language: PHP - Size: 3.15 MB - Last synced at: 2 days ago - Pushed at: 3 days ago - Stars: 1,311 - Forks: 170

MilaNLProc/contextualized-topic-models
A Python package to run contextualized topic modeling. CTMs combine contextualized embeddings (e.g., BERT) with topic models to get coherent topics. Published at EACL and ACL 2021 (Bianchi et al.).
Language: Python - Size: 32 MB - Last synced at: 4 days ago - Pushed at: 3 months ago - Stars: 1,228 - Forks: 152

anidixit64/LexicaForge
LexicaForge is a comprehensive natural language processing (NLP) toolkit designed for multilingual text analysis and processing. It provides a robust set of tools for text preprocessing, language detection, tokenization, and advanced NLP tasks, with a focus on scalability and performance.
Size: 179 KB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 0 - Forks: 0

ZEZE1020/Lake-Guard
Lake Guard is a Flask-based application designed to monitor and protect the ecosystem of Lake Victoria. The project utilizes advanced technologies, including the Vambo Multilingual API, to provide real-time environmental data and insights.
Language: HTML - Size: 241 KB - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 0 - Forks: 0

linto-ai/whisper-timestamped
Multilingual Automatic Speech Recognition with word-level timestamps and confidence
Language: Python - Size: 4.49 MB - Last synced at: 18 days ago - Pushed at: about 1 month ago - Stars: 2,368 - Forks: 181

floatai/HumanEval-XL
[LREC-COLING'24] HumanEval-XL: A Multilingual Code Generation Benchmark for Cross-lingual Natural Language Generalization
Language: Python - Size: 8 MB - Last synced at: 19 days ago - Pushed at: 2 months ago - Stars: 39 - Forks: 4

OpenNyAI/Jugalbandi-Manager
Jugalbandi (JB) Manager is a full AI-powered conversational chatbot platform. It's platform-agnostic and can serve multiple channels such as WhatsApp or custom web interfaces. It can handle conversations in both text and voice across any language. It comes with Bhashini Speech models out of the box and can fail over to Azure.
Language: Python - Size: 12 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 30 - Forks: 34

frotms/PaddleOCR2Pytorch
PaddleOCR inference in PyTorch. Converted from [PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR)
Language: Python - Size: 68.2 MB - Last synced at: about 1 month ago - Pushed at: 9 months ago - Stars: 933 - Forks: 180

nitya/model-mondays Fork of microsoft/model-mondays
Model Mondays is a weekly livestreamed series on Microsoft Reactor that helps you make informed model choice decisions with timely updates and model deep-dives. Watch live for the content. Join Discord for the discussions.
Language: Jupyter Notebook - Size: 6.73 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 1 - Forks: 0

microsoft/Litmus
AI Assistant for Building Reliable, High-performing and Fair Multilingual NLP Systems
Language: Python - Size: 8.13 MB - Last synced at: 4 days ago - Pushed at: over 2 years ago - Stars: 46 - Forks: 9

Ashish-Soni08/cohere-mcp-server
Cohere MCP Server
Language: Python - Size: 2.93 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

sail-sg/sailor2
Sailor2: Sailing in South-East Asia with Inclusive Multilingual LLMs
Size: 497 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 52 - Forks: 3

cisnlp/Glot500
Glot500: Scaling Multilingual Corpora and Language Models to 500 Languages -- ACL 2023
Language: Python - Size: 151 KB - Last synced at: 12 minutes ago - Pushed at: about 1 year ago - Stars: 100 - Forks: 4

ai-forever/mgpt
Multilingual Generative Pretrained Model
Language: Jupyter Notebook - Size: 6.18 MB - Last synced at: about 1 month ago - Pushed at: 12 months ago - Stars: 206 - Forks: 23

joaoaleite/ToLD-Br
Toxic Language Detection in Social Media for Brazilian Portuguese: New Dataset and Multilingual Analysis
Language: Jupyter Notebook - Size: 12.1 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 42 - Forks: 7

build-ai-applications/indic-llm-eval
Multilingual Performance Benchmarking
Language: Jupyter Notebook - Size: 565 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

cambridgeltl/sail-bli
Self-Augmented In-Context Learning for Unsupervised Word Translation (ACL 2024). Keywords: Bilingual Lexicon Induction, Word Translation, Large Language Models, LLMs.
Language: Python - Size: 445 KB - Last synced at: about 23 hours ago - Pushed at: 9 months ago - Stars: 3 - Forks: 1

cambridgeltl/prompt4bli
On Bilingual Lexicon Induction with Large Language Models (EMNLP 2023). Keywords: Bilingual Lexicon Induction, Word Translation, Large Language Models, LLMs.
Language: Python - Size: 86.9 KB - Last synced at: about 23 hours ago - Pushed at: 4 months ago - Stars: 10 - Forks: 2

sitamgithub-MSIT/PicQ
PicQ: Demo for MiniCPM-o 2.6 to answer questions about images using natural language.
Language: Python - Size: 4.74 MB - Last synced at: 5 days ago - Pushed at: 4 months ago - Stars: 4 - Forks: 1

sitamgithub-MSIT/VidiQA
VidiQA: Demo for MiniCPM-V 2.6 to answer questions about videos using natural language.
Language: Python - Size: 6.99 MB - Last synced at: 5 days ago - Pushed at: 6 months ago - Stars: 2 - Forks: 0

Blacksujit/ChatSphere
ChatSphere is an advanced conversational AI bot designed to engage users in natural and dynamic interactions. Utilizing the lightweight facebook/opt-125m model, ChatSphere delivers intelligent responses that enhance user experience and facilitate meaningful conversations. This innovative chatbot is engineered to understand context, sentiment, intent
Language: Python - Size: 163 KB - Last synced at: about 2 months ago - Pushed at: 6 months ago - Stars: 1 - Forks: 0

PyThaiNLP/WangChanGLM
WangChanGLM - The Multilingual Instruction-Following Model
Language: Jupyter Notebook - Size: 3.02 MB - Last synced at: 29 days ago - Pushed at: over 1 year ago - Stars: 94 - Forks: 7

kaistAI/LangBridge
[ACL 2024] LangBridge: Multilingual Reasoning Without Multilingual Supervision
Language: Python - Size: 3.73 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 71 - Forks: 8

azminewasi/AyaFestPe
Developing AyaFestPe, A Multi-lingual and Multi-cultural Festival Exploration Guide
Language: Jupyter Notebook - Size: 1.24 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 1 - Forks: 0

backprop-ai/backprop
Backprop makes it simple to use, finetune, and deploy state-of-the-art ML models.
Language: Python - Size: 5.46 MB - Last synced at: 16 days ago - Pushed at: about 4 years ago - Stars: 243 - Forks: 12

csebuetnlp/banglabert
This repository contains the official release of the model "BanglaBERT" and associated downstream finetuning code and datasets introduced in the paper titled "BanglaBERT: Language Model Pretraining and Benchmarks for Low-Resource Language Understanding Evaluation in Bangla" accepted in Findings of the Annual Conference of the North American Chapter of the Association for Computational Linguistics: NAACL-2022.
Language: Python - Size: 1.14 MB - Last synced at: 9 months ago - Pushed at: over 2 years ago - Stars: 230 - Forks: 31

firojalam/COVID-19-disinformation
Dataset: Fighting the COVID-19 Infodemic: Modeling the Perspective of Journalists, Fact-Checkers, Social Media Platforms, Policy Makers, and the Society
Language: Jupyter Notebook - Size: 875 KB - Last synced at: about 1 month ago - Pushed at: over 3 years ago - Stars: 11 - Forks: 4

lwachowiak/Multilingual-Metaphor-Detection
The multilingual language model XLM-R fine-tuned for token-level metaphor detection using Hugging Face
Language: Jupyter Notebook - Size: 2.37 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 19 - Forks: 5

XaverKrueckl/BaySIDshot
Master Thesis on Analyzing Slot and Intent Detection for Upper German Dialects via Zero-Shot Transfer Learning
Language: Python - Size: 3.14 MB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 0 - Forks: 0

SINGHxTUSHAR/ANUVADAK
This project performs multilingual translation using a Transformer with an encoder-decoder architecture, multi-head self-attention layers, positional encoding, and embeddings for better accuracy. The model translates English to French using various NLP and deep learning techniques.
Language: Jupyter Notebook - Size: 27.8 MB - Last synced at: 3 months ago - Pushed at: about 1 year ago - Stars: 2 - Forks: 0

aaaastark/NBART-Multilingual-Translator
This repository contains a Python script that uses a pre-trained NBART (Neural Bidirectional AutoRegressive Transformer) model to perform multi-lingual translation tasks between several languages. The model was trained on multiple language pairs using data parallelism, allowing it to learn representations across all languages simultaneously.
Language: Jupyter Notebook - Size: 20.5 KB - Last synced at: 2 months ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

uheal/machine-translation-models
This repository offers an evaluation of machine translation models for healthcare, focusing on languages like Telugu, Hindi, Arabic, and Swahili. It emphasizes accuracy and medical terminology, aiming to enhance medical communication across diverse languages. The dataset used in evaluation is provided.
Size: 62.5 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 2 - Forks: 0

WideSu/Vanilla-NER Fork of AmazingDD/Vanilla-NER
A multi-lingual named entity classifier to perform named entity recognition (NER) on two datasets, International: CoNLL 2003, Chinese: Weibo. We tested the current state-of-the-art model on the CoNLL++ dataset and achieved an F1-score of 94.3% with pooled embeddings.
Language: Jupyter Notebook - Size: 125 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

jpWang/LiLT
Official PyTorch implementation of LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understanding (ACL 2022)
Language: Python - Size: 1.36 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 282 - Forks: 34

TapasKumarDutta1/multilingial
Multilingual Text Classification Employing Both Single and Multilingual Models augmented using Soft Labels.
Language: Jupyter Notebook - Size: 2.09 MB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

LLM-low-resource-lang/LLM-low-resource-lang.github.io
LLMs for Low Resource Languages in Multilingual, Multimodal and Dialectal Settings
Language: HTML - Size: 37.4 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

INK-USC/XCSR
Code Repo for the ACL21 paper "Common Sense Beyond English: Evaluating and Improving Multilingual LMs for Commonsense Reasoning"
Language: Python - Size: 60.7 MB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 20 - Forks: 2

AI4Bharat/Indic-BERT-v1
Indic-BERT-v1: BERT-based Multilingual Model for 11 Indic Languages and Indian-English. For latest Indic-BERT v2, check: https://github.com/AI4Bharat/IndicBERT
Language: Python - Size: 600 KB - Last synced at: over 1 year ago - Pushed at: about 2 years ago - Stars: 259 - Forks: 41

fajri91/Multi_SummEval
Evaluating the Efficacy of Summarization Evaluation across Languages. In Findings of ACL 2021.
Language: Jupyter Notebook - Size: 65.6 MB - Last synced at: over 1 year ago - Pushed at: almost 4 years ago - Stars: 2 - Forks: 1

Sigil-Wen/TTS Fork of coqui-ai/TTS
XTTS: Multilingual Voice Cloning TTS Model by Coqui Deployed to Replicate
Language: Python - Size: 126 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 19 - Forks: 6

esoyeon/Multilingual-StyleCLIP
Multilingual-StyleCLIP is a model that can edit StyleGAN2's images with a multilingual text prompt
Language: Python - Size: 112 MB - Last synced at: over 1 year ago - Pushed at: almost 3 years ago - Stars: 2 - Forks: 1

juletx/self-translate
Do Multilingual Language Models Think Better in English?
Language: Jupyter Notebook - Size: 60.8 MB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 20 - Forks: 2

yzse/llm-multilingual-colexification
Studying multi-lingual representational similarities using Facebook's fastText and Large Language Models (LLMs) for cross-linguistic analysis.
Language: Jupyter Notebook - Size: 50.3 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

franciellevargas/Deceiver
Multilingual discourse-annotated dataset for fake news detection
Size: 693 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

Data-Science-kosta/Long-texts-Sentiment-Analysis-RoBERTa
PyTorch implementation of sentiment analysis of long texts written in Serbian (an underused language) using a pretrained multilingual RoBERTa-based model (XLM-R) on a small dataset.
Language: Jupyter Notebook - Size: 8.5 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 18 - Forks: 7

asimokby/formality-bias-analysis
This repo contains the annotations and other artifacts of the paper titled: In What Languages are Generative Language Models the Most Formal? Analyzing Formality Distribution across Languages
Size: 1.27 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

KnowledgeDiscovery/MuSES Fork of yvesx/MuSES
Code for "Multilingual Sentiment Elicitation System for Social Media Data" @ IEEE Intelligent Systems
Language: Python - Size: 1.37 MB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 2 - Forks: 1

MarkusSagen/Master-Thesis-Multilingual-Longformer
Master's thesis with code investigating methods for incorporating long-context reasoning in low-resource languages, without the need to pre-train from scratch. We investigated whether multilingual models could inherit these properties by converting them into an Efficient Transformer (such as the Longformer architecture).
Language: Jupyter Notebook - Size: 10.4 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 26 - Forks: 8

evelynkyl/xRAD_multilingual_dialog_systems
Code for a master's thesis investigating approaches for building a multilingual, knowledge-grounded dialogue system via cross-task and cross-lingual transfer learning.
Language: Python - Size: 5.01 MB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 1 - Forks: 0

mobassir94/Multilingual-Speech-to-Speech-Translator
Multilingual Speech to Speech (STS) Translator is the First Ever Code-mixed English-Arabic speech to Bangla-Arabic Speech Translator
Language: Jupyter Notebook - Size: 30.7 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 3 - Forks: 1

ishan00/meta-learning-for-multi-task-multilingual
Official Repository for the paper titled "Meta-Learning for Effective Multi-task and Multilingual Modelling" accepted at EACL 2021
Language: Python - Size: 92.8 KB - Last synced at: about 2 years ago - Pushed at: almost 4 years ago - Stars: 7 - Forks: 2

Priyanshiguptaaa/FilterMisalignedTranslationPairs
A model-based cleaner that uses LASER sentence embeddings to filter misaligned segment pairs. The product was scaled by building the task queues asynchronously, dispatching the tasks in round-robin order, and adding multiple workers on the RabbitMQ server for consumption.
Language: Python - Size: 263 KB - Last synced at: over 1 year ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

franciellevargas/Hurtlex Fork of valeriobasile/hurtlex
A multilingual lexicon of words to hurt.
Size: 6.24 MB - Last synced at: about 2 years ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

vitthal-bhandari/Homophobia-Transphobia-Detection
Code for the shared task on homophobia/transphobia detection at LT-EDI Workshop @ ACL 2022
Language: Jupyter Notebook - Size: 2.24 MB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 1 - Forks: 1

margaritageleta/multilingual-toxicity-detector
NLP deep learning model for multilingual toxicity detection in text
Language: Jupyter Notebook - Size: 3.74 MB - Last synced at: about 2 years ago - Pushed at: almost 5 years ago - Stars: 4 - Forks: 0

Swayatta/Unsupervised-Cross-lingual-Alignment-of-Knowledge-Base-Triples-with-Sentences
Cross-lingual alignment model for creating an aligned corpus of Hindi sentences aligned with English fact triples.
Language: Jupyter Notebook - Size: 2.8 MB - Last synced at: about 2 years ago - Pushed at: almost 4 years ago - Stars: 0 - Forks: 0

wesleykwong/Myers-Brigg-Classification
BERT classification of Myers-Briggs personality types based on tweets in four different European languages.
Language: Jupyter Notebook - Size: 16.5 MB - Last synced at: about 2 years ago - Pushed at: about 4 years ago - Stars: 0 - Forks: 0
