An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: mbert

Soumyo001/sentiment-emotion_detection_on_bengali_product_reviews

Sentiment and emotion detection using mBERT and XLM-R. It comes with a trained model that you can download and test. Read below for instructions.

Language: Jupyter Notebook - Size: 15.6 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

lirondos/lazaro

An observatory of anglicism usage in the Spanish press

Language: Python - Size: 122 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 10 - Forks: 2

cambridgeltl/ContrastiveBLI

Improving Word Translation via Two-Stage Contrastive Learning (ACL 2022). Keywords: Bilingual Lexicon Induction, Word Translation, Cross-Lingual Word Embeddings.

Language: Python - Size: 7.72 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 34 - Forks: 10

joyou159/SWIZT Fork of MohamedAlaaAli/SWIZT

Exploring the use of multilingual transformers, specifically mBERT and XLM-RoBERTa, for named entity recognition (NER) in the context of Switzerland’s multilingual environment.

Language: Jupyter Notebook - Size: 1.35 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

fatemafaria142/BanglaCalamityMMD-A-Comprehensive-Benchmark-Dataset-for-Multimodal-Disaster-Identification Fork of Mukaffi28/BanglaCalamityMMD-A-Comprehensive-Benchmark-Dataset-for-Multimodal-Disaster-Identification

This study presents a novel multimodal fusion technique for disaster identification in Bangla, combining text and image data using the "BanglaCalamityMMD" dataset. Employing DisasterTextNet, DisasterImageNet, and DisasterMultFusionNet, the approach addresses a key gap in Bangla disaster research.

Language: Jupyter Notebook - Size: 290 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

Ali-Mhrez/Stance-Detection-MBERT-Features

This repository provides the implementation and results of experiments using Multilingual BERT (mBERT) features as input to CNN and LSTM architectures for the task of Stance Detection. We explore two approaches to feature extraction: using the final layer, and concatenating the features from the last four layers.

Language: Jupyter Notebook - Size: 223 KB - Last synced at: 17 days ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0
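The two feature-extraction approaches named in the Stance-Detection-MBERT-Features description can be illustrated with a minimal sketch. This is not the repository's code: the function names, the nested-list representation of hidden states, and the concatenation order (earliest of the four layers first) are all assumptions made for illustration, and a real setup would take the per-layer hidden states from an mBERT encoder rather than from toy data.

```python
from typing import List

Vector = List[float]
Layer = List[Vector]  # one vector per token

def final_layer_features(hidden_states: List[Layer]) -> Layer:
    """Return a copy of the per-token vectors of the last layer."""
    return [list(vec) for vec in hidden_states[-1]]

def last_four_concat_features(hidden_states: List[Layer]) -> Layer:
    """Concatenate, per token, the vectors of the last four layers.

    Assumption: layers are joined in their original order, so the
    final layer's values come last in each concatenated vector.
    """
    if len(hidden_states) < 4:
        raise ValueError("need at least four layers")
    last_four = hidden_states[-4:]
    num_tokens = len(last_four[0])
    return [
        [value for layer in last_four for value in layer[t]]
        for t in range(num_tokens)
    ]

# Hypothetical toy input: 5 layers, 2 tokens, hidden size 3.
toy = [[[float(layer * 10 + tok)] * 3 for tok in range(2)] for layer in range(5)]

final = final_layer_features(toy)          # 2 tokens x 3 values
concat = last_four_concat_features(toy)    # 2 tokens x 12 values
```

With a hidden size of 3, the final-layer approach yields 3 values per token and the concatenation approach yields 12, which is the input width a downstream CNN or LSTM would need to accept in each case.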

Ali-Mhrez/Stance-Detection-MLLM

This repository provides code to fine-tune four multi-lingual language models (MBERT, XLM-RoBERTa, DistilmBERT, and MDeBERTa) on AraStance dataset (Alhindi et al., 2021). The repository includes notebooks for training, evaluation, and making predictions with the fine-tuned models.

Language: Jupyter Notebook - Size: 74.2 KB - Last synced at: 24 days ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

jessicasaikia/multilingual-BERT-mBERT

This repository implements a Multilingual BERT (mBERT) model for performing Parts-of-Speech (POS) Tagging on Assamese-English code-mixed texts.

Language: Python - Size: 11.7 KB - Last synced at: 26 days ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

fatemafaria142/MultiBanFakeDetect-An-Extensive-Benchmark-Dataset-for-Multimodal-Bangla-Fake-News-Detection

This study introduces MultiBanFakeDetect, a novel multimodal dataset for Bangla fake news detection, combining textual and visual information. It features TextFakeNet for text analysis and MultiFusionFake for integrating multimodal data.

Language: Jupyter Notebook - Size: 308 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 4 - Forks: 1

csebuetnlp/banglabert

This repository contains the official release of the model "BanglaBERT" and associated downstream finetuning code and datasets introduced in the paper titled "BanglaBERT: Language Model Pretraining and Benchmarks for Low-Resource Language Understanding Evaluation in Bangla" accepted in Findings of the Annual Conference of the North American Chapter of the Association for Computational Linguistics: NAACL-2022.

Language: Python - Size: 1.14 MB - Last synced at: 9 months ago - Pushed at: over 2 years ago - Stars: 230 - Forks: 31

DiFronzo/Multilingual-Models

mBERT and XLM-R for encoding of Scandinavian languages

Language: Python - Size: 518 KB - Last synced at: 12 months ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 0

MusfiqDehan/Multilingual-Sentence-Alignments-Demo

Align parallel sentences in 104 languages with the help of mBERT and LaBSE

Language: Python - Size: 33.2 KB - Last synced at: 2 days ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

Revanth-Reddy-Pingala/Abusive_Comment_Detector_BERT

Fine-tuned BERT, mBERT and XLMRoBERTa for Abusive Comments Detection in Telugu, Code-Mixed Telugu and Telugu-English.

Language: Jupyter Notebook - Size: 55.7 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

Mukaffi28/Vashantor-A-Large-scale-Multilingual-Benchmark-Dataset

A Large-scale Multilingual Benchmark Dataset for Automated Translation of Bangla Regional Dialects to Bangla Language

Language: Jupyter Notebook - Size: 141 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 5 - Forks: 4

NasserMohamedEid/Text-AI-Detection

Language: Jupyter Notebook - Size: 64.1 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

mbruton0426/GalicianSRL

Collection of scripts used to create SRL datasets for Galician and Spanish using a verbal indexing method, as well as fine-tuned BERT and XLM-R models for SRL on each language

Language: Python - Size: 56.6 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

RobinSmits/GPT-3.5-FineTuning

GPT 3.5 FineTuning

Language: Jupyter Notebook - Size: 5.76 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

ShafakatArnob/Bengali-Misogyny-Identification-Deep-Learning-LIME

Bengali Misogyny Identification with Deep Learning and LIME.

Language: Jupyter Notebook - Size: 2.45 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

michaelpeterhoffmann/masterthesis

Multilingual hate speech detection for German, Italian and Spanish Social Media Posts #machine learning #classifier

Language: Jupyter Notebook - Size: 460 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

negar-foroutan/multiLMs-lang-neutral-subnets

[EMNLP 2022] Discovering Language-neutral Sub-networks in Multilingual Language Models.

Language: Python - Size: 831 KB - Last synced at: 11 months ago - Pushed at: about 1 year ago - Stars: 8 - Forks: 1

Koharu24/mBERT_crosslingual_rd Fork of SalamanderXing/mBERT_crosslingual_rd

This is a project proposal to implement Yan et al.'s (2020) mBERT-Unaligned for cross-lingual RDs with Japanese, German and Italian untranslatable terms

Language: Python - Size: 30.9 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

peterzee-tsien/LING484-COMP599-Final-Projects

By using the hypothesis of historical linguistics, we found a way to improve the performance of multilingual transformers with a limited amount of data

Language: Jupyter Notebook - Size: 1.88 MB - Last synced at: over 1 year ago - Pushed at: almost 3 years ago - Stars: 1 - Forks: 0

Koharu24/mBERT-Unaligned-fine-tuning-for-a-cross-lingual-RD-of-untranslatable-terms

This is a project proposal to implement Yan et al.'s (2020) mBERT-Unaligned for cross-lingual RDs with Japanese, German and Italian untranslatable terms

Size: 685 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

juletx/multilingual-question-answering

Zero-shot and Translation Experiments on XQuAD, MLQA and TyDiQA

Language: Jupyter Notebook - Size: 6.34 MB - Last synced at: over 1 year ago - Pushed at: almost 3 years ago - Stars: 3 - Forks: 0

honghanhh/definition_extraction

Slovenian Definition Extraction

Language: Python - Size: 3.21 MB - Last synced at: almost 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

BassaniRiccardo/ICEBERT Fork of huggingface/transformers

ICEBERT: Interlingual-Clusters Enhanced BERT. A BERT-like model trained on clusters of monolingual subwords.

Language: Python - Size: 184 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 2 - Forks: 0

ishan00/meta-learning-for-multi-task-multilingual

Official Repository for the paper titled "Meta-Learning for Effective Multi-task and Multilingual Modelling" accepted at EACL 2021

Language: Python - Size: 92.8 KB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 7 - Forks: 2

AditiBagora/Hasoc2021CodeMix

HASOC2021: Subtask 2 a) Codemix Challenge; contains baselines and a hierarchical approach that extracts the relevant context useful for classification of hostile tweets on English-Hindi code-mix data obtained from Twitter.

Language: Jupyter Notebook - Size: 26.4 MB - Last synced at: almost 2 years ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

elsheikh21/cross-natural-language-inference

ZeroShot XNLI

Language: Python - Size: 1.65 MB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 2 - Forks: 0

Elijas/lithuanian-text-summarization-model

Deployed model which can summarize Lithuanian language text by leveraging Artificial Neural Networks, Transformers, mBERT.

Language: Python - Size: 37.1 KB - Last synced at: about 1 month ago - Pushed at: almost 4 years ago - Stars: 1 - Forks: 0

Related Keywords
mbert 30 xlm-roberta 10 transformers 7 pytorch 6 bert 6 multilingual-bert 5 fine-tuning 5 nlp 4 nlp-machine-learning 3 mt5 3 xlm-r 3 machine-translation 3 bert-fine-tuning 2 bengali-nlp 2 benchmark 2 stance-detection 2 pos-tagger 2 banglabert 2 reverse-dictionary 2 arabic-language 2 question-answering 2 multilingual-nlp 2 mdistilbert 2 multilingual-models 2 named-entity-recognition 2 natural-language-inference 2 streamlit 2 spanish 2 large-language-models 2 cross-lingual-transfer 2 multilingual 2 python 2 torch 2 deep-learning 2 dataset 2 bengali-sexism 1 crosslingual 1 bengali-misogyny 1 deep-neural-networks 1 explainable-ai 1 f1-score 1 lime 1 misogyny-detection 1 zero-shot 1 pretrained-language-models 1 bangla-bert-base 1 banglat5 1 neural-machine-translation 1 regional-dialects 1 arabert 1 llm 1 roberta-model 1 galician 1 semantic-parsing 1 semantic-role-labeling 1 srl-parser 1 deberta-v3 1 dutch-language 1 gpt-3-5-turbo 1 gpt-35-turbo 1 openai-api 1 prompt-engineering 1 accuracy 1 binary-classifier 1 language-models 1 rule-based-classifier 1 slovenian 1 xlmr 1 clustering 1 subword-segmentation 1 tokenization 1 meta-learning 1 multi-task-learning 1 paraphrase-identification 1 part-of-speech-tagging 1 reptile 1 feature-extraction 1 mlp 1 tensorflow 1 xlm 1 xnli 1 ann 1 language-model 1 summarization 1 sexism-detection 1 svm-classifier 1 transfer-learning 1 transformer 1 xlmroberta 1 lottery-ticket-hypothesis 1 multilingual-language-models 1 untranslatability 1 ner 1 swahili 1 wolof 1 yoruba 1 cross-linguistic-data 1 unaligned 1 mlqa 1 roberta 1