An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: roberta

roshana1s/spam-message-classifier

A state-of-the-art spam message classifier built with the RoBERTa transformer model, fine-tuned on multiple SMS spam datasets.

Language: Jupyter Notebook - Size: 1.38 MB - Last synced at: 42 minutes ago - Pushed at: about 2 hours ago - Stars: 2 - Forks: 0

Jagadish2494/Mumbai_Hacks

🌐 Combat misinformation with autonomous AI that detects, verifies, and responds to viral content in real-time during crises.

Language: JavaScript - Size: 33.2 KB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 0 - Forks: 0

hakant66/moderation-app

End-to-end moderation system for text & images. FastAPI backend with provider toggles (OpenAI & Google Perspective/Vision), PyTorch TorchScript prefilter microservice, policy.yaml rules, token-bucket rate limiting, in-memory caching, React/Vite frontend, training scripts, and Kubernetes manifests.

Language: Python - Size: 136 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 1 - Forks: 0

AL1218AL/oxidized

⚙️ Build a modern Vim/Neovim experience in Rust, focusing on performance, Unicode support, and an open framework for collaboration and growth.

Language: Rust - Size: 1.58 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 0 - Forks: 0

Lamorati92/LLMs-from-scratch

📚 Build and train your own GPT-like Large Language Model from scratch with clear guidance and real code examples.

Language: Jupyter Notebook - Size: 12.7 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 1 - Forks: 0

Jon-Sina/Benchmark_Embedding_Models

🔍 Benchmark embedding models by creating custom datasets to evaluate and compare their performance effectively.

Language: Shell - Size: 2.58 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 0 - Forks: 0

enessah00/adaptive-classifier

A flexible, adaptive classification system for dynamic text classification

Size: 1.95 KB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 1 - Forks: 0

aboGrabawy/sentiment-ai-suite

💬 Analyze sentiment in real time with this AI-powered web app, offering an intuitive interface and robust model integration for text analysis.

Language: Python - Size: 6.84 KB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 0 - Forks: 0

Adeyemi0/Fintech-Review-Prediction

Fintech Text Classification Model

Language: Jupyter Notebook - Size: 12.8 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 0 - Forks: 0

ashbix23/Misinformation-Classifier

Advanced NLP MLOps pipeline for misinformation detection, utilizing RoBERTa with LoRA (PEFT) for efficient fine-tuning. This project focuses on cross-domain generalization across the FakeNews-Kaggle and LIAR datasets, featuring robust data engineering, mixed-precision training, and comprehensive metric evaluation.

Language: Jupyter Notebook - Size: 162 KB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 0 - Forks: 0

microsoft/LoRA

Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"

Language: Python - Size: 33.3 MB - Last synced at: 8 days ago - Pushed at: 11 months ago - Stars: 12,841 - Forks: 851

tae898/erc

The official implementation of "EmoBERTa: Speaker-Aware Emotion Recognition in Conversation with RoBERTa"

Language: Jupyter Notebook - Size: 124 MB - Last synced at: 9 days ago - Pushed at: almost 2 years ago - Stars: 98 - Forks: 26

fhamborg/news-please

news-please - an integrated web crawler and information extractor for news that just works

Language: Python - Size: 2.99 MB - Last synced at: 7 days ago - Pushed at: about 1 month ago - Stars: 2,340 - Forks: 448

amansrivastava17/embedding-as-service

One-stop solution to encode sentences into fixed-length vectors using various embedding techniques

Language: Python - Size: 1.93 MB - Last synced at: 8 days ago - Pushed at: over 2 years ago - Stars: 210 - Forks: 32

microsoft/DeBERTa

The implementation of DeBERTa

Language: Python - Size: 237 KB - Last synced at: 8 days ago - Pushed at: about 2 years ago - Stars: 2,164 - Forks: 237

lonePatient/awesome-pretrained-chinese-nlp-models

Awesome Pretrained Chinese NLP Models: a collection of high-quality Chinese pretrained models, large models, multimodal models, and large language models

Language: Python - Size: 909 KB - Last synced at: 13 days ago - Pushed at: 23 days ago - Stars: 5,425 - Forks: 507

ddihora1604/Mumbai_Hacks

A multi-agent AI system for autonomous real-time detection, verification, and response to viral misinformation on social media during crises.

Language: JavaScript - Size: 31.3 KB - Last synced at: 16 days ago - Pushed at: 17 days ago - Stars: 0 - Forks: 0

IlPakoZ/dlrnaberta-dti-prediction

A RoBERTa model is pretrained on RNA sequences through an MLM task and used in conjunction with ChemBERTa to predict the binding of a drug to an RNA-based target.

Language: Python - Size: 2.09 MB - Last synced at: 16 days ago - Pushed at: 17 days ago - Stars: 0 - Forks: 0

BotanicalAmy/ConsumerComplaints

NLP analysis of consumer complaints narratives in the financial industry

Language: Jupyter Notebook - Size: 2.8 MB - Last synced at: 17 days ago - Pushed at: 18 days ago - Stars: 1 - Forks: 0

deepset-ai/FARM 📦

🏡 Fast & easy transfer learning for NLP. Harvesting language models for the industry. Focus on Question Answering.

Language: Python - Size: 6.97 MB - Last synced at: 11 days ago - Pushed at: almost 2 years ago - Stars: 1,752 - Forks: 250

guillaume-be/rust-bert

Rust native ready-to-use NLP pipelines and transformer-based models (BERT, DistilBERT, GPT2,...)

Language: Rust - Size: 3.92 MB - Last synced at: 24 days ago - Pushed at: 5 months ago - Stars: 2,958 - Forks: 235

microsoft/AdaMix

This is the implementation of the paper AdaMix: Mixture-of-Adaptations for Parameter-efficient Model Tuning (https://arxiv.org/abs/2205.12410).

Language: Python - Size: 16.7 MB - Last synced at: 15 days ago - Pushed at: about 2 years ago - Stars: 135 - Forks: 10

PRISM-AILAB/MFNR

Official implementation of "A BERT-Based Multi-Embedding Fusion Method Using Review Text for Recommendation" (Expert Systems, 2025)

Language: Python - Size: 38.8 MB - Last synced at: 26 days ago - Pushed at: 26 days ago - Stars: 0 - Forks: 0

haoliuhl/language-quantized-autoencoders

Language Quantized AutoEncoders

Language: Python - Size: 37.1 KB - Last synced at: 1 day ago - Pushed at: over 2 years ago - Stars: 110 - Forks: 5

BrightBlueCheese/transformers_and_chemistry

The Role of Model Architecture and Scale in Predicting Molecular Properties: Insights from Fine-Tuning RoBERTa, BART, and LLaMA

Language: Jupyter Notebook - Size: 414 MB - Last synced at: 6 days ago - Pushed at: over 1 year ago - Stars: 5 - Forks: 1

codelion/adaptive-classifier

A flexible, adaptive classification system for dynamic text classification

Language: Python - Size: 5.05 MB - Last synced at: 28 days ago - Pushed at: 28 days ago - Stars: 467 - Forks: 31

ymcui/Chinese-BERT-wwm

Pre-Training with Whole Word Masking for Chinese BERT (Chinese BERT-wwm model series)

Language: Python - Size: 15.5 MB - Last synced at: 28 days ago - Pushed at: 4 months ago - Stars: 10,084 - Forks: 1,393

asyml/texar-pytorch

Integrating the Best of TF into PyTorch, for Machine Learning, Natural Language Processing, and Text Generation. This is part of the CASL project: http://casl-project.ai/

Language: Python - Size: 3.08 MB - Last synced at: 26 days ago - Pushed at: over 3 years ago - Stars: 747 - Forks: 113

HSaurabh0919/CTransformers

Implementing a wide variety of transformers, fine-tuning them, and trying architectural variants from various research papers and blogs.

Language: Jupyter Notebook - Size: 20.8 MB - Last synced at: 29 days ago - Pushed at: 29 days ago - Stars: 3 - Forks: 1

oxidized-transformers/oxidized-transformers

Modular Rust transformer/LLM library using Candle

Language: Rust - Size: 142 KB - Last synced at: 6 days ago - Pushed at: over 1 year ago - Stars: 36 - Forks: 3

explosion/curated-transformers

🤖 A PyTorch library of curated Transformer models and their composable components

Language: Python - Size: 1.47 MB - Last synced at: 6 days ago - Pushed at: over 1 year ago - Stars: 893 - Forks: 35

styfeng/TinyDialogues

Code & data for the EMNLP 2024 paper: Is Child-Directed Speech Effective Training Data for Language Models?

Language: Python - Size: 279 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 11 - Forks: 2

Md-Emon-Hasan/InformaTruth

Fine-tuned roberta-base classifier on the LIAR dataset. Accepts multiple input types (text, URLs, and PDFs) and outputs a prediction with a confidence score. It also leverages google/flan-t5-base to generate explanations and uses an Agentic AI with LangGraph to orchestrate agents for planning, retrieval, execution, fallback, and reasoning.

Language: Jupyter Notebook - Size: 9.61 MB - Last synced at: 24 days ago - Pushed at: about 1 month ago - Stars: 1 - Forks: 1

iflytek/MiniRBT

MiniRBT (a series of small Chinese pretrained models)

Language: Python - Size: 17.4 MB - Last synced at: 28 days ago - Pushed at: 4 months ago - Stars: 295 - Forks: 18

Fatima0923/Personality-Prediction-with-LLMs

Code and experiments for the paper “Navigating Pathways to Automated Personality Prediction.” Implements NLP pipelines with Hugging Face models (RoBERTa, ALBERT, DistilBERT) to predict Big Five traits, comparing accuracy, efficiency, and sustainable AI trade-offs for marketing and consumer research.

Language: Jupyter Notebook - Size: 2.4 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

mim-solutions/bert_for_longer_texts

BERT classification model for processing texts longer than 512 tokens. Text is first divided into smaller chunks and after feeding them to BERT, intermediate results are pooled. The implementation allows fine-tuning.

Language: Python - Size: 4.43 MB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 145 - Forks: 32

ShaikhBorhanUddin/Transformer-Based-Sentiment-Analysis-of-Simmons-Bar-Google-Maps-Reviews

This project performs sentiment analysis on customer reviews from all branches of Simmons Bar collected via Google Maps. Reviews were scraped using Apify and Outscraper, then processed with transformer models RoBERTa and DeBERTa for classification of sentiments.

Size: 8.4 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

Betswish/Cross-Lingual-Consistency

Easy-to-use framework for evaluating cross-lingual consistency of factual knowledge (supports LLaMA, BLOOM, mT5, RoBERTa, etc.). Paper: https://aclanthology.org/2023.emnlp-main.658/

Language: Python - Size: 16 MB - Last synced at: 2 days ago - Pushed at: 3 months ago - Stars: 26 - Forks: 1

ROIM1998/APT

[ICML'24 Oral] APT: Adaptive Pruning and Tuning Pretrained Language Models for Efficient Training and Inference

Language: Python - Size: 4.08 MB - Last synced at: 4 days ago - Pushed at: over 1 year ago - Stars: 45 - Forks: 2

nlbse2023/issue-report-classification

NLBSE'23 Tool Competition on Issue Report Classification

Language: Jupyter Notebook - Size: 163 KB - Last synced at: 21 days ago - Pushed at: almost 3 years ago - Stars: 3 - Forks: 3

rupsu98-sys/Metaphor-Project

Metaphor detection with DistilRoBERTa — bridging NLP, cognitive semantics, and literary analysis. Fine-tuned RoBERTa model for classifying metaphor vs. literal language in text. Exploring how AI understands metaphor, connecting deep learning with human cognition.

Language: Jupyter Notebook - Size: 80.1 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

Meah01/google-play-absa-and-servqual-llm-pipeline

Hybrid ABSA sentiment analysis pipeline with Mistral 7B LLM integration for business intelligence. Multi-platform e-commerce review analysis with interactive dashboards.

Language: Python - Size: 2.12 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 1 - Forks: 0

sudharsan13296/Getting-Started-with-Google-BERT

Build and train state-of-the-art natural language processing models using BERT

Language: Jupyter Notebook - Size: 17 MB - Last synced at: about 1 month ago - Pushed at: over 4 years ago - Stars: 226 - Forks: 84

CLUEbenchmark/CLUE

Chinese Language Understanding Evaluation Benchmark: datasets, baselines, pre-trained models, corpus and leaderboard

Language: Python - Size: 2.43 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 4,184 - Forks: 547

sambhu431/Multifunctional-ChatBot-Fine-Tuned-Using-Roberta-Bart-Transformers

The project contains code and resources for a sophisticated AI-driven chatbot designed to provide accurate, context-aware responses. It uses RoBERTa and BART Transformers and advanced NLP techniques. The chatbot is capable of handling a wide range of domains such as healthcare, finance, etc.

Language: Jupyter Notebook - Size: 32.3 MB - Last synced at: 20 days ago - Pushed at: 8 months ago - Stars: 1 - Forks: 0

adapter-hub/efficient-task-transfer

Research code for "What to Pre-Train on? Efficient Intermediate Task Selection", EMNLP 2021

Language: Python - Size: 98.6 KB - Last synced at: about 2 months ago - Pushed at: almost 4 years ago - Stars: 37 - Forks: 4

NakerTheFirst/Sentiment-analysis

Analyse social media sentiment of OpenAI using LinkedIn data with NLP and transfer learning

Language: Python - Size: 20.2 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 1 - Forks: 0

Avaneesh40585/Email-Spam-Classification

Advanced email spam detection system featuring both BERT (DistilBERT) and optimized RoBERTa implementations. RoBERTa version includes anti-overfitting techniques (layer freezing, progressive dropout, early stopping), smart model detection, and achieves 99.48% accuracy with production-ready inference capabilities.

Language: Jupyter Notebook - Size: 2.83 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

kidist-amde/amharic-ir-benchmarks

Official codebase for the ACL 2025 Findings paper: Optimized Text Embedding Models and Benchmarks for Amharic Passage Retrieval.

Language: Jupyter Notebook - Size: 3.63 MB - Last synced at: about 2 months ago - Pushed at: 3 months ago - Stars: 19 - Forks: 6

dpressel/mint

MinT: Minimal Transformer Library and Tutorials

Language: Python - Size: 123 KB - Last synced at: about 2 months ago - Pushed at: over 3 years ago - Stars: 257 - Forks: 15

newking9088/gpt_llama_rag_fine_tuning_classification

A repository for implementing and evaluating state-of-the-art LLM techniques including fine-tuning, Retrieval-Augmented Generation (RAG), and model evaluation.

Language: Jupyter Notebook - Size: 22.7 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

EricFillion/happy-transformer

Happy Transformer makes it easy to fine-tune and perform inference with NLP Transformer models.

Language: Python - Size: 18.8 MB - Last synced at: 2 months ago - Pushed at: 7 months ago - Stars: 539 - Forks: 69

ai-forever/model-zoo

NLP model zoo for Russian

Size: 22.5 MB - Last synced at: about 2 months ago - Pushed at: over 4 years ago - Stars: 48 - Forks: 2

infinitygabri/beginner-code-lab

Beginner Code Lab is a multi-language coding playground for those starting their coding journey. 🐙 Dive into web development, backend programming, or mobile app creation and enjoy hands-on practice in a supportive environment. 🌱

Language: TypeScript - Size: 289 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0

Cyber-Mood/Emotion-Detection-Project

Multi-Label Emotion Detection based on the GoEmotions dataset with ML models and Transformers.

Language: Jupyter Notebook - Size: 21.1 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

920232796/bert_seq2seq

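PyTorch implementation of BERT for seq2seq tasks using the UniLM scheme. It now also handles automatic summarization, text classification, sentiment analysis, NER, part-of-speech tagging, and other tasks, and supports the T5 model as well as GPT2 for article continuation.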

Language: Python - Size: 3.45 MB - Last synced at: 2 months ago - Pushed at: over 3 years ago - Stars: 1,300 - Forks: 211

920232796/bert_seq2seq_DDP

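DDP version of bert_seq2seq. Supports models such as bert, roberta, nezha, t5, and gpt2, and tasks such as seq2seq, NER, and relation extraction. Launches DDP multi-GPU training easily with no extra code.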

Language: Python - Size: 312 KB - Last synced at: 6 days ago - Pushed at: about 3 years ago - Stars: 53 - Forks: 5

science-analyse/Named_Entity_Recognition

Named Entity Recognition Model for Azerbaijani Language

Language: Jupyter Notebook - Size: 16.6 MB - Last synced at: about 1 month ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

eduagarcia/roberta-legal-portuguese

Related resources to the paper RoBERTaLexPT: A Legal RoBERTa Model pretrained with deduplication for Portuguese.

Size: 48.8 KB - Last synced at: about 2 months ago - Pushed at: over 1 year ago - Stars: 20 - Forks: 0

hrnrxb/Advanced-Sentiment-Classifier-RoBERTa-BiLSTM-Attention

Sentiment Analysis w/ RoBERTa + BiLSTM + Attention

Language: Jupyter Notebook - Size: 139 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 3 - Forks: 0

tanujgupta18/Sentiment-Analysis-of-Amazon-Reviews

Sentiment analysis of Amazon customer reviews using VADER and RoBERTa Transformers. Compares the results of both approaches to understand their differences and insights.

Language: Jupyter Notebook - Size: 2.55 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

labteral/ernie

Simple State-of-the-Art BERT-Based Sentence Classification with Keras / TensorFlow 2. Built with HuggingFace's Transformers.

Language: Python - Size: 326 KB - Last synced at: 2 months ago - Pushed at: over 1 year ago - Stars: 201 - Forks: 31

MON3EMPASHA/Question-Answering-with-Transformers

Interactive Web app for multilingual question answering using state-of-the-art Transformer models (BERT, DistilBERT, RoBERTa, DeBERTa, and multilingual BERT) for English and Arabic. Compare answers, try sample data, and explore chatbot-ready features.

Language: Python - Size: 11.7 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

utkarsh-tekriwal/DocuMitra

AI Based Smart Assistant for Research Summarization

Language: Python - Size: 482 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 1 - Forks: 0

Samuela31/Sanskrit-Manuscripts-Revival-Using-Deep-Learning-Techniques

Restoring destroyed text in ancient Sanskrit manuscripts by predicting missing text using deep learning techniques. Mini project done in 3rd year of college using RoBERTa LLM, Tesseract OCR, and OpenCV.

Language: Jupyter Notebook - Size: 17.7 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

lonePatient/CLUE_pytorch

CLUE baselines in PyTorch (the PyTorch version of the CLUE baselines)

Language: Python - Size: 340 KB - Last synced at: 15 days ago - Pushed at: over 5 years ago - Stars: 75 - Forks: 17

DataAnalystAcc/yelp_user_reviews

Capstone project analyzing Yelp reviews to identify high-potential business ideas using sentiment analysis and topic modeling.

Language: Jupyter Notebook - Size: 3.99 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

Sourav-Grover/Sentiment-Analysis

NLP-based Sentiment Analysis using VADER and RoBERTa

Language: Jupyter Notebook - Size: 148 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

OpenRoberta/openroberta-lab

The programming environment »Open Roberta Lab« by Fraunhofer IAIS enables children and adolescents to program robots. A variety of different programming blocks are provided to program motors and sensors of the robot. Open Roberta Lab uses an approach of graphical programming so that beginners can seamlessly start coding. As a cloud-based application, the platform can be used without prior installation of specific software but runs in any popular browser, independent of operating system and device.

Language: JavaScript - Size: 460 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 132 - Forks: 123

VATHSAN08/Mental-Health-Sentiment-Analysis-using-Deep-Learning

Mental Health Sentiment Analysis using Deep Learning. This project leverages deep learning to classify mental health-related sentiments from text into seven categories: Anxiety, Bipolar, Depression, Normal, Personality Disorder, Stress, and Suicidal. By utilizing advanced NLP techniques, we aim to enhance understanding and support for mental well

Language: Jupyter Notebook - Size: 4.12 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 1 - Forks: 0

declare-lab/RECCON

This repository contains the dataset and the PyTorch implementations of the models from the paper Recognizing Emotion Cause in Conversations.

Language: Python - Size: 15.9 MB - Last synced at: 3 months ago - Pushed at: almost 3 years ago - Stars: 182 - Forks: 30

myriamgoyet/Customer-sentiment-analysis

Final project for Jedha Bootcamp - Data Science & Engineering. Sentiment and thematic analysis of customer reviews. Development of a dashboard for analysis and benchmarking of restaurants across the USA. Automated review response generation using large language models (LLMs)

Language: Jupyter Notebook - Size: 22.9 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

hyeonsangjeon/AWS-LLM-SageMaker

SageMaker Polyglot-based RAG opensearch

Language: Jupyter Notebook - Size: 2.81 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 16 - Forks: 2

dsfsi/zabantu-beta

ZaBantu is a fleet of light-weight Masked Language Models for Southern Bantu Languages

Language: Python - Size: 3.12 MB - Last synced at: 7 days ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 0

AadityaArunSingh/RoBERTa-Token-Classification-with-Additional-PLODv2-Data

This repo explores token classification for abbreviation and long-form detection using RoBERTa. We evaluate the impact of adding 50% of the PLODv2-filtered dataset, achieving improved F1 and recall. The repo includes methodology, evaluation using seqeval, and confusion matrix analysis.

Language: Python - Size: 11.7 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

anjalichennupati/AI_Generated_Text_Detection_using_LORA_based_fine_tuning_of_LLM_Models

A project that identifies AI-generated vs. human-written text using LLM Transformer models like RoBERTa, BART and GPT2 integrated with LoRA

Language: Jupyter Notebook - Size: 154 KB - Last synced at: 3 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 1

firojalam/crisis_datasets_benchmarks

Crisis Dataset for Benchmarks Experiments

Language: Python - Size: 1.41 MB - Last synced at: 2 months ago - Pushed at: almost 3 years ago - Stars: 17 - Forks: 4

jessevig/bertviz

BertViz: Visualize Attention in NLP Models (BERT, GPT2, BART, etc.)

Language: Python - Size: 194 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 7,433 - Forks: 827

Giant77/indonesian-news-NER

A system with the potential to automatically extract these important entities from Indonesian-language political news articles. This capability is expected to facilitate news content analysis, identification of key actors and issues, and monitoring of public sentiment on political events.

Language: Jupyter Notebook - Size: 48.8 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

CLUEbenchmark/CLUENER2020

CLUENER2020: Fine-Grained Named Entity Recognition for Chinese

Language: Python - Size: 867 KB - Last synced at: 5 months ago - Pushed at: almost 3 years ago - Stars: 1,491 - Forks: 301

brightmart/roberta_zh

Pretrained RoBERTa models for Chinese

Language: Python - Size: 308 KB - Last synced at: 5 months ago - Pushed at: over 1 year ago - Stars: 2,718 - Forks: 413

SMAPPNYU/SMaBERTa

Wrapper for stable version of RoBERTa language models

Language: Python - Size: 173 KB - Last synced at: 3 months ago - Pushed at: over 2 years ago - Stars: 11 - Forks: 9

KLUE-benchmark/KLUE

📖 Korean NLU Benchmark

Size: 44.7 MB - Last synced at: 5 months ago - Pushed at: over 3 years ago - Stars: 572 - Forks: 56

brightmart/albert_zh

A LITE BERT FOR SELF-SUPERVISED LEARNING OF LANGUAGE REPRESENTATIONS, large-scale Chinese pretrained ALBERT models

Language: Python - Size: 2 MB - Last synced at: 5 months ago - Pushed at: almost 3 years ago - Stars: 3,976 - Forks: 752

HHousen/TransformerSum

Models to perform neural summarization (extractive and abstractive) using machine learning transformers and a tool to convert abstractive summarization datasets to the extractive task.

Language: Python - Size: 11.7 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 431 - Forks: 57

dbiir/UER-py

Open Source Pre-training Model Framework in PyTorch & Pre-trained Model Zoo

Language: Python - Size: 50.5 MB - Last synced at: 5 months ago - Pushed at: over 1 year ago - Stars: 3,064 - Forks: 522

MasoudKargar/RBMD

RBMD: RoBERTa-Based Module Detection in Multi-Programming Language Software Systems

Language: C++ - Size: 58.5 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

cosmaadrian/nli-stress-test

Official repository for the EMNLP 2024 paper "How Hard is this Test Set? NLI Characterization by Exploiting Training Dynamics"

Language: Python - Size: 52.7 KB - Last synced at: about 2 months ago - Pushed at: 6 months ago - Stars: 3 - Forks: 0

grammarly/gector

Official implementation of the papers "GECToR – Grammatical Error Correction: Tag, Not Rewrite" (BEA-20) and "Text Simplification by Tagging" (BEA-21)

Language: Python - Size: 669 KB - Last synced at: 6 months ago - Pushed at: over 1 year ago - Stars: 920 - Forks: 220

CLUEbenchmark/CLUEPretrainedModels

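A collection of high-quality Chinese pretrained models: state-of-the-art large models, the fastest small models, and specialized similarity models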

Language: Python - Size: 789 KB - Last synced at: 5 months ago - Pushed at: over 5 years ago - Stars: 817 - Forks: 96

Dhanush-R-git/MH-Analysis

MHRoberta is a Mental Health RoBERTa model: a pretrained RoBERTa transformer-based model fine-tuned on a mental health dataset using the PEFT method.

Language: Jupyter Notebook - Size: 3.67 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 1 - Forks: 0

Tencent/TurboTransformers

a fast and user-friendly runtime for transformer inference (Bert, Albert, GPT2, Decoders, etc) on CPU and GPU.

Language: C++ - Size: 4.04 MB - Last synced at: 6 months ago - Pushed at: 7 months ago - Stars: 1,522 - Forks: 200

Tencent/TencentPretrain

Tencent Pre-training framework in PyTorch & Pre-trained Model Zoo

Language: Python - Size: 41.2 MB - Last synced at: 5 months ago - Pushed at: over 1 year ago - Stars: 1,070 - Forks: 147

jaketae/koclip

KoCLIP: Korean port of OpenAI CLIP, in Flax

Language: Python - Size: 27.9 MB - Last synced at: 6 months ago - Pushed at: about 2 years ago - Stars: 151 - Forks: 18

newking9088/product_recommendation_nlp_roberta_vader

Sentiment-Enhanced Product Recommendation System for E-Commerce: A Comparative Analysis of RoBERTa and VADER

Language: Jupyter Notebook - Size: 13.5 MB - Last synced at: 3 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

CLUEbenchmark/CLUECorpus2020

Large-scale Pre-training Corpus for Chinese (100G Chinese pre-training corpus)

Size: 308 KB - Last synced at: 6 months ago - Pushed at: about 3 years ago - Stars: 957 - Forks: 81

nipunsadvilkar/roberta-base-mr

RoBERTa Marathi Language model trained from scratch during huggingface 🤗 x flax community week

Language: Python - Size: 440 KB - Last synced at: 3 months ago - Pushed at: over 4 years ago - Stars: 28 - Forks: 4

hritik5102/Fake-news-classification-model

✨ Fake news classification using source adaptive framework - BE Project 🎓 The repository contains detailed documentation of the project, classification pipeline, architecture, system interface design, and tech stack used.

Language: Jupyter Notebook - Size: 125 MB - Last synced at: 6 months ago - Pushed at: over 2 years ago - Stars: 41 - Forks: 8

clue-ai/PromptCLUE

PromptCLUE, a zero-shot learning model supporting all Chinese tasks

Language: Jupyter Notebook - Size: 15.9 MB - Last synced at: 5 months ago - Pushed at: over 2 years ago - Stars: 663 - Forks: 67

AnirudhSinghBhadauria/whelm

Whelm provides creators with insights into audience perception and sentiment by analyzing YouTube comments. It processes these comments to identify patterns and trends, helping creators understand their viewers better and improve their content.

Language: Python - Size: 3.31 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 3 - Forks: 0