An open API service providing repository metadata for many open source software ecosystems.

Topic: "instruction-tuning"

hiyouga/LLaMA-Factory

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Language: Python - Size: 51.1 MB - Last synced at: 3 days ago - Pushed at: 4 days ago - Stars: 54,951 - Forks: 6,759

haotian-liu/LLaVA

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Language: Python - Size: 13.4 MB - Last synced at: 3 days ago - Pushed at: 12 months ago - Stars: 23,143 - Forks: 2,556

BradyFU/Awesome-Multimodal-Large-Language-Models

✨✨ Latest Advances on Multimodal Large Language Models

Size: 82.9 MB - Last synced at: 2 days ago - Pushed at: 18 days ago - Stars: 15,915 - Forks: 1,042

RUCAIBox/LLMSurvey

The official GitHub page for the survey paper "A Survey of Large Language Models".

Language: Python - Size: 43.1 MB - Last synced at: 3 days ago - Pushed at: 5 months ago - Stars: 11,695 - Forks: 915

modelscope/data-juicer

Data processing for and with foundation models! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷

Language: Python - Size: 319 MB - Last synced at: 3 days ago - Pushed at: 4 days ago - Stars: 4,878 - Forks: 252

yizhongw/self-instruct

Aligning pretrained language models with instruction data generated by themselves.

Language: Python - Size: 58.6 MB - Last synced at: 2 months ago - Pushed at: over 2 years ago - Stars: 4,381 - Forks: 507

Instruction-Tuning-with-GPT-4/GPT-4-LLM

Instruction Tuning with GPT-4

Language: HTML - Size: 82.7 MB - Last synced at: 3 days ago - Pushed at: about 2 years ago - Stars: 4,320 - Forks: 305

NExT-GPT/NExT-GPT

Code and models for ICML 2024 paper, NExT-GPT: Any-to-Any Multimodal Large Language Model

Language: Python - Size: 125 MB - Last synced at: 2 months ago - Pushed at: 3 months ago - Stars: 3,499 - Forks: 352

PKU-YuanGroup/Video-LLaVA

【EMNLP 2024🔥】Video-LLaVA: Learning United Visual Representation by Alignment Before Projection

Language: Python - Size: 113 MB - Last synced at: 27 days ago - Pushed at: 8 months ago - Stars: 3,294 - Forks: 235

EvolvingLMMs-Lab/Otter

🦦 Otter, a multi-modal model based on OpenFlamingo (open-sourced version of DeepMind's Flamingo), trained on MIMIC-IT and showcasing improved instruction-following and in-context learning ability.

Language: Python - Size: 7.39 MB - Last synced at: 2 days ago - Pushed at: over 1 year ago - Stars: 3,263 - Forks: 210

DSXiangLi/DecryptPrompt

A summary of Prompt & LLM papers, open-source data & models, and AIGC applications

Size: 2.25 GB - Last synced at: 11 days ago - Pushed at: 12 days ago - Stars: 3,131 - Forks: 310

InternLM/InternLM-XComposer

InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions

Language: Python - Size: 200 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 2,834 - Forks: 172

PhoebusSi/Alpaca-CoT

We unified the interfaces of instruction-tuning data (e.g., CoT data), multiple LLMs, and parameter-efficient methods (e.g., LoRA, P-tuning) for easy use. We welcome open-source enthusiasts to open any meaningful PR on this repo and integrate as many LLM-related technologies as possible. We built a fine-tuning platform that makes it easy for researchers to get started with and use large models.

Language: Jupyter Notebook - Size: 137 MB - Last synced at: 3 days ago - Pushed at: over 1 year ago - Stars: 2,758 - Forks: 253

X-PLUG/mPLUG-Owl

mPLUG-Owl: The Powerful Multi-modal Large Language Model Family

Language: Python - Size: 33.5 MB - Last synced at: 3 days ago - Pushed at: 4 months ago - Stars: 2,501 - Forks: 185

OpenGVLab/InternVideo

[ECCV2024] Video Foundation Models & Data for Multimodal Understanding

Language: Python - Size: 53.2 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 1,915 - Forks: 112

cambrian-mllm/cambrian

Cambrian-1 is a family of multimodal LLMs with a vision-centric design.

Language: Python - Size: 1.99 MB - Last synced at: 2 months ago - Pushed at: 9 months ago - Stars: 1,905 - Forks: 132

bespokelabsai/curator

Synthetic data curation for post-training and structured data extraction

Language: Python - Size: 62.6 MB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 1,433 - Forks: 114

zjunlp/KnowLM

An Open-sourced Knowledgeable Large Language Model Framework.

Language: Python - Size: 38.7 MB - Last synced at: 3 days ago - Pushed at: 7 months ago - Stars: 1,330 - Forks: 132

yaodongC/awesome-instruction-dataset

A collection of open-source datasets to train instruction-following LLMs (ChatGPT, LLaMA, Alpaca)

Size: 33.2 KB - Last synced at: 1 day ago - Pushed at: over 1 year ago - Stars: 1,126 - Forks: 56

datadreamer-dev/DataDreamer

DataDreamer: Prompt. Generate Synthetic Data. Train & Align Models. 🤖💤

Language: Python - Size: 895 KB - Last synced at: 5 days ago - Pushed at: 6 months ago - Stars: 1,039 - Forks: 54

NVlabs/DoRA

[ICML2024 (Oral)] Official PyTorch implementation of DoRA: Weight-Decomposed Low-Rank Adaptation

Language: Python - Size: 3.06 MB - Last synced at: about 2 months ago - Pushed at: 10 months ago - Stars: 797 - Forks: 57

HKUDS/GraphGPT

[SIGIR'2024] "GraphGPT: Graph Instruction Tuning for Large Language Models"

Language: Python - Size: 36.5 MB - Last synced at: 26 days ago - Pushed at: about 1 year ago - Stars: 755 - Forks: 78

yaotingwangofficial/Awesome-MCoT

Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey

Size: 11.5 MB - Last synced at: 14 days ago - Pushed at: 15 days ago - Stars: 701 - Forks: 19

FudanDISC/DISC-FinLLM

DISC-FinLLM, a Chinese financial large language model (LLM) designed to provide users with professional, intelligent, and comprehensive financial consulting services in financial scenarios.

Language: Python - Size: 36 MB - Last synced at: 4 months ago - Pushed at: over 1 year ago - Stars: 691 - Forks: 80

ContextualAI/gritlm

Generative Representational Instruction Tuning

Language: Jupyter Notebook - Size: 11.3 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 654 - Forks: 47

hkust-nlp/deita

Deita: Data-Efficient Instruction Tuning for Alignment [ICLR2024]

Language: Python - Size: 240 KB - Last synced at: 2 months ago - Pushed at: 8 months ago - Stars: 554 - Forks: 29

bigscience-workshop/xmtf

Crosslingual Generalization through Multitask Finetuning

Language: Jupyter Notebook - Size: 28.6 MB - Last synced at: 2 months ago - Pushed at: 10 months ago - Stars: 533 - Forks: 39

salesforce/DialogStudio

DialogStudio: Towards Richest and Most Diverse Unified Dataset Collection and Instruction-Aware Models for Conversational AI

Language: Python - Size: 13 MB - Last synced at: 2 months ago - Pushed at: 6 months ago - Stars: 500 - Forks: 34

RenzeLou/awesome-instruction-learning

Papers and Datasets on Instruction Tuning and Following. ✨✨✨

Language: Python - Size: 6.25 MB - Last synced at: 6 days ago - Pushed at: over 1 year ago - Stars: 498 - Forks: 24

mindspore-courses/step_into_llm

MindSpore online courses: Step into LLM

Language: Jupyter Notebook - Size: 246 MB - Last synced at: 3 days ago - Pushed at: 14 days ago - Stars: 476 - Forks: 122

princeton-nlp/LESS

[ICML 2024] LESS: Selecting Influential Data for Targeted Instruction Tuning

Language: Jupyter Notebook - Size: 366 KB - Last synced at: 3 days ago - Pushed at: 9 months ago - Stars: 469 - Forks: 46

yuanze-lin/Olympus

[CVPR 2025 Highlight] Official code for "Olympus: A Universal Task Router for Computer Vision Tasks"

Language: Python - Size: 3.5 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 425 - Forks: 71

HugAILab/HugNLP

CIKM 2023 Best Demo Paper Award. HugNLP is a unified and comprehensive NLP library based on Hugging Face Transformers. Start hugging for NLP now! 😊

Language: Python - Size: 4.16 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 370 - Forks: 45

HKUDS/UrbanGPT

[KDD'2024] "UrbanGPT: Spatio-Temporal Large Language Models"

Language: Python - Size: 15.5 MB - Last synced at: 26 days ago - Pushed at: 3 months ago - Stars: 369 - Forks: 47

HenryHZY/Awesome-Multimodal-LLM

Research Trends in LLM-guided Multimodal Learning.

Size: 17.6 KB - Last synced at: 18 days ago - Pushed at: almost 2 years ago - Stars: 358 - Forks: 16

zhilizju/Awesome-instruction-tuning

A curated list of awesome instruction tuning datasets, models, papers and repositories.

Language: Python - Size: 6.01 MB - Last synced at: 7 days ago - Pushed at: about 2 years ago - Stars: 335 - Forks: 14

ZebangCheng/Emotion-LLaMA

Emotion-LLaMA: Multimodal Emotion Recognition and Reasoning with Instruction Tuning

Language: Python - Size: 12.7 MB - Last synced at: 29 days ago - Pushed at: 29 days ago - Stars: 326 - Forks: 35

ictnlp/BayLing

BayLing ("百聆") is a LLaMA-based English/Chinese LLM with enhanced language alignment. It shows superior capability in English/Chinese generation, instruction following, and multi-turn interaction, and reaches 90% of ChatGPT's performance on several multilingual and general-task evaluations.

Language: Python - Size: 67.2 MB - Last synced at: 3 days ago - Pushed at: 8 months ago - Stars: 317 - Forks: 19

zjysteven/lmms-finetune

A minimal codebase for finetuning large multimodal models, supporting llava-1.5/1.6, llava-interleave, llava-next-video, llava-onevision, llama-3.2-vision, qwen-vl, qwen2-vl, phi3-v etc.

Language: Python - Size: 13 MB - Last synced at: 2 months ago - Pushed at: 5 months ago - Stars: 296 - Forks: 33

mlpc-ucsd/BLIVA

(AAAI 2024) BLIVA: A Simple Multimodal LLM for Better Handling of Text-rich Visual Questions

Language: Python - Size: 12.3 MB - Last synced at: 3 days ago - Pushed at: over 1 year ago - Stars: 261 - Forks: 24

Open3DA/LL3DA

[CVPR 2024] "LL3DA: Visual Interactive Instruction Tuning for Omni-3D Understanding, Reasoning, and Planning"; an interactive Large Language 3D Assistant.

Language: Python - Size: 72.7 MB - Last synced at: 6 months ago - Pushed at: about 1 year ago - Stars: 258 - Forks: 10

WangRongsheng/Aurora

🐳 Aurora is a Chinese-language MoE model. It builds on Mixtral-8x7B and activates the model's open-domain chat capability in Chinese.

Language: Python - Size: 9.28 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 254 - Forks: 21

SALT-NLP/LLaVAR

Code/Data for the paper: "LLaVAR: Enhanced Visual Instruction Tuning for Text-Rich Image Understanding"

Language: Python - Size: 19.4 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 245 - Forks: 12

ZigeW/data_management_LLM

Collection of training data management explorations for large language models

Size: 329 KB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 238 - Forks: 24

huggingface/instruction-tuned-sd

Code for instruction-tuning Stable Diffusion.

Language: Python - Size: 105 KB - Last synced at: about 10 hours ago - Pushed at: over 1 year ago - Stars: 237 - Forks: 19

shure-dev/Awesome-LLM-Papers-Comprehensive-Topics

Awesome LLM Papers and repos on very comprehensive topics.

Size: 450 KB - Last synced at: 3 days ago - Pushed at: 11 months ago - Stars: 223 - Forks: 22

ShareGPT4Omni/ShareGPT4V

[ECCV 2024] ShareGPT4V: Improving Large Multi-modal Models with Better Captions

Language: Python - Size: 644 KB - Last synced at: 3 months ago - Pushed at: about 1 year ago - Stars: 217 - Forks: 6

akoksal/LongForm

Reverse Instructions to generate instruction tuning data with corpus examples

Size: 29.2 MB - Last synced at: 4 months ago - Pushed at: over 1 year ago - Stars: 208 - Forks: 10

LostXine/LLaRA

🔥[ICLR'25] LLaRA: Supercharging Robot Learning Data for Vision-Language Policy

Language: Python - Size: 38.5 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 199 - Forks: 6

DaoD/INTERS

This is the repository for our paper "INTERS: Unlocking the Power of Large Language Models in Search with Instruction Tuning"

Language: Python - Size: 1.73 MB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 199 - Forks: 13

zjukg/KnowPAT

[Paper][ACL 2024 Findings] Knowledgeable Preference Alignment for LLMs in Domain-specific Question Answering

Language: Python - Size: 9.03 MB - Last synced at: 3 months ago - Pushed at: about 1 year ago - Stars: 193 - Forks: 17

raunak-agarwal/instruction-datasets

All available datasets for Instruction Tuning of Large Language Models

Size: 50.8 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 185 - Forks: 10

sileod/tasksource

Dataset collection and preprocessing framework for extreme multitask learning in NLP

Language: Python - Size: 376 KB - Last synced at: 20 days ago - Pushed at: 20 days ago - Stars: 184 - Forks: 10

wxjiao/ParroT

The ParroT framework enhances and regulates translation abilities during chat, based on open-source LLMs (e.g., LLaMA-7b, Bloomz-7b1-mt) and human-written translation and evaluation data.

Language: Python - Size: 48.1 MB - Last synced at: 7 days ago - Pushed at: 7 months ago - Stars: 177 - Forks: 22

FSoft-AI4Code/CodeCapybara

Open-source Self-Instruction Tuning Code LLM

Language: Python - Size: 922 KB - Last synced at: 1 day ago - Pushed at: over 2 years ago - Stars: 169 - Forks: 11

xiaoya-li/Instruction-Tuning-Survey

Project for the paper entitled `Instruction Tuning for Large Language Models: A Survey`

Size: 1.84 MB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 148 - Forks: 6

HKUDS/GraphEdit

"GraphEdit: Large Language Models for Graph Structure Learning"

Language: Python - Size: 2.31 MB - Last synced at: 26 days ago - Pushed at: about 1 year ago - Stars: 134 - Forks: 15

HKUDS/HiGPT

[KDD'2024] "HiGPT: Heterogeneous Graph Language Models"

Language: Python - Size: 6.17 MB - Last synced at: 26 days ago - Pushed at: about 1 year ago - Stars: 132 - Forks: 7

blazerye/DrugAssist

[Briefings In Bioinformatics] DrugAssist: A Large Language Model for Molecule Optimization

Language: Python - Size: 7.03 MB - Last synced at: 3 months ago - Pushed at: 4 months ago - Stars: 129 - Forks: 13

juyongjiang/CodeUp

CodeUp: A Multilingual Code Generation Llama-X Model with Parameter-Efficient Instruction-Tuning

Language: Python - Size: 18.6 MB - Last synced at: 3 days ago - Pushed at: 7 months ago - Stars: 126 - Forks: 9

zjukg/KoPA

[Paper][ACM MM 2024] Making Large Language Models Perform Better in Knowledge Graph Completion

Language: Python - Size: 2.85 MB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 123 - Forks: 8

HKUDS/RecLM

"RecLM: Recommendation Instruction Tuning"

Language: Python - Size: 212 MB - Last synced at: 26 days ago - Pushed at: about 2 months ago - Stars: 105 - Forks: 12

nlp-uoregon/Okapi

Okapi: Instruction-tuned Large Language Models in Multiple Languages with Reinforcement Learning from Human Feedback

Language: Python - Size: 262 MB - Last synced at: 3 days ago - Pushed at: almost 2 years ago - Stars: 97 - Forks: 3

simplifine-llm/Simplifine

🚀 Easy, open-source LLM finetuning with one-line commands, seamless cloud integration, and popular optimization frameworks. ✨

Language: Python - Size: 844 KB - Last synced at: 9 days ago - Pushed at: 12 months ago - Stars: 93 - Forks: 4

FuxiaoLiu/MMC

[NAACL 2024] MMC: Advancing Multimodal Chart Understanding with LLM Instruction Tuning

Language: Python - Size: 8.41 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 88 - Forks: 4

OpenSparseLLMs/LLaMA-MoE-v2

🚀 LLaMA-MoE v2: Exploring Sparsity of LLaMA from Perspective of Mixture-of-Experts with Post-Training

Language: Python - Size: 2.21 MB - Last synced at: 2 months ago - Pushed at: 8 months ago - Stars: 84 - Forks: 12

OFA-Sys/DiverseEvol

Self-Evolved Diverse Data Sampling for Efficient Instruction Tuning

Language: Python - Size: 62 MB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 79 - Forks: 2

zjr2000/Awesome-Multimodal-Chatbot

Awesome Multimodal Assistant is a curated list of multimodal chatbots/conversational assistants that utilize various modes of interaction, such as text, speech, images, and videos, to provide a seamless and versatile user experience.

Size: 17.6 KB - Last synced at: 6 days ago - Pushed at: about 2 years ago - Stars: 78 - Forks: 7

daniel-furman/sft-demos

Lightweight demos for finetuning LLMs. Powered by 🤗 transformers and open-source datasets.

Language: Jupyter Notebook - Size: 9.64 MB - Last synced at: 3 days ago - Pushed at: 9 months ago - Stars: 77 - Forks: 9

tamlhp/awesome-instruction-editing

Awesome Instruction Editing. Image and Media Editing with Human Instructions. Instruction-Guided Image and Media Editing.

Size: 718 KB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 76 - Forks: 2

ShiZhengyan/PowerfulPromptFT

[NeurIPS 2023 Main Track] This is the repository for the paper titled "Don’t Stop Pretraining? Make Prompt-based Fine-tuning Powerful Learner"

Language: Python - Size: 34.2 MB - Last synced at: 2 months ago - Pushed at: over 1 year ago - Stars: 74 - Forks: 18

godaai/llm-table-survey

Resources on Large Language Models for Table Processing

Size: 32.2 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 70 - Forks: 6

Abbey4799/CuteGPT

An open-source conversational language model developed by the Knowledge Works Research Laboratory at Fudan University.

Language: Python - Size: 276 KB - Last synced at: 3 days ago - Pushed at: almost 2 years ago - Stars: 64 - Forks: 3

WadeYin9712/Dynosaur

Code and data for "Dynosaur: A Dynamic Growth Paradigm for Instruction-Tuning Data Curation" (EMNLP 2023)

Language: Python - Size: 1.54 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 51 - Forks: 5

DreamerGPT/DreamerGPT

🌱 DreamerGPT (梦想家): instruction fine-tuning for Chinese large language models

Language: Python - Size: 8.93 MB - Last synced at: 3 days ago - Pushed at: about 2 years ago - Stars: 51 - Forks: 2

AI4Bharat/IndicInstruct (fork of allenai/open-instruct)

Code repository for "Introducing Airavata: Hindi Instruction-tuned LLM"

Language: Python - Size: 29.9 MB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 50 - Forks: 6

UCSC-REAL/DS2

[ICLR 2025] Improving Data Efficiency via Curating LLM-Driven Rating Systems

Language: Python - Size: 18 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 49 - Forks: 5

AdrianBZG/llama-multimodal-vqa

Multimodal Instruction Tuning for Llama 3

Language: Python - Size: 31.3 KB - Last synced at: 3 days ago - Pushed at: over 1 year ago - Stars: 49 - Forks: 11

ParthaPRay/LLM-Learning-Sources

This repo contains a list of channels and sources for learning about LLMs

Size: 289 MB - Last synced at: 15 days ago - Pushed at: 11 months ago - Stars: 48 - Forks: 12

FudanDISC/ReForm-Eval

A benchmark for evaluating the capabilities of large vision-language models (LVLMs)

Language: Python - Size: 10 MB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 46 - Forks: 4

cxcscmu/Montessori-Instruct

Official repository for Montessori-Instruct: Generate Influential Training Data Tailored for Student Learning [ICLR 2025]

Language: Python - Size: 27.9 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 40 - Forks: 3

OSU-NLP-Group/QA4RE

[ACL'23 Findings] "Aligning Instruction Tasks Unlocks Large Language Models as Zero-Shot Relation Extractors"

Language: Python - Size: 50.8 KB - Last synced at: 6 days ago - Pushed at: over 1 year ago - Stars: 40 - Forks: 5

pipixin321/HolmesVAU

[CVPR 2025] Official implementation of "Holmes-VAU: Towards Long-term Video Anomaly Understanding at Any Granularity"

Language: Python - Size: 60.1 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 39 - Forks: 2

Spico197/MoE-SFT

🍼 Official implementation of Dynamic Data Mixing Maximizes Instruction Tuning for Mixture-of-Experts

Language: Python - Size: 552 KB - Last synced at: about 2 months ago - Pushed at: 10 months ago - Stars: 39 - Forks: 0

ShiZhengyan/InstructionModelling

[NeurIPS 2024 Main Track] Code for the paper titled "Instruction Tuning With Loss Over Instructions"

Language: Python - Size: 22.9 MB - Last synced at: 4 months ago - Pushed at: about 1 year ago - Stars: 36 - Forks: 8

2toinf/IVM

[NeurIPS-2024] The official implementation of "Instruction-Guided Visual Masking"

Language: Jupyter Notebook - Size: 70 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 29 - Forks: 2

inst-it/inst-it

Official repository of "Inst-IT: Boosting Multimodal Instance Understanding via Explicit Visual Prompt Instruction Tuning"

Language: Python - Size: 2.66 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 27 - Forks: 0

fanqiwan/Explore-Instruct (fork of 18907305772/Explore-Instruct)

EMNLP'2023: Explore-Instruct: Enhancing Domain-Specific Instruction Coverage through Active Exploration

Language: Python - Size: 2.23 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 27 - Forks: 2

InquestGeronimo/tllm

An LLM training library for instruction-tuning.

Language: Python - Size: 785 KB - Last synced at: 14 days ago - Pushed at: over 1 year ago - Stars: 26 - Forks: 3

Reason-Wang/flan-alpaca-lora

This repository contains the code to train Flan-T5 with Alpaca instructions and low-rank adaptation (LoRA).

Language: Python - Size: 6.71 MB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 23 - Forks: 3

zhuang-li/SCAR

[ACL 2025 main] SCAR: Data Selection via Style Consistency-Aware Response Ranking for Efficient Instruction-Tuning of Large Language Models

Language: Python - Size: 125 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 19 - Forks: 4

yichengchen24/MIG

Official code for MIG: Automatic Data Selection for Instruction Tuning by Maximizing Information Gain in Semantic Space

Language: Python - Size: 10.3 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 19 - Forks: 1

ShinoharaHare/LLM-Training

A distributed training framework for large language models powered by Lightning.

Language: Python - Size: 281 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 19 - Forks: 4

patrick-tssn/LM-Research-Hub

Language Modeling Research Hub, a comprehensive compendium for enthusiasts and scholars delving into the fascinating realm of language models (LMs), with a particular focus on large language models (LLMs)

Language: Python - Size: 5 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 18 - Forks: 3

chtmp223/suri

Code for Suri: Multi-constraint instruction following for long-form text generation

Language: Python - Size: 1.56 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 17 - Forks: 0

upiterbarg/diff_history

[ICML 2024] Official code release accompanying the paper "diff History for Neural Language Agents" (Piterbarg, Pinto, Fergus)

Language: Python - Size: 204 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 17 - Forks: 2

ziegler-ingo/CRAFT

Code, datasets, and checkpoints for the paper "CRAFT Your Dataset: Task-Specific Synthetic Dataset Generation Through Corpus Retrieval and Augmentation"

Language: Python - Size: 1.55 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 15 - Forks: 5

UKPLab/arxiv2025-inherent-limits-plms

Code repository for the paper "The Inherent Limits of Pretrained LLMs: The Unexpected Convergence of Instruction Tuning and In-Context Learning Capabilities"

Language: Python - Size: 711 KB - Last synced at: about 1 month ago - Pushed at: 6 months ago - Stars: 13 - Forks: 0

RenzeLou/Muffin

MUFFIN: Curating Multi-Faceted Instructions for Improving Instruction-Following

Language: Python - Size: 142 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 12 - Forks: 3

longday1102/VietAI-experiment-LLaMA2

⚡ LLaMA-2 model experiment

Language: Jupyter Notebook - Size: 49.8 KB - Last synced at: 3 days ago - Pushed at: over 1 year ago - Stars: 12 - Forks: 2