GitHub topics: llava-next
zjysteven/lmms-finetune
A minimal codebase for fine-tuning large multimodal models, supporting LLaVA-1.5/1.6, LLaVA-Interleave, LLaVA-NeXT-Video, LLaVA-OneVision, Llama-3.2-Vision, Qwen-VL, Qwen2-VL, Phi-3-V, etc.
Language: Python - Size: 13 MB - Last synced at: 7 days ago - Pushed at: 3 months ago - Stars: 296 - Forks: 33

RLHF-V/RLAIF-V
[CVPR'25 highlight] RLAIF-V: Open-Source AI Feedback Leads to Super GPT-4V Trustworthiness
Language: Python - Size: 60 MB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 361 - Forks: 14

rng190001/CS6375-ResearchProject
A visual language model project for testing different techniques for parsing generated responses.
Language: Jupyter Notebook - Size: 880 KB - Last synced at: 15 days ago - Pushed at: 15 days ago - Stars: 1 - Forks: 0

hasanar1f/HiRED
[AAAI 2025] HiRED strategically drops visual tokens during the image-encoding stage to improve inference efficiency for high-resolution vision-language models (e.g., LLaVA-NeXT) under a fixed token budget.
Language: Python - Size: 23.9 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 29 - Forks: 4

chuangchuangtan/LLaVA-NeXT-Image-Llama3-Lora
LLaVA-NeXT-Image-Llama3-Lora, modified from https://github.com/arielnlee/LLaVA-1.6-ft
Language: Python - Size: 11.6 MB - Last synced at: 2 days ago - Pushed at: 10 months ago - Stars: 44 - Forks: 4

mu-cai/matryoshka-mm
Matryoshka Multimodal Models
Language: Python - Size: 26.1 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 90 - Forks: 5
