GitHub topics: llava-next
zjysteven/lmms-finetune
A minimal codebase for fine-tuning large multimodal models, supporting LLaVA-1.5/1.6, LLaVA-Interleave, LLaVA-NeXT-Video, LLaVA-OneVision, Llama-3.2-Vision, Qwen-VL, Qwen2-VL, Phi-3-V, etc.
Language: Python - Size: 13 MB - Last synced at: 7 days ago - Pushed at: 3 months ago - Stars: 296 - Forks: 33

RLHF-V/RLAIF-V
[CVPR'25 highlight] RLAIF-V: Open-Source AI Feedback Leads to Super GPT-4V Trustworthiness
Language: Python - Size: 60 MB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 361 - Forks: 14

rng190001/CS6375-ResearchProject
A visual language model project for testing different techniques for parsing generated responses.
Language: Jupyter Notebook - Size: 880 KB - Last synced at: 15 days ago - Pushed at: 15 days ago - Stars: 1 - Forks: 0

hasanar1f/HiRED
[AAAI 2025] HiRED strategically drops visual tokens during the image-encoding stage to improve inference efficiency for high-resolution vision-language models (e.g., LLaVA-NeXT) under a fixed token budget.
Language: Python - Size: 23.9 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 29 - Forks: 4

chuangchuangtan/LLaVA-NeXT-Image-Llama3-Lora
LLaVA-NeXT-Image-Llama3-Lora, modified from https://github.com/arielnlee/LLaVA-1.6-ft
Language: Python - Size: 11.6 MB - Last synced at: 2 days ago - Pushed at: 10 months ago - Stars: 44 - Forks: 4

mu-cai/matryoshka-mm
Matryoshka Multimodal Models
Language: Python - Size: 26.1 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 90 - Forks: 5
