ecosyste.ms

Repos

An open API service providing repository metadata for many open source software ecosystems.

Topic: "visual-instruction-tuning"

BradyFU/Awesome-Multimodal-Large-Language-Models

✨✨ Latest Advances on Multimodal Large Language Models

Size: 83 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 14,990 - Forks: 963

CircleRadon/Osprey

[CVPR2024] The code for "Osprey: Pixel Understanding with Visual Instruction Tuning"

Language: Python - Size: 23.8 MB - Last synced at: 17 days ago - Pushed at: 17 days ago - Stars: 816 - Forks: 43

ictnlp/LLaVA-Mini

LLaVA-Mini is a unified large multimodal model (LMM) that can support the understanding of images, high-resolution images, and videos in an efficient manner.

Language: Python - Size: 54.6 MB - Last synced at: about 1 month ago - Pushed at: 4 months ago - Stars: 441 - Forks: 19

zjysteven/lmms-finetune

A minimal codebase for finetuning large multimodal models, supporting llava-1.5/1.6, llava-interleave, llava-next-video, llava-onevision, llama-3.2-vision, qwen-vl, qwen2-vl, phi3-v etc.

Language: Python - Size: 13 MB - Last synced at: about 1 month ago - Pushed at: 3 months ago - Stars: 284 - Forks: 29

BAAI-DCAI/DataOptim

A collection of visual instruction tuning datasets.

Language: Python - Size: 51.8 KB - Last synced at: 10 months ago - Pushed at: about 1 year ago - Stars: 72 - Forks: 3

bigai-nlco/VideoTGB

[EMNLP 2024] A Video Chat Agent with Temporal Prior

Language: Python - Size: 51.6 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 29 - Forks: 2

hllj/Vistral-V (fork of haotian-liu/LLaVA)

Vistral-V: Visual Instruction Tuning for Vistral - Vietnamese Large Vision-Language Model.

Language: Python - Size: 15.5 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 15 - Forks: 0

mixpeek/awesome-multimodal-search

A collection of multimodal search libraries, services, and research papers

Size: 3.12 MB - Last synced at: 2 days ago - Pushed at: 27 days ago - Stars: 9 - Forks: 0

zjr2000/REVERIE

[ECCV2024] Reflective Instruction Tuning: Mitigating Hallucinations in Large Vision-Language Models

Language: Python - Size: 1.12 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 6 - Forks: 0

jingyi0000/Awesome-Visual-Instruction-Tuning

Visual Instruction Tuning towards General-Purpose Multimodal Model: A Survey

Size: 249 KB - Last synced at: 6 days ago - Pushed at: about 1 year ago - Stars: 5 - Forks: 0

fraction-ai/GAP

Gamified Adversarial Prompting (GAP): Crowdsourcing AI-weakness-targeting data through gamification. Boost model performance with community-driven, strategic data collection

Language: Python - Size: 8.92 MB - Last synced at: 6 months ago - Pushed at: 7 months ago - Stars: 4 - Forks: 0

yueying-teng/generate-language-image-instruction-following-data

Mistral-assisted visual instruction data generation, following LLaVA

Language: Python - Size: 69.3 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 1 - Forks: 1

Related Topics

multimodal-large-language-models (5), llm (4), llava (3), vision-language-model (3), mllm (3), large-language-models (2), large-multimodal-models (2), multimodal (2), vision-language (2), instruction-tuning (2), chain-of-thought (1), qwen-vl (1), llava-next (1), large-language-model (1), foundation-models (1), finetuning (1), vision (1), video (1), llama (1), gpt4v (1), in-context-learning (1), instruction-following (1), large-vision-language-model (1), large-vision-language-models (1), multi-modality (1), multimodal-chain-of-thought (1), multimodal-in-context-learning (1), multimodal-instruction-tuning (1), awesome-list (1), cross-modal-search (1), multimodal-search (1), similarity-search (1), vector-search (1), instruction-following-data (1), langchain (1), llama-cpp-python (1), mistral (1), multimodal-learning (1), vllm (1), dataset (1), rationale (1), pixel-understanding (1), sam (1), language-model (1), open-source (1), vietnamese (1), vistral-v (1), ai (1), artificial-intelligence (1), computer-vision (1), vqa (1), vqa-dataset (1), web3 (1), spatial-temporal (1), video-language (1), multi-modal-language-model (1), multi-modal-model (1), survey (1), efficient (1), gpt4o (1)