GitHub topics: large-vision-language-model
BradyFU/Awesome-Multimodal-Large-Language-Models
✨✨ Latest Advances on Multimodal Large Language Models
Size: 82.9 MB - Last synced at: 8 days ago - Pushed at: 14 days ago - Stars: 15,516 - Forks: 1,006

Ruiyang-061X/VL-Uncertainty
🔎Official code for our paper: "VL-Uncertainty: Detecting Hallucination in Large Vision-Language Model via Uncertainty Estimation".
Language: Python - Size: 7.12 MB - Last synced at: 5 days ago - Pushed at: 3 months ago - Stars: 36 - Forks: 3

PKU-YuanGroup/MoE-LLaVA
Mixture-of-Experts for Large Vision-Language Models
Language: Python - Size: 16.5 MB - Last synced at: 22 days ago - Pushed at: 7 months ago - Stars: 2,170 - Forks: 135

InternLM/InternLM-XComposer
InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions
Language: Python - Size: 200 MB - Last synced at: 29 days ago - Pushed at: about 1 month ago - Stars: 2,834 - Forks: 172

PKU-YuanGroup/Video-LLaVA
【EMNLP 2024🔥】Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
Language: Python - Size: 113 MB - Last synced at: 30 days ago - Pushed at: 7 months ago - Stars: 3,245 - Forks: 234

yaotingwangofficial/Awesome-MCoT
Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey
Size: 4.63 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 576 - Forks: 15

yu-rp/apiprompting
[ECCV 2024] API: Attention Prompting on Image for Large Vision-Language Models
Language: Python - Size: 8.63 MB - Last synced at: 14 days ago - Pushed at: 8 months ago - Stars: 88 - Forks: 5

SuperBruceJia/Awesome-Large-Vision-Language-Model
Awesome Large Vision-Language Model: A Curated List of Large Vision-Language Models
Size: 103 KB - Last synced at: 5 days ago - Pushed at: 9 months ago - Stars: 27 - Forks: 3

lucaswychan/quant-lvlm
Easy-to-use large vision language model pipeline for quantitative analysis
Language: Python - Size: 953 KB - Last synced at: about 1 month ago - Pushed at: about 2 months ago - Stars: 2 - Forks: 0

jqtangust/hawk
🔥 🔥 🔥 [NeurIPS 2024] Hawk: Learning to Understand Open-World Video Anomalies
Language: Python - Size: 5.35 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 193 - Forks: 2

Orlando-CS/Awesome-VLA
✨✨ Latest advancements in VLA (Vision-Language-Action) models
Size: 0 Bytes - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 1 - Forks: 0

ADL-X/LLAVIDAL
This is the official repository of LLAVIDAL
Language: Python - Size: 32 MB - Last synced at: 2 months ago - Pushed at: 3 months ago - Stars: 13 - Forks: 1

ai4ce/LUWA
[CVPR 2024 Highlight] The first benchmark for lithic use-wear analysis leveraging SOTA vision and vision-language models (DINOv2, GPT-4V), demonstrating AI performance surpassing that of expert archaeologists.
Language: Jupyter Notebook - Size: 10.7 MB - Last synced at: 2 months ago - Pushed at: 3 months ago - Stars: 4 - Forks: 0

richard-peng-xia/CARES
[NeurIPS'24 & ICMLW'24] CARES: A Comprehensive Benchmark of Trustworthiness in Medical Vision Language Models
Language: Python - Size: 3.76 MB - Last synced at: 7 months ago - Pushed at: 9 months ago - Stars: 59 - Forks: 4

MMStar-Benchmark/MMStar
This repo contains evaluation code for the paper "Are We on the Right Way for Evaluating Large Vision-Language Models?"
Language: Python - Size: 3.41 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 84 - Forks: 1
