GitHub topics: mllm-evaluation
Lum1104/EIBench
Why We Feel: Breaking Boundaries in Emotional Reasoning with Multimodal Large Language Models
Language: Python - Size: 11.2 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 14 - Forks: 0

zhousheng97/EgoTextVQA
[CVPR 2025] 🌟🌟 EgoTextVQA: Towards Egocentric Scene-Text Aware Video Question Answering
Language: Python - Size: 9.63 MB - Last synced at: 17 days ago - Pushed at: 17 days ago - Stars: 31 - Forks: 1

vulab-AI/YESBUT-v2
We introduce YesBut-v2, a benchmark for assessing AI's ability to interpret juxtaposed comic panels with contradictory narratives. Unlike existing benchmarks, it emphasizes visual understanding, comparative reasoning, and social knowledge.
Language: JavaScript - Size: 22.3 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 1 - Forks: 0

AdaCheng/VidEgoThink
The official code and data for the paper "VidEgoThink: Assessing Egocentric Video Understanding Capabilities for Embodied AI"
Language: Python - Size: 129 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 8 - Forks: 0

Now-Join-Us/OmniEvalKit Fork of AIDC-AI/M3Bench
The code repository for "OmniEvalKit: A Modular, Lightweight Toolbox for Evaluating Large Language Model and its Omni-Extensions"
Language: Python - Size: 3.82 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 13 - Forks: 2

simoncwang/MMO
Multimodal Multi-agent Organization and Benchmarking
Language: Python - Size: 74.2 KB - Last synced at: 2 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0
