An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: mllm-evaluation

path2generalist/General-Level

On Path to Multimodal Generalist: General-Level and General-Bench

Language: Python - Size: 918 KB - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 17 - Forks: 2

Lum1104/EIBench

Why We Feel: Breaking Boundaries in Emotional Reasoning with Multimodal Large Language Models

Language: Python - Size: 11.2 MB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 18 - Forks: 0

williamium3000/core-knowledge

Official codebase for ICML 2025 paper "Core Knowledge Deficits in Multi-Modal Language Models"

Size: 199 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

zhousheng97/EgoTextVQA

[CVPR'25] 🌟🌟 EgoTextVQA: Towards Egocentric Scene-Text Aware Video Question Answering

Language: Python - Size: 9.64 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 33 - Forks: 1

luo-junyu/FinMME

[ACL 2025] FinMME: Benchmark Dataset for Financial Multi-Modal Reasoning Evaluation

Language: Python - Size: 1.21 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 3 - Forks: 0

EchoDreamer/Modality-Preference

Modality Preference

Language: Python - Size: 23 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 4 - Forks: 0

vulab-AI/YESBUT-v2

We introduce YesBut-v2, a benchmark for assessing AI's ability to interpret juxtaposed comic panels with contradictory narratives. Unlike existing benchmarks, it emphasizes visual understanding, comparative reasoning, and social knowledge.

Language: JavaScript - Size: 22.3 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 1 - Forks: 0

AdaCheng/VidEgoThink

The official code and data for paper "VidEgoThink: Assessing Egocentric Video Understanding Capabilities for Embodied AI"

Language: Python - Size: 129 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 8 - Forks: 0

Now-Join-Us/OmniEvalKit (fork of AIDC-AI/M3Bench)

The code repository for "OmniEvalKit: A Modular, Lightweight Toolbox for Evaluating Large Language Model and its Omni-Extensions"

Language: Python - Size: 3.82 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 13 - Forks: 2

simoncwang/MMO

Multimodal Multi-agent Organization and Benchmarking

Language: Python - Size: 74.2 KB - Last synced at: 5 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0