GitHub topics: multimodal-foundation-model
ligengen/EgoM2P
[ICCV 2025] The official implementation for EgoM2P: Egocentric Multimodal Multitask Pretraining.
Size: 1.93 MB - Last synced at: 29 days ago - Pushed at: 29 days ago - Stars: 8 - Forks: 1

xid32/NAACL_2025_TWM
We introduce temporal working memory (TWM), a plug-and-play module that enhances the temporal modeling capabilities of multimodal foundation models (MFMs) and can be easily integrated into existing MFMs. With TWM, nine state-of-the-art models show significant performance improvements across question answering, captioning, and retrieval tasks.
Language: Python - Size: 896 KB - Last synced at: 3 days ago - Pushed at: 6 months ago - Stars: 309 - Forks: 30
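
A minimal sketch of the plug-and-play idea the TWM description refers to: a small module that scores per-frame features and keeps only a fixed-size "working memory" of the most informative frames before they reach a frozen MFM backbone. All names here (TemporalWorkingMemory, memory_size) are illustrative assumptions, not the repo's actual API.

```python
# Illustrative sketch only; class and parameter names are assumptions,
# not the xid32/NAACL_2025_TWM API.
import torch
import torch.nn as nn


class TemporalWorkingMemory(nn.Module):
    """Retains a fixed-size memory of the top-scoring frames,
    preserving their original temporal order."""

    def __init__(self, dim: int, memory_size: int = 16):
        super().__init__()
        self.memory_size = memory_size
        self.scorer = nn.Linear(dim, 1)  # learned relevance score per frame

    def forward(self, frame_feats: torch.Tensor) -> torch.Tensor:
        # frame_feats: (batch, num_frames, dim)
        b, t, d = frame_feats.shape
        if t <= self.memory_size:
            return frame_feats
        scores = self.scorer(frame_feats).squeeze(-1)        # (b, t)
        topk = scores.topk(self.memory_size, dim=1).indices  # (b, k)
        topk, _ = topk.sort(dim=1)                           # keep temporal order
        idx = topk.unsqueeze(-1).expand(-1, -1, d)           # (b, k, d)
        return frame_feats.gather(1, idx)                    # (b, k, d)


# Usage: slot between a video encoder and a frozen MFM backbone.
twm = TemporalWorkingMemory(dim=768, memory_size=16)
feats = torch.randn(2, 64, 768)  # 64 frames of 768-d features
compressed = twm(feats)          # -> (2, 16, 768), fed onward to the MFM
```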

mahmoodlab/MADELEINE
MADELEINE: multi-stain slide representation learning (ECCV'24)
Language: Python - Size: 22.9 MB - Last synced at: 3 months ago - Pushed at: 5 months ago - Stars: 52 - Forks: 5

MJ-Bench/MJ-Bench
Official implementation for "MJ-Bench: Is Your Multimodal Reward Model Really a Good Judge for Text-to-Image Generation?"
Language: Jupyter Notebook - Size: 218 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 43 - Forks: 5

TXH-mercury/VAST
Code and Model for VAST: A Vision-Audio-Subtitle-Text Omni-Modality Foundation Model and Dataset
Language: Jupyter Notebook - Size: 73.2 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 144 - Forks: 5
