GitHub / stoneMo / DeepAVFusion
Official codebase for "Unveiling the Power of Audio-Visual Early Fusion Transformers with Dense Interactions through Masked Modeling".
JSON API: http://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/stoneMo%2FDeepAVFusion
PURL: pkg:github/stoneMo/DeepAVFusion
Stars: 12
Forks: 0
Open issues: 1
License: Apache-2.0
Language: Python
Size: 26.4 MB
Dependencies parsed at: Pending
Created at: over 1 year ago
Updated at: 12 months ago
Pushed at: 12 months ago
Last synced at: 12 months ago
Topics: attention-mechanism, audio-visual-correspondence, audio-visual-learning, masked-autoencoder, masked-image-modeling, multimodal-learning, self-supervised-learning, sound-source-localization, sound-source-separation, transformer-architecture