GitHub topics: audio-visual-learning
praveena2j/Cross-Attentional-AV-Fusion
FG2021: Cross Attentional AV Fusion for Dimensional Emotion Recognition
Language: Python - Size: 92.8 KB - Last synced at: 8 days ago - Pushed at: 5 months ago - Stars: 28 - Forks: 5

praveena2j/Joint-Cross-Attention-for-Audio-Visual-Fusion
IEEE T-BIOM: "Audio-Visual Fusion for Emotion Recognition in the Valence-Arousal Space Using Joint Cross-Attention"
Language: Python - Size: 290 KB - Last synced at: 8 days ago - Pushed at: 5 months ago - Stars: 38 - Forks: 11

ali-vilab/dreamtalk
Official implementation of the paper: DreamTalk: When Expressive Talking Head Generation Meets Diffusion Probabilistic Models
Language: Python - Size: 31.6 MB - Last synced at: 13 days ago - Pushed at: over 1 year ago - Stars: 1,704 - Forks: 206

YapengTian/AVE-ECCV18
Audio-Visual Event Localization in Unconstrained Videos, ECCV 2018
Language: Python - Size: 18.2 MB - Last synced at: 14 days ago - Pushed at: about 4 years ago - Stars: 180 - Forks: 32

Davidlequnchen/LDED-FusionNet
LDED-FusionNet: Machine Learning-Based Audio-Visual Defect Detection for LDED AM Process
Language: Jupyter Notebook - Size: 1.18 GB - Last synced at: 27 days ago - Pushed at: 27 days ago - Stars: 0 - Forks: 1

ttgeng233/UnAV
Dense-Localizing Audio-Visual Events in Untrimmed Videos: A Large-Scale Benchmark and Baseline (CVPR 2023)
Language: Python - Size: 19.9 MB - Last synced at: 13 days ago - Pushed at: about 1 year ago - Stars: 63 - Forks: 6

aiden200/SoundQ2
Sound event localization and detection in 360-degree audio-visual soundscapes.
Language: Python - Size: 175 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

praveena2j/RJCAforSpeakerVerification
[FG 2024] "Audio-Visual Person Verification based on Recursive Fusion of Joint Cross-Attention"
Language: Python - Size: 1 MB - Last synced at: 14 days ago - Pushed at: 5 months ago - Stars: 4 - Forks: 0

praveena2j/JointCrossAttentional-AV-Fusion
ABAW3 (CVPRW): A Joint Cross-Attention Model for Audio-Visual Fusion in Dimensional Emotion Recognition
Language: Python - Size: 148 KB - Last synced at: 8 days ago - Pushed at: over 1 year ago - Stars: 43 - Forks: 9

OpenNLPLab/AVSBench
[ECCV 2022] & [IJCV 2024] Official implementation of the paper: Audio-Visual Segmentation (with Semantics)
Language: Python - Size: 43.8 MB - Last synced at: about 1 month ago - Pushed at: 5 months ago - Stars: 393 - Forks: 35

aromanusc/SoundQ
Enhanced sound event localization and detection in real 360-degree audio-visual soundscapes (DCASE task3 format)
Language: Python - Size: 129 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 7 - Forks: 2

praveena2j/RecurrentJointAttentionwithLSTMs
ICASSP 2023: "Recursive Joint Attention for Audio-Visual Fusion in Regression Based Emotion Recognition"
Language: Python - Size: 253 KB - Last synced at: 8 days ago - Pushed at: 5 months ago - Stars: 12 - Forks: 0

tanshuai0219/EDTalk
[ECCV 2024 Oral] EDTalk - Official PyTorch Implementation
Language: Python - Size: 46.7 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 374 - Forks: 36

praveena2j/Dynamic-CrossAttention
IEEE ICME: "Cross-Attention is not always needed: Dynamic Cross-Attention for Audio-Visual Dimensional Emotion Recognition"
Language: Python - Size: 2.26 MB - Last synced at: about 2 months ago - Pushed at: 5 months ago - Stars: 1 - Forks: 0

alvinliu0/HA2G
[CVPR 2022] Code for "Learning Hierarchical Cross-Modal Association for Co-Speech Gesture Generation"
Language: Python - Size: 2.61 MB - Last synced at: 5 months ago - Pushed at: about 2 years ago - Stars: 129 - Forks: 9

stoneMo/DeepAVFusion
Official codebase for "Unveiling the Power of Audio-Visual Early Fusion Transformers with Dense Interactions through Masked Modeling".
Language: Python - Size: 26.4 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 12 - Forks: 0

Huntersxsx/AVVP-Learning-List
Related papers about Weakly-supervised Audio-Visual Video Parsing (AVVP) & Audio-Visual Event Localization (AVE)
Size: 819 KB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 1 - Forks: 0

roger-tseng/av-superb
A Multi-Task Evaluation Benchmark for Audio-Visual Representation Models (ICASSP 2024)
Language: Python - Size: 64.8 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 32 - Forks: 4

dkurzend/ClipClap-GZSL
Audio-Visual Generalized Zero-Shot Learning using Large Pre-Trained Models
Language: Python - Size: 27.6 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 3 - Forks: 0

kyuyeonpooh/objects-that-sound
Unofficial implementation of the paper "Objects that Sound" (ECCV 2018).
Language: Python - Size: 163 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 32 - Forks: 4

jasongief/PSP_CVPR_2021
[2021 CVPR] Positive Sample Propagation along the Audio-Visual Event Line
Language: Python - Size: 1.19 MB - Last synced at: over 1 year ago - Pushed at: almost 3 years ago - Stars: 37 - Forks: 10

stoneMo/CIGN
Official implementation for CIGN
Language: Python - Size: 5.31 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

rhgao/co-separation
Co-Separating Sounds of Visual Objects (ICCV 2019)
Language: Python - Size: 465 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 78 - Forks: 24

jasongief/CPSP
[2023 TPAMI] Contrastive Positive Sample Propagation along the Audio-Visual Event Line
Language: Python - Size: 498 KB - Last synced at: almost 2 years ago - Pushed at: about 2 years ago - Stars: 16 - Forks: 3

MengyuanChen21/CVPR2023-CMPAE
[CVPR 2023] Collecting Cross-Modal Presence-Absence Evidence for Weakly-Supervised Audio-Visual Event Perception
Language: Python - Size: 1.4 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 10 - Forks: 0

stoneMo/MGN
Official implementation for MGN
Language: Python - Size: 16.6 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 12 - Forks: 0

stoneMo/EZ-VSL
Official Codebase of "Localizing Visual Sounds the Easy Way" (ECCV 2022)
Language: Python - Size: 17.5 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 17 - Forks: 2

yanbeic/CCL
PyTorch implementation of the paper [CVPR 2021] Distilling Audio-Visual Knowledge by Compositional Contrastive Learning
Language: Python - Size: 4.07 MB - Last synced at: about 2 years ago - Pushed at: almost 4 years ago - Stars: 76 - Forks: 11

stoneMo/SLAVC
Official Codebase of "A Closer Look at Weakly-Supervised Audio-Visual Source Localization" (NeurIPS 2022)
Language: Python - Size: 15.6 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 7 - Forks: 1

Tinglok/avstyle
Codebase for the Paper: Learning Visual Styles from Audio-Visual Associations (ECCV 2022, in PyTorch)
Language: Python - Size: 6.59 MB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 12 - Forks: 0

kvilouras/AV-SSRL
MSc Thesis "Audio-Visual Self-Supervised Representation Learning in-the-wild"
Language: Jupyter Notebook - Size: 4.61 MB - Last synced at: almost 2 years ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

ly-zhu/ly-zhu.github.io
Projects webpage
Language: HTML - Size: 41.1 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0
