GitHub topics: video-language-understanding
keshik6/HourVideo
[NeurIPS 2024] Official code for HourVideo: 1-Hour Video Language Understanding
Language: Jupyter Notebook - Size: 8.16 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 67 - Forks: 3

bigai-nlco/VideoLLaMB
Official Repository of VideoLLaMB: Long Video Understanding with Recurrent Memory Bridges
Language: Python - Size: 59.4 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 63 - Forks: 2

whwu95/Cap4Video
【CVPR'2023 Highlight & TPAMI】Cap4Video: What Can Auxiliary Captions Do for Text-Video Retrieval?
Language: Python - Size: 8.58 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 240 - Forks: 20

whwu95/BIKE
【CVPR'2023】Bidirectional Cross-Modal Knowledge Exploration for Video Recognition with Pre-trained Vision-Language Models
Language: Python - Size: 9.01 MB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 155 - Forks: 18

doc-doc/NExT-GQA
Can I Trust Your Answer? Visually Grounded Video Question Answering (CVPR'24, Highlight)
Language: Python - Size: 5.64 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 33 - Forks: 1

MikeWangWZHL/Paxion
Repo for paper: "Paxion: Patching Action Knowledge in Video-Language Foundation Models" (NeurIPS'23 Spotlight)
Language: Python - Size: 2.83 MB - Last synced at: about 1 year ago - Pushed at: almost 2 years ago - Stars: 27 - Forks: 1

zinengtang/DeCEMBERT
PyTorch version of DeCEMBERT: Learning from Noisy Instructional Videos via Dense Captions and Entropy Minimization (NAACL 2021)
Language: Python - Size: 215 KB - Last synced at: about 1 month ago - Pushed at: over 2 years ago - Stars: 17 - Forks: 1

jena-shreyas/Awesome-Video-Language-Resources
A repository of Video Language papers, code and datasets.
Size: 7.81 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

houzhijian/GroundNLQ
The champion solution for the Ego4D Natural Language Queries Challenge at CVPR 2023
Language: Python - Size: 2.59 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 11 - Forks: 0

doc-doc/CoVGT
Contrastive Video Question Answering via Video Graph Transformer (IEEE T-PAMI'23)
Language: Python - Size: 6.01 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 14 - Forks: 1

sail-sg/VGT
Video Graph Transformer for Video Question Answering (ECCV'22)
Language: Python - Size: 454 KB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 38 - Forks: 9

houzhijian/CONE
[2023 ACL] CONE: An Efficient COarse-to-fiNE Alignment Framework for Long Video Temporal Grounding
Language: Python - Size: 1.66 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 12 - Forks: 1

Maddy12/SSL4VideoSurvey
The official GitHub page for the survey paper "Self-Supervised Learning for Videos: A Survey"
Size: 665 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0
