An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: video-grounding

TheShadow29/awesome-grounding

awesome grounding: A curated list of research papers in visual grounding

Size: 187 KB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 1,080 - Forks: 102

showlab/UniVTG

[ICCV 2023] UniVTG: Towards Unified Video-Language Temporal Grounding

Language: Python - Size: 22.7 MB - Last synced at: 10 days ago - Pushed at: about 1 year ago - Stars: 356 - Forks: 34

ttengwang/Awesome_Long_Form_Video_Understanding

Awesome papers & datasets specifically focused on long-term videos.

Size: 44.9 KB - Last synced at: 8 days ago - Pushed at: 9 months ago - Stars: 283 - Forks: 12

fletcherjiang/LLMEPET

[MM'24 Oral] Prior Knowledge Integration via LLM Encoding and Pseudo Event Regulation for Video Moment Retrieval

Language: Python - Size: 9.38 MB - Last synced at: 2 months ago - Pushed at: 11 months ago - Stars: 126 - Forks: 11

Tangkfan/Awesome-Temporal-Video-Grounding

paper list on Video Moment Retrieval (VMR), or Temporal Video Grounding (TVG), Video Grounding (VG), or Temporal Sentence Grounding in Videos (TSGV)

Size: 59.6 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 6 - Forks: 0

mbzuai-oryx/Video-LLaVA

PG-Video-LLaVA: Pixel Grounding in Large Multimodal Video Models

Language: Python - Size: 18.8 MB - Last synced at: 4 months ago - Pushed at: over 1 year ago - Stars: 257 - Forks: 12

ekazakos/grove

Code implementation for the paper "Large-scale Pre-training for Grounded Video Caption Generation"

Size: 8.59 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 5 - Forks: 0

sming256/TimeLoc

TimeLoc: A Unified End-to-End Framework for Precise Timestamp Localization in Long Videos

Size: 1.95 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 2 - Forks: 0

minjoong507/BM-DETR

[WACV 2025] Official PyTorch code for "Background-aware Moment Detection for Video Moment Retrieval"

Language: Python - Size: 3.07 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 14 - Forks: 0

ZhenZHAO/awesome-video-moment-retrieval

paper list on Video Moment Retrieval (VMR), or Natural Language Video Localization (NLVL), or Temporal Sentence Grounding in Videos (TSGV)

Size: 1.53 MB - Last synced at: 2 days ago - Pushed at: over 2 years ago - Stars: 31 - Forks: 1

sutdcv/Animal-Kingdom

[CVPR2022] Animal Kingdom: A Large and Diverse Dataset for Animal Behavior Understanding

Language: Python - Size: 165 MB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 128 - Forks: 12

wjun0830/CGDETR

Official PyTorch repository for CG-DETR "Correlation-guided Query-Dependency Calibration in Video Representation Learning for Temporal Grounding"

Language: Python - Size: 23.3 MB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 105 - Forks: 11

YooEunseok/DSTC10-Track4-Task2

DSTC10 (The 10th Dialogue System Technology Challenge) Track4 (Reasoning for Audio Visual Scene) - Task2 (Aware Dialog - Grounding for QAs)

Language: Python - Size: 863 KB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 0 - Forks: 0

doc-doc/NExT-GQA

Can I Trust Your Answer? Visually Grounded Video Question Answering (CVPR'24, Highlight)

Language: Python - Size: 5.64 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 33 - Forks: 1

MichiganCOG/Video-Grounding-from-Text

Source code for "Weakly-Supervised Video Object Grounding from Text by Loss Weighting and Object Interaction"

Language: Python - Size: 4.25 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 43 - Forks: 9

minjoong507/MPGN

[EMNLP 2022] PyTorch code for "Modal-specific Pseudo Query Generation for Video Corpus Moment Retrieval"

Language: Python - Size: 73.2 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 5 - Forks: 0

sangminwoo/Explore-And-Match

Official PyTorch implementation of "Explore-And-Match: Bridging Proposal-Based and Proposal-Free With Transformer for Sentence Grounding in Videos"

Language: Python - Size: 4.11 MB - Last synced at: about 1 year ago - Pushed at: almost 3 years ago - Stars: 42 - Forks: 2

sunoh-kim/pps

PyTorch implementation of the paper "Gaussian Mixture Proposals with Pull-Push Learning Scheme to Capture Diverse Events for Weakly Supervised Temporal Video Grounding" (AAAI 2024).

Language: Python - Size: 17.9 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

sunoh-kim/PLRN

Official PyTorch implementation of the Position-aware Location Regression Network (PLRN), presented in the paper "Position-aware Location Regression Network for Temporal Video Grounding".

Language: Python - Size: 292 KB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 6 - Forks: 0

zjr2000/GVL

Official implementation for paper Learning Grounded Vision-Language Representation for Versatile Understanding in Untrimmed Videos

Language: Python - Size: 109 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 19 - Forks: 6

JaywongWang/CBP

Official TensorFlow implementation of the AAAI-2020 paper "Temporally Grounding Language Queries in Videos by Contextual Boundary-aware Prediction"

Language: Python - Size: 28.7 MB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 58 - Forks: 9

r-cui/ViGA

"Video Moment Retrieval from Text Queries via Single Frame Annotation" in SIGIR 2022.

Language: Python - Size: 4.59 MB - Last synced at: about 2 years ago - Pushed at: about 3 years ago - Stars: 53 - Forks: 4

henryhungle/MM_DST

Code for the paper Multimodal Dialogue State Tracking (NAACL22)

Language: Python - Size: 1.24 MB - Last synced at: over 2 years ago - Pushed at: about 3 years ago - Stars: 5 - Forks: 1

jshi31/NAFAE

Implementation of paper "Not All Frames Are Equal: Weakly-Supervised Video Grounding with Contextual Similarity and Visual Clustering Losses"

Language: Python - Size: 4.32 MB - Last synced at: over 2 years ago - Pushed at: about 5 years ago - Stars: 28 - Forks: 6

TheShadow29/vognet-pytorch

[CVPR20] Video Object Grounding using Semantic Roles in Language Description (https://arxiv.org/abs/2003.10606)

Language: Python - Size: 3.45 MB - Last synced at: over 2 years ago - Pushed at: about 5 years ago - Stars: 67 - Forks: 7

JaywongWang/TGN

TensorFlow reproduction of the EMNLP-2018 paper "Temporally Grounding Natural Sentence in Video"

Language: Python - Size: 26.3 MB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 15 - Forks: 4

Related Keywords
video-grounding (26), video-moment-retrieval (6), moment-retrieval (4), vision-and-language (4), multimodal-learning (4), pytorch (4), temporal-sentence-grounding (3), video-understanding (3), temporal-grounding (3), grounding (3), computer-vision (3), temporal-action-localization (2), nlp (2), dense-video-captioning (2), pytorch-implementation (2), video-summarization (2), video-language (2), attention-mechanism (2), video (2), highlight-detection (2), visual-grounding (2), captioning-videos (2), action-localization (2), video-retrieval (2), llm (2), video-question-answering (1), trustworthy-vqa (1), video-language-understanding (1), videoqa (1), dstc10 (1), deep-learning (1), text-video-retrieval (1), multi-modal-learning (1), detr (1), detection-transformer (1), pose-estimation (1), multi-label-action-recognition (1), meta-learning (1), long-tailed-distribution (1), vision (1), video-object-grounding (1), object-grounding (1), weakly-supervised (1), transformer-architecture (1), machinelearning (1), dialoguestatetracker (1), dialogue-systems (1), dialogue (1), video-analysis (1), temporal-localization (1), representation-learning (1), long-video-understanding (1), moment-localization (1), gaussian-mixture (1), natural-language-video-localization (1), youcook2-boundingbox (1), youcook2 (1), visual-evidence-grounding (1), video-llms (1), video-large-language-models (1), video-dataset (1), temporal-action-detection (1), long-term-video (1), audio-visual-event-localization (1), pretraining (1), phrase-grounding (1), papers (1), paper-roadmap (1), paper (1), natural-language-processing (1), multimodal-deep-learning (1), language-grounding (1), image-grounding (1), embodied-agent (1), captioning-images (1), awesome-list (1), arxiv (1), ethology (1), dataset (1), cvpr2022 (1), animal-behavioral-understanding (1), animal-behavior (1), action-recognition (1), natural-language-queries (1), vision-language (1), video-language-pretrainng (1), video-language-model (1), video-captioning (1), large-scale-pretraining (1), automatic-annotation (1), video-conversation (1), transcription (1), lmm (1), vllm (1), temporal-video-grounding (1), video-representation-learning (1)