An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: referring-expression-comprehension

Charles-Xie/awesome-described-object-detection

A curated list of papers and resources related to Described Object Detection, Open-Vocabulary/Open-World Object Detection and Referring Expression Comprehension. Updated frequently; pull requests are welcome.

Size: 40 KB - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 280 - Forks: 21

IDEA-Research/Rex-Thinker

Rex-Thinker: Grounded Object Referring via Chain-of-Thought Reasoning

Language: Python - Size: 27.1 MB - Last synced at: 17 days ago - Pushed at: 17 days ago - Stars: 20 - Forks: 0

henghuiding/ReLA

[CVPR2023 Highlight] GRES: Generalized Referring Expression Segmentation

Language: Python - Size: 2.06 MB - Last synced at: 29 days ago - Pushed at: almost 2 years ago - Stars: 697 - Forks: 20

henghuiding/MeViS

[ICCV 2023] MeViS: A Large-scale Benchmark for Video Segmentation with Motion Expressions

Language: Python - Size: 52.2 MB - Last synced at: 29 days ago - Pushed at: 12 months ago - Stars: 523 - Forks: 21

shikras/d-cube

A detection/segmentation dataset with labels characterized by intricate and flexible expressions. "Described Object Detection: Liberating Object Detection with Flexible Expressions" (NeurIPS 2023).

Language: Python - Size: 835 KB - Last synced at: 25 days ago - Pushed at: over 1 year ago - Stars: 123 - Forks: 7

FoundationVision/GLEE

[CVPR2024 Highlight] GLEE: General Object Foundation Model for Images and Videos at Scale

Language: Python - Size: 22.3 MB - Last synced at: about 1 month ago - Pushed at: 8 months ago - Stars: 1,126 - Forks: 70

OFA-Sys/OFA

Official repository of OFA (ICML 2022). Paper: OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework

Language: Python - Size: 120 MB - Last synced at: about 1 month ago - Pushed at: about 1 year ago - Stars: 2,501 - Forks: 248

lparolari/harlequin

Code and DataLoader for the Harlequin dataset 🎨 described in the paper "Harlequin: Color-driven Generation of Synthetic Data for Referring Expression Comprehension", presented at ICPR'24

Language: Python - Size: 3.42 MB - Last synced at: 1 day ago - Pushed at: 7 months ago - Stars: 3 - Forks: 0

henghuiding/gRefCOCO

A benchmark dataset for GRES and GREC [CVPR2023 Highlight]

Language: Python - Size: 810 KB - Last synced at: 3 months ago - Pushed at: almost 2 years ago - Stars: 229 - Forks: 4

luogen1996/MCN

[CVPR2020 Oral] Multi-task Collaborative Network for Joint Referring Expression Comprehension and Segmentation

Language: Python - Size: 479 KB - Last synced at: 2 months ago - Pushed at: almost 3 years ago - Stars: 138 - Forks: 25

luogen1996/SimREC

A lightweight codebase for referring expression comprehension and segmentation

Language: Python - Size: 346 KB - Last synced at: 2 months ago - Pushed at: about 3 years ago - Stars: 53 - Forks: 4

xuyang-liu16/VGDiffZero

[ICASSP 2024] VGDiffZero: Text-to-image Diffusion Models Can Be Zero-shot Visual Grounders

Language: Python - Size: 1.07 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 14 - Forks: 1

shenyunhang/APE

[CVPR 2024] Aligning and Prompting Everything All at Once for Universal Visual Perception

Language: Python - Size: 49.3 MB - Last synced at: 7 months ago - Pushed at: about 1 year ago - Stars: 490 - Forks: 29

antonio-f/Florence-2-test

Florence-2 quick test

Language: Jupyter Notebook - Size: 3.91 MB - Last synced at: 3 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

MILVLG/rosita

ROSITA: Enhancing Vision-and-Language Semantic Alignments via Cross- and Intra-modal Knowledge Integration

Language: Python - Size: 15.9 MB - Last synced at: about 1 year ago - Pushed at: about 2 years ago - Stars: 55 - Forks: 13

Disguiser15/RefTeacher

RefTeacher is a strong baseline method for Semi-Supervised Referring Expression Comprehension.

Language: Python - Size: 3.76 MB - Last synced at: about 1 year ago - Pushed at: about 2 years ago - Stars: 11 - Forks: 0

MasterBin-IIAU/UNINEXT

[CVPR'23] Universal Instance Perception as Object Discovery and Retrieval

Language: Python - Size: 17.5 MB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 1,332 - Forks: 150

willemsenbram/a-game-of-sorts

Repository for the paper "Collecting Visually-Grounded Dialogue with A Game Of Sorts"

Language: Shell - Size: 3.29 MB - Last synced at: almost 2 years ago - Pushed at: about 2 years ago - Stars: 2 - Forks: 0

Related Keywords
referring-expression-comprehension (18), referring-expression-segmentation (7), object-detection (5), visual-grounding (4), dataset (3), referring-video-object-segmentation (3), open-vocabulary-detection (3), vision-language (2), vision-language-transformer (2), image-captioning (2), multimodal-learning (2), vision-and-language (2), open-world (2), video-object-segmentation (2), video-instance-segmentation (2), tutorial (1), python (1), vision-foundation-model (1), multimodal-large-language-models (1), jupyter-notebook (1), image-to-text (1), huggingface-transformers (1), florence-2 (1), colab-notebook (1), image-segmentation (1), zero-shot-learning (1), vision-language-model (1), text-to-image-generation (1), stable-diffusion (1), image-text-retrieval (1), pre-training (1), vqa (1), semi-supervised-learning (1), instance-segmentation (1), multi-object-tracking-segmentation (1), multiple-object-tracking (1), object-tracking (1), perception (1), single-object-tracking (1), unified-model (1), dialogue (1), referring-expression-generation (1), referring-expressions (1), serious-game (1), visually-grounded-dialogue (1), awesome (1), awesome-list (1), open-world-object-detection (1), grpo (1), mllm (1), cvpr2023 (1), referring-image-segmentation (1), mevis-dataset (1), mose-dataset (1), video-understanding (1), multi-modal-learning (1), foundation-model (1), interactive-segmentation (1), open-vocabulary-segmentation (1), open-vocabulary-video-segmentation (1), segment-anything (1), tracking (1), zero-shot-object-detection (1), chinese (1), multimodal (1), pretrained-models (1), pretraining (1), prompt (1), prompt-tuning (1), text-to-image-synthesis (1), visual-question-answering (1), synthetic-data-generation (1), grefcoco (1), cvpr2020 (1), multi-task-learning (1), computer-vision (1)