An open API service providing repository metadata for many open source software ecosystems.

Topic: "image-text-matching"

NVlabs/GroupViT

Official PyTorch implementation of GroupViT: Semantic Segmentation Emerges from Text Supervision, CVPR 2022.

Language: Python - Size: 8.04 MB - Last synced at: 16 days ago - Pushed at: about 3 years ago - Stars: 761 - Forks: 54

Paranioar/Awesome_Matching_Pretraining_Transfering

A paper list covering Large Multi-Modality Models (Perception, Generation, Unification), Parameter-Efficient Finetuning, Vision-Language Pretraining, and Conventional Image-Text Matching, for preliminary insight.

Size: 369 KB - Last synced at: 8 days ago - Pushed at: 6 months ago - Stars: 422 - Forks: 48

Paranioar/SGRAF

[AAAI2021] The code of “Similarity Reasoning and Filtration for Image-Text Matching”

Language: Python - Size: 794 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 197 - Forks: 37

woodfrog/vse_infty

Code for "Learning the Best Pooling Strategy for Visual Semantic Embedding", CVPR 2021

Language: Python - Size: 3.91 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 136 - Forks: 18

kywen1119/DSRAN

Code for journal paper "Learning Dual Semantic Relations with Graph Attention for Image-Text Matching", TCSVT, 2020.

Language: Python - Size: 30.7 MB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 68 - Forks: 12

naver-ai/eccv-caption

Extended COCO Validation (ECCV) Caption dataset (ECCV 2022)

Language: Python - Size: 771 KB - Last synced at: 5 days ago - Pushed at: over 1 year ago - Stars: 55 - Forks: 2

weiyx16/CLIP-pytorch 📦

A non-JIT implementation / replication of OpenAI's CLIP in PyTorch

Language: Python - Size: 233 KB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 34 - Forks: 4

jaisidhsingh/LoRA-CLIP

Easy wrapper for inserting LoRA layers in CLIP.

Language: Python - Size: 60.5 KB - Last synced at: 5 days ago - Pushed at: about 1 year ago - Stars: 33 - Forks: 2

jaisidhsingh/CoN-CLIP

Implementation of the "Learn No to Say Yes Better" paper.

Language: Python - Size: 4.36 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 31 - Forks: 2

slavabarkov/tidy

Offline semantic Text-to-Image and Image-to-Image search on Android powered by quantized state-of-the-art vision-language pretrained CLIP model and ONNX Runtime inference engine

Language: Kotlin - Size: 99.3 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 27 - Forks: 5

eric-ai-lab/ComCLIP

Official implementation and dataset for the NAACL 2024 paper "ComCLIP: Training-Free Compositional Image and Text Matching"

Language: Python - Size: 7.86 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 22 - Forks: 0

Paranioar/RCAR

[TIP2023] The code of “Plug-and-Play Regulators for Image-Text Matching”

Language: Python - Size: 1.72 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 21 - Forks: 2

MartinYuanNJU/SEMScene

Code implementation of paper "SEMScene: Semantic-Consistency Enhanced Multi-Level Scene Graph Matching for Image-Text Retrieval" (ACM TOMM 2024).

Language: Python - Size: 36.6 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 20 - Forks: 0

alipay/PC2-NoiseofWeb

Noise of Web (NoW) is a challenging noisy correspondence learning (NCL) benchmark containing 100K image-text pairs for robust image-text matching/retrieval models.

Language: Python - Size: 13.6 MB - Last synced at: 2 months ago - Pushed at: 7 months ago - Stars: 12 - Forks: 1

zabir-nabil/bangla-image-search

A dead-simple image search / retrieval and image-text matching system for Bangla using CLIP

Language: Python - Size: 234 KB - Last synced at: 3 months ago - Pushed at: about 2 years ago - Stars: 12 - Forks: 4

zabir-nabil/bangla-CLIP

CLIP (Contrastive Language–Image Pre-training) for Bangla.

Language: Python - Size: 1.27 MB - Last synced at: 22 days ago - Pushed at: 12 months ago - Stars: 10 - Forks: 3

nhtlongcs/AIC2022-VER

Text Query based Traffic Video Event Retrieval with Global-Local Fusion Embedding

Language: Python - Size: 12.6 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 10 - Forks: 1

cuiaiyu/Text-to-Image-ReIdentification

Unofficial code of paper "Improving description-based person re-identification by multi-granularity image-text alignment." by Niu et al. (partially implemented)

Language: Jupyter Notebook - Size: 1.34 MB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 7 - Forks: 3

Paranioar/DBL

[TIP2024] The code of “Deep Boosting Learning: A Brand-new Cooperative Approach for Image-Text Matching”

Language: Python - Size: 783 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 6 - Forks: 0

kaylode/tern

Cross-modal Retrieval using Transformer Encoder Reasoning Networks (TERN), with Metric Learning and FAISS for fast similarity search on GPU

Language: Jupyter Notebook - Size: 7.23 MB - Last synced at: over 2 years ago - Pushed at: over 3 years ago - Stars: 6 - Forks: 1

Paranioar/GSSF

[TIP2024] The code of "GSSF: Generalized Structural Sparse Function for Deep Cross-modal Metric Learning"

Size: 5.86 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 5 - Forks: 0

marialymperaiou/knowledge-enhanced-multimodal-learning

A list of research papers on knowledge-enhanced multimodal learning

Size: 20.5 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 5 - Forks: 0

Paranioar/Awesome_Image_Text_Retrieval_Benchmark

The Unified Code of Image-Text Retrieval for Further Exploration.

Language: Python - Size: 41 KB - Last synced at: 11 days ago - Pushed at: about 1 year ago - Stars: 2 - Forks: 0

mrzjy/GenshinCLIP

A simple open-sourced SigLIP model finetuned on Genshin Impact's image-text pairs.

Size: 1.06 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

basic-go-ahead/wikipedia-image-caption-matching

The 3rd place solution code for the Wikipedia - Image/Caption Matching Competition on Kaggle

Language: Jupyter Notebook - Size: 1.67 MB - Last synced at: over 2 years ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 1

hthoai/image-text-matching

Image-Text Matching Model Zoo

Language: Python - Size: 12.7 MB - Last synced at: over 2 years ago - Pushed at: over 4 years ago - Stars: 1 - Forks: 2

shayan55579/CMSL-MLP-ImageText-Matching

A novel image-text matching model using Cross-Modal Space Learning with MLP aggregation, designed to bridge the semantic gap between images and texts for improved recall and matching efficiency.

Language: Jupyter Notebook - Size: 1.71 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

gaurav104/Image-Text-Matching

Language: Python - Size: 61.5 KB - Last synced at: 12 months ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

Cbhihe/NLP_clip-bleu-meteor

Python Implementation of lexical vector embedding similarity scoring, zero-shot classification of images and n-gram based scoring to compare textual summaries

Language: Jupyter Notebook - Size: 4.81 MB - Last synced at: 4 months ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

Related Topics
cross-modal-retrieval (13), image-text-retrieval (12), pytorch (7), clip (7), image-retrieval (6), deep-learning (6), multimodal (4), text-matching (4), vision-and-language (4), computer-vision (3), tip (3), visual-semantic (2), nlp (2), image-search (2), parameter-efficient-tuning (2), openai-clip (2), contrastive-language-image-pre-training (2), clip-image-search-engine (2), compositionality (2), multimodal-deep-learning (2), dataset (2), benchmark (2)

1 each: person-identification, bangla-clip-search, text-embedding, mlp, cross-model-retrieval, deep-learning-image-search, image-search-engine, visual-language-models, search, image-captions, search-engine, scene-graph-models, bimodal, tutorial, image-matching, visual-storytelling, visual-reasoning, bangla-image-retrieval, visual-question-answering, visual-grounding, visual-dialog, visual-commonsense-reasoning, vision-language-transformer, vision-and-language-pre-training, vision-and-language-navigation, story-visualization, multimodal-retrieval, bangla-image-search, text-to-video-generation, text-to-image-synthesis, text-to-image-generation, video-text-recognition, parameter-efficient-fine-tuning, multimodal-pretraining, video-text-retrieval, multimodal-large-language-models, visual-semantic-embedding, lora, memory-efficient-tuning, large-vision-models, large-vision-language-models, large-language-models, low-rank-adaptation, large-language-model, awesome-list, vector-embeddings, scoring-algorithm, vision-language-pretraining, rouge, python, nltk, nlp-machine-learning, n-grams, meteor, bleu, multi-task-learning, kotlin, android, vse, vision-language, vl-benchmark, machine-learning, evaluation, eccv2022, re-identification, metric-research, boosting-learning, search-relevance, natural-language-processing, matcher, kaggle-competition, kaggle, stacked-cross-attention, image-captioning, noisy-correspondence, multimodal-learning, captioning-images, acmmm2024