Ecosyste.ms: Repos

An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: cross-modal-retrieval

MartinYuanNJU/SEMScene

Code implementation of the paper "SEMScene: Semantic-Consistency Enhanced Multi-Level Scene Graph Matching for Image-Text Retrieval" (ACM TOMM 2024).

Language: Python - Size: 36.6 MB - Last synced: 2 days ago - Pushed: 3 days ago - Stars: 14 - Forks: 0

zjukg/KG-MM-Survey

Knowledge Graphs Meet Multi-Modal Learning: A Comprehensive Survey

Size: 82.2 MB - Last synced: 3 days ago - Pushed: 3 days ago - Stars: 204 - Forks: 13

Paranioar/Awesome_Matching_Pretraining_Transfering

A paper list covering large multi-modality models, parameter-efficient finetuning, vision-language pretraining, and conventional image-text matching, for preliminary insight.

Size: 215 KB - Last synced: about 13 hours ago - Pushed: 2 months ago - Stars: 354 - Forks: 46

jpthu17/DiCoSA

[IJCAI 2023] Text-Video Retrieval with Disentangled Conceptualization and Set-to-Set Alignment

Language: Python - Size: 5.56 MB - Last synced: 6 days ago - Pushed: about 1 month ago - Stars: 42 - Forks: 2

jpthu17/EMCL

[NeurIPS 2022 Spotlight] Expectation-Maximization Contrastive Learning for Compact Video-and-Language Representations

Language: Python - Size: 23.9 MB - Last synced: 6 days ago - Pushed: about 1 month ago - Stars: 98 - Forks: 7

naver-ai/pcmepp

Official PyTorch implementation of "Improved Probabilistic Image-Text Representations" (ICLR 2024)

Language: Python - Size: 15.3 MB - Last synced: 6 days ago - Pushed: about 1 month ago - Stars: 39 - Forks: 1

jpthu17/DiffusionRet

[ICCV 2023] DiffusionRet: Generative Text-Video Retrieval with Diffusion Model

Language: Python - Size: 5.36 MB - Last synced: 6 days ago - Pushed: about 1 month ago - Stars: 99 - Forks: 4

naver-ai/pcme

Official PyTorch implementation of "Probabilistic Cross-Modal Embedding" (CVPR 2021)

Language: Python - Size: 2.11 MB - Last synced: 6 days ago - Pushed: 3 months ago - Stars: 119 - Forks: 17

jpthu17/HBI

[CVPR 2023 Highlight] Video-Text as Game Players: Hierarchical Banzhaf Interaction for Cross-Modal Representation Learning

Language: Python - Size: 51 MB - Last synced: 6 days ago - Pushed: about 1 month ago - Stars: 94 - Forks: 4

kunjmehta/cross-modal-retrieval-food-ai

Course project for 198:536 at Rutgers University. The project covers cross-modal retrieval of food recipes from images and recipe text (ingredients and instructions), using the Recipe1M dataset.

Language: Jupyter Notebook - Size: 5.17 MB - Last synced: 29 days ago - Pushed: over 1 year ago - Stars: 0 - Forks: 1

ailab-kyunghee/CM2_DVC

[CVPR 2024] Do you remember? Dense Video Captioning with Cross-Modal Memory Retrieval

Language: Python - Size: 119 MB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 2 - Forks: 0

naver-ai/eccv-caption

Extended COCO Validation (ECCV) Caption dataset (ECCV 2022)

Language: Python - Size: 771 KB - Last synced: 6 days ago - Pushed: 3 months ago - Stars: 51 - Forks: 2

Paranioar/Awesome_Image_Text_Retrieval_Benchmark

The Unified Code of Image-Text Retrieval for Further Exploration.

Language: Python - Size: 41 KB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 0 - Forks: 0

Paranioar/SGRAF

[AAAI2021] The code of “Similarity Reasoning and Filtration for Image-Text Matching”

Language: Python - Size: 794 KB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 197 - Forks: 37

Paranioar/RCAR

[TIP2023] The code of “Plug-and-Play Regulators for Image-Text Matching”

Language: Python - Size: 1.72 MB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 21 - Forks: 2

jina-ai/clip-as-service

🏄 Scalable embedding, reasoning, ranking for images and sentences with CLIP

Language: Python - Size: 27.4 MB - Last synced: 4 months ago - Pushed: 4 months ago - Stars: 12,060 - Forks: 2,055

layumi/Image-Text-Embedding

TOMM2020 Dual-Path Convolutional Image-Text Embedding 🐾 https://arxiv.org/abs/1711.05535

Language: MATLAB - Size: 6.02 MB - Last synced: 8 days ago - Pushed: 11 months ago - Stars: 280 - Forks: 73

aimh-lab/visione

An AI-powered interactive video retrieval system

Language: JavaScript - Size: 187 MB - Last synced: 2 months ago - Pushed: 2 months ago - Stars: 6 - Forks: 0

YehLi/xmodaler

X-modaler is a versatile and high-performance codebase for cross-modal analytics (e.g., image captioning, video captioning, vision-language pre-training, visual question answering, visual commonsense reasoning, and cross-modal retrieval).

Language: Python - Size: 12.2 MB - Last synced: 3 months ago - Pushed: about 1 year ago - Stars: 996 - Forks: 124

kyuyeonpooh/objects-that-sound

An unofficial implementation of the paper "Objects that Sound" from ECCV 2018.

Language: Python - Size: 163 MB - Last synced: 4 months ago - Pushed: 4 months ago - Stars: 32 - Forks: 4

yalesong/pvse

Polysemous Visual-Semantic Embedding for Cross-Modal Retrieval (CVPR 2019)

Language: Python - Size: 15.9 MB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 128 - Forks: 24

mariyahendriksen/ecir2022_category_to_image_retrieval

This repository contains the code for the paper "Extending CLIP for Category-to-image Retrieval in E-commerce" published at ECIR 2022.

Language: Jupyter Notebook - Size: 169 KB - Last synced: 4 months ago - Pushed: 4 months ago - Stars: 3 - Forks: 0

penghu-cs/UCCH

Unsupervised Contrastive Cross-modal Hashing (IEEE TPAMI 2023, PyTorch Code)

Language: Python - Size: 2.56 MB - Last synced: 5 months ago - Pushed: 5 months ago - Stars: 29 - Forks: 8

jaychempan/SWAN-pytorch

Reducing Semantic Confusion: Scene-aware Aggregation Network for Remote Sensing Cross-modal Retrieval (ICMR'23 Oral)

Language: Python - Size: 2.13 MB - Last synced: 5 months ago - Pushed: 5 months ago - Stars: 19 - Forks: 4

xiaoyuan1996/SemanticLocalizationMetrics

The first research for semantic localization

Language: Python - Size: 41.3 MB - Last synced: 27 days ago - Pushed: 6 months ago - Stars: 19 - Forks: 4

penghu-cs/DSCMR

Deep Supervised Cross-modal Retrieval (CVPR 2019, PyTorch Code)

Language: Python - Size: 10.6 MB - Last synced: 6 months ago - Pushed: over 4 years ago - Stars: 131 - Forks: 24

howard-hou/BagFormer

PyTorch code for BagFormer: Better Cross-Modal Retrieval via bag-wise interaction

Language: Python - Size: 3.44 MB - Last synced: 6 months ago - Pushed: over 1 year ago - Stars: 114 - Forks: 33

mesnico/ALADIN

Official implementation of the paper "ALADIN: Distilling Fine-grained Alignment Scores for Efficient Image-Text Matching and Retrieval"

Language: Python - Size: 17.6 MB - Last synced: 5 months ago - Pushed: 5 months ago - Stars: 16 - Forks: 6

slavabarkov/tidy

Offline semantic Text-to-Image and Image-to-Image search on Android powered by quantized state-of-the-art vision-language pretrained CLIP model and ONNX Runtime inference engine

Language: Kotlin - Size: 99.3 MB - Last synced: 7 months ago - Pushed: 7 months ago - Stars: 27 - Forks: 5

woodfrog/vse_infty

Code for "Learning the Best Pooling Strategy for Visual Semantic Embedding", CVPR 2021

Language: Python - Size: 3.91 MB - Last synced: 7 months ago - Pushed: about 1 year ago - Stars: 136 - Forks: 18

gorjanradevski/SMHA

My master's thesis: Siamese multi-hop attention for cross-modal retrieval.

Language: Python - Size: 2.76 MB - Last synced: 7 months ago - Pushed: about 4 years ago - Stars: 5 - Forks: 0

peri044/STT

A multi-task model which does image captioning, sentence paraphrasing and cross-modal retrieval.

Language: Python - Size: 103 KB - Last synced: 7 months ago - Pushed: over 4 years ago - Stars: 18 - Forks: 6

CLT29/semantic_neighborhoods

Preserving Semantic Neighborhoods for Robust Cross-modal Retrieval [ECCV 2020]

Language: Python - Size: 3.17 MB - Last synced: 7 months ago - Pushed: over 3 years ago - Stars: 9 - Forks: 6

WendellGul/AGAH

Source code for paper "Adversary Guided Asymmetric Hashing for Cross-Modal Retrieval".

Language: Python - Size: 553 KB - Last synced: about 1 month ago - Pushed: over 4 years ago - Stars: 36 - Forks: 11

mariyahendriksen/ecir23-object-centric-vs-scene-centric-CMR

This repository contains the code for the paper "Object-centric vs. Scene-centric Image-Text Cross-modal Retrieval: A Reproducibility Study" published at ECIR 2023.

Language: Python - Size: 12.3 MB - Last synced: 8 months ago - Pushed: 8 months ago - Stars: 4 - Forks: 0

penghu-cs/MvLDAN

Multi-view Linear Discriminant Analysis Network for Cross-modal Retrieval and Cross-view Recognition (Keras&Theano Code)

Language: Python - Size: 38.5 MB - Last synced: 7 months ago - Pushed: over 4 years ago - Stars: 14 - Forks: 5

huycq1712/ViTAA (fork of Jarr0d/ViTAA)

ViTAA: Visual-Textual Attributes Alignment in Person Search by Natural Language

Language: Python - Size: 68.4 KB - Last synced: 8 months ago - Pushed: over 2 years ago - Stars: 0 - Forks: 0

penghu-cs/MRL

Learning Cross-Modal Retrieval with Noisy Labels (CVPR 2021, PyTorch Code)

Language: Python - Size: 23.9 MB - Last synced: 7 months ago - Pushed: about 1 year ago - Stars: 44 - Forks: 10

LivXue/GNN4CMR

PyTorch implementation of the AAAI-21 paper "Dual Adversarial Label-aware Graph Neural Networks for Cross-modal Retrieval" and the TPAMI-22 paper "Integrating Multi-Label Contrastive Learning with Dual Adversarial Graph Neural Networks for Cross-Modal Retrieval".

Language: Python - Size: 596 KB - Last synced: 8 months ago - Pushed: over 1 year ago - Stars: 24 - Forks: 3

GuanRunwei/VehicleFinder-CTIM

Language: Python - Size: 7.13 MB - Last synced: 9 months ago - Pushed: 10 months ago - Stars: 3 - Forks: 0

LivXue/ALGCN

This repository contains the author's implementation in PyTorch for the paper "Adaptive Label-aware Graph Convolutional Networks for Cross-Modal Retrieval".

Language: Python - Size: 906 KB - Last synced: 8 months ago - Pushed: over 2 years ago - Stars: 9 - Forks: 3

klean2050/EEG_CrossModal

[ICASSP 2022] EEG - Music Cross Modal Learning

Language: Python - Size: 849 KB - Last synced: 10 months ago - Pushed: about 2 years ago - Stars: 7 - Forks: 1

ict-bigdatalab/VNEL

Dataset and code for EMNLP 2022 "Visual Named Entity Linking: A New Dataset and A Baseline"

Size: 4.91 MB - Last synced: about 1 year ago - Pushed: about 1 year ago - Stars: 17 - Forks: 0

frank-chris/ImageTextRetrieval

In this work, we implement different cross-modal learning schemes such as Siamese Network, Correlational Network and Deep Cross-Modal Projection Learning model and study their performance. We also propose a modified Deep Cross-Modal Projection Learning model that uses a different image feature extractor. We evaluate the model’s performance on image-text retrieval on a fashion clothing dataset.

Language: Jupyter Notebook - Size: 6.88 MB - Last synced: about 1 year ago - Pushed: over 2 years ago - Stars: 9 - Forks: 2

PreferredAI/sml

Code for the paper "Sentiment-Oriented Metric Learning for Text-to-Image Retrieval", ECIR'21

Language: Python - Size: 958 KB - Last synced: 22 days ago - Pushed: over 2 years ago - Stars: 3 - Forks: 0

mako443/Text2Pos-CVPR2022

Code, dataset and models for our CVPR 2022 publication "Text2Pos"

Language: Python - Size: 450 KB - Last synced: about 1 year ago - Pushed: almost 2 years ago - Stars: 30 - Forks: 3

BrandonHanx/TextReID

[BMVC 2021] Text-Based Person Search with Limited Data

Language: Python - Size: 96.7 KB - Last synced: about 1 year ago - Pushed: almost 2 years ago - Stars: 37 - Forks: 7

ilaria-manco/muscall

Official implementation of "Contrastive Audio-Language Learning for Music" (ISMIR 2022)

Language: Python - Size: 193 KB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 63 - Forks: 5

AyanKumarBhunia/on-the-fly-FGSBIR

[CVPR 2020, Oral] "Sketch Less for More: On-the-Fly Fine-Grained Sketch Based Image Retrieval", IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2020.

Language: Python - Size: 20.7 MB - Last synced: about 1 year ago - Pushed: about 3 years ago - Stars: 55 - Forks: 13

kaylode/tern

Cross-modal retrieval using Transformer Encoder Reasoning Networks (TERN). Uses metric learning and FAISS for fast similarity search on GPU.

Language: Jupyter Notebook - Size: 7.23 MB - Last synced: about 1 year ago - Pushed: over 2 years ago - Stars: 6 - Forks: 1

zhouyu1996/DAQN

An implementation of our paper “DEEP ADVERSARIAL QUANTIZATION NETWORK FOR CROSS-MODAL RETRIEVAL”

Language: Python - Size: 42 KB - Last synced: about 1 year ago - Pushed: about 3 years ago - Stars: 10 - Forks: 3

AkChen/UDIH

TensorFlow implementation of UDIH

Language: Python - Size: 39.1 KB - Last synced: about 1 year ago - Pushed: almost 4 years ago - Stars: 2 - Forks: 1

penghu-cs/MAN

Multimodal Adversarial Network for Cross-modal Retrieval (PyTorch Code)

Language: Python - Size: 8.43 MB - Last synced: about 1 year ago - Pushed: about 4 years ago - Stars: 26 - Forks: 6

penghu-cs/SDML

Scalable deep multimodal learning for cross-modal retrieval (SIGIR 2019, PyTorch Code)

Language: Python - Size: 23.5 MB - Last synced: about 1 year ago - Pushed: almost 4 years ago - Stars: 30 - Forks: 13

penghu-cs/DCHN

Joint Versus Independent Multiview Hashing for Cross-View Retrieval (IEEE TCYB 2021, PyTorch Code)

Language: Python - Size: 267 KB - Last synced: about 1 year ago - Pushed: over 2 years ago - Stars: 4 - Forks: 0

penghu-cs/ISVN

Deep Semisupervised Cross-modal Retrieval/Cross-view Recognition (IEEE TCYB 2022, PyTorch Code)

Language: Python - Size: 1.45 MB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 3 - Forks: 0

gorjanradevski/vsepp_tensorflow

Implementation of "VSE++: Improving Visual-Semantic Embeddings with Hard Negatives" in TensorFlow.

Language: Python - Size: 49.8 KB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 5 - Forks: 0

idealwhite/VLDeformer

PyTorch implementation of the paper "VLDeformer: Vision Language Decomposed Transformer for Fast Cross-modal Retrieval", KBS 2022

Language: Jupyter Notebook - Size: 2.42 MB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 27 - Forks: 3

SahilC/Cross-Modal-Style

An attempt to transfer sentence to image style.

Language: Python - Size: 27.6 MB - Last synced: about 1 month ago - Pushed: over 1 year ago - Stars: 1 - Forks: 0

PrithivirajDamodaran/WhatTheFood

An intentionally simple Image to Food cross-modal search. Created by Prithiviraj Damodaran.

Size: 1000 Bytes - Last synced: about 1 year ago - Pushed: over 2 years ago - Stars: 2 - Forks: 0

b7GsWQMA2XDrdR/VNEL

VNEL (Visual Named Entity Linking) is a new task that takes an image alone as input and performs entity linking on it. It focuses on CBIR (content-based image retrieval), cross-modal retrieval, and multimodal fusion.

Size: 2.26 MB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 1 - Forks: 0

gorjanradevski/cross_modal_full_transfer

PyTorch code for cross-modal retrieval on Flickr8k/30k using BERT and EfficientNet

Language: Python - Size: 72.3 KB - Last synced: about 1 year ago - Pushed: about 4 years ago - Stars: 3 - Forks: 1

LongLong-Jing/XMV

PyTorch implementation for Self-supervised Modal and View Invariant Feature Learning

Size: 7.27 MB - Last synced: 12 months ago - Pushed: almost 4 years ago - Stars: 0 - Forks: 0

dingyh0626/KDD-Cup-Multimodalities-Recall

KDD Cup 2020

Language: Python - Size: 283 KB - Last synced: about 1 year ago - Pushed: almost 4 years ago - Stars: 6 - Forks: 1

hthoai/image-text-matching

Image-Text Matching Model Zoo

Language: Python - Size: 12.7 MB - Last synced: about 1 year ago - Pushed: about 3 years ago - Stars: 1 - Forks: 2

frank-chris/Image-Text-Retrieval-Web-App

Flask Web App for ES-654 Machine Learning course project

Language: Python - Size: 135 KB - Last synced: about 1 year ago - Pushed: about 3 years ago - Stars: 1 - Forks: 2

ranarag/ZSCRGAN

Language: Python - Size: 681 KB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 5 - Forks: 1

sontung/hci-intermodal-reasoning

Fachpraktikum project for Human-computer interaction course

Language: Jupyter Notebook - Size: 6.12 MB - Last synced: about 1 year ago - Pushed: about 4 years ago - Stars: 0 - Forks: 1

Related Keywords
cross-modal-retrieval: 68 - deep-learning: 14 - pytorch: 13 - image-text-matching: 10 - image-retrieval: 8 - image-text-retrieval: 8 - computer-vision: 6 - cross-modal: 5
video-retrieval: 4 - vision-and-language: 4 - multimodal-deep-learning: 4
cvpr: 3 - retrieval: 3 - visual-semantic: 3 - image-captioning: 3 - cross-modal-hashing: 3 - clip: 3 - entity-linking: 3 - cross-modal-learning: 3 - multimodal: 3 - tensorflow: 3 - large-language-models: 3 - nlp: 3 - image-text-search: 3
deep-multimodal-learning: 2 - machine-learning: 2 - python: 2 - vse: 2 - transformer: 2 - text-matching: 2 - cross-modality: 2 - onnx: 2 - image-search: 2 - multimodal-learning: 2 - person-reidentification: 2 - cross-view-recognition: 2 - vision-language: 2 - metric-learning: 2 - mscoco-dataset: 2 - contrastive-learning: 2 - flask: 2 - unsupervised-learning: 2 - multi-modal-learning: 2 - probabilistic-machine-learning: 2 - probabilistic-embeddings: 2 - video-captioning: 2 - video-question-answering: 2 - visual-semantic-embedding: 2 - visual-question-answering: 2
vehicle-retrieval: 1 - graph-convolutional-networks: 1 - affective-computing: 1 - deap: 1 - eeg: 1 - emotion-recognition: 1 - information-extraction: 1 - music: 1 - music-cognition: 1 - image-generation: 1 - image-classification: 1 - entity-alignment: 1 - sentiment-oriented: 1 - cvpr2022: 1 - language-processing: 1 - localization: 1 - transfer-learning: 1 - music-ai: 1 - multi-modal-fusion: 1 - multimodal-representation: 1 - knowledge-graph-embeddings: 1 - common-vector-space: 1 - sentence-paraphrasing: 1 - sequence-to-sequence: 1 - coco: 1 - code: 1 - conceptual-captions: 1 - doc2vec: 1 - eccv2020: 1 - goodnews: 1 - mscoco: 1 - politics: 1 - agah: 1 - knowledge-graph: 1 - noisy-labels: 1 - adversarial-networks: 1 - graph-neural-networks: 1 - increasing-views: 1 - semisupervised-learning: 1 - vsepp: 1 - text-to-image-search: 1 - style-transfer: 1 - bert-model: 1 - efficientnet: 1 - efficientnet-pytorch: 1 - transformers-library: 1 - point-cloud: 1 - self-supervised-learning: 1 - kddcup: 1 - recommender-system: 1 - stacked-cross-attention: 1