Topic: "cross-modality"
jina-ai/clip-as-service
🏄 Scalable embedding, reasoning, ranking for images and sentences with CLIP
Language: Python - Size: 27.4 MB - Last synced at: 1 day ago - Pushed at: about 1 year ago - Stars: 12,640 - Forks: 2,075

THUDM/CogVLM
A state-of-the-art open visual language model | multimodal pre-trained model
Language: Python - Size: 25.8 MB - Last synced at: 3 days ago - Pushed at: 11 months ago - Stars: 6,485 - Forks: 429

KimMeen/Time-LLM
[ICLR 2024] Official implementation of "🦙 Time-LLM: Time Series Forecasting by Reprogramming Large Language Models"
Language: Python - Size: 1.06 MB - Last synced at: 12 days ago - Pushed at: 6 months ago - Stars: 1,928 - Forks: 333

hangzhaomit/Sound-of-Pixels
Codebase for ECCV18 "The Sound of Pixels"
Language: Python - Size: 1.24 MB - Last synced at: 18 days ago - Pushed at: almost 3 years ago - Stars: 378 - Forks: 75

layumi/Image-Text-Embedding
TOMM2020 Dual-Path Convolutional Image-Text Embedding with Instance Loss 🐾 https://arxiv.org/abs/1711.05535
Language: MATLAB - Size: 6.02 MB - Last synced at: 12 days ago - Pushed at: 3 months ago - Stars: 290 - Forks: 73

movienet/movienet-tools
Tools for movie and video research
Language: C++ - Size: 6.56 MB - Last synced at: 4 days ago - Pushed at: almost 3 years ago - Stars: 288 - Forks: 34

haofanwang/awesome-conditional-content-generation
Up-to-date resources for conditional content generation, including human motion generation and image or video generation and editing.
Size: 129 KB - Last synced at: 7 days ago - Pushed at: 9 months ago - Stars: 268 - Forks: 28

bismex/Awesome-cross-modality-person-re-identification
Awesome Cross-modality Person Re-identification
Size: 43.9 KB - Last synced at: 5 days ago - Pushed at: almost 3 years ago - Stars: 147 - Forks: 32

sail-sg/ptp
[CVPR2023] The code for "Position-guided Text Prompt for Vision-Language Pre-training"
Language: Python - Size: 2.37 MB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 133 - Forks: 3

AnjanDutta/sem-pcyc
PyTorch implementation of the paper "Semantically Tied Paired Cycle Consistency for Zero-Shot Sketch-based Image Retrieval", CVPR 2019.
Language: Python - Size: 23 MB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 111 - Forks: 23

Event-AHU/EventVOT_Benchmark
[CVPR-2024] The First High Definition (HD) Event based Visual Object Tracking Benchmark Dataset
Language: Python - Size: 41.6 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 104 - Forks: 5

mangye16/Visible-Thermal-Person-Re-Identification
Demo code for visible thermal (cross-modality) person re-identification
Language: Python - Size: 195 KB - Last synced at: about 2 years ago - Pushed at: about 6 years ago - Stars: 87 - Forks: 18

rhgao/co-separation
Co-Separating Sounds of Visual Objects (ICCV 2019)
Language: Python - Size: 465 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 78 - Forks: 24

ZYK100/LLCM
[CVPR 2023] Diverse Embedding Expansion Network and Low-Light Cross-Modality Benchmark for Visible-Infrared Person Re-identification
Language: Python - Size: 4.99 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 66 - Forks: 8

JDAI-CV/CM-NAS
CM-NAS: Cross-Modality Neural Architecture Search for Visible-Infrared Person Re-Identification (ICCV2021)
Language: Python - Size: 33.2 KB - Last synced at: 12 days ago - Pushed at: over 3 years ago - Stars: 48 - Forks: 13

M-3LAB/awesome-multimodal-brain-image-systhesis
Size: 20.5 KB - Last synced at: 1 day ago - Pushed at: about 2 years ago - Stars: 34 - Forks: 6

AdityaLab/MM4TSA
A professional list of papers and resources on Multi-Modalities For Time Series Analysis (MM4TSA).
Size: 457 KB - Last synced at: 4 days ago - Pushed at: 21 days ago - Stars: 27 - Forks: 0

catalina17/VideoNavQA
An alternative EQA paradigm and informative benchmark + models (BMVC 2019, ViGIL 2019 spotlight)
Language: Python - Size: 5.17 MB - Last synced at: over 1 year ago - Pushed at: almost 3 years ago - Stars: 23 - Forks: 1

zjzsliyang/CrossLeak
Code for the WWW'20 paper "Nowhere to Hide: Cross-modal Identity Leakage between Biometrics and Devices"
Language: Python - Size: 47.9 KB - Last synced at: 5 months ago - Pushed at: over 2 years ago - Stars: 22 - Forks: 5

chenjingong/DN-ReID
[CVPR2024] Day-Night Cross-domain Vehicle Re-identification
Language: Python - Size: 9.46 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 21 - Forks: 1

ZYK100/MMN
PyTorch code for "Towards a Unified Middle Modality Learning for Visible-Infrared Person Re-Identification"
Language: Python - Size: 45.9 KB - Last synced at: almost 2 years ago - Pushed at: about 2 years ago - Stars: 18 - Forks: 3

JacobYuan7/OCN-HOI-Benchmark
[AAAI 2022] Detecting Human-Object Interactions with Object-Guided Cross-Modal Calibrated Semantics.
Language: Python - Size: 1.33 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 13 - Forks: 1

Mithunjha/EarEEG_KnowledgeDistillation
Official implementation of "A Knowledge Distillation Framework for Enhancing Ear-EEG based Sleep Staging with Scalp-EEG Data"
Language: Jupyter Notebook - Size: 174 MB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 5 - Forks: 2

GuiyuZhao/VRHCF
[ICME 2024] VRHCF: Cross-Source Point Cloud Registration via Voxel Representation and Hierarchical Correspondence Filtering
Language: Python - Size: 482 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 4 - Forks: 0

BEAM-Labs/CrossBind
Official PyTorch implementation of CrossBind: Collaborative Cross-Modal Identification of Protein Nucleic-Acid-Binding Residues.
Language: Python - Size: 7.79 MB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 1 - Forks: 0

mx-mark/SPMNet
Source code for "Visually aligned sound generation via sound-producing motion parsing" (Published at Neurocomputing)
Size: 4.88 KB - Last synced at: 12 months ago - Pushed at: about 3 years ago - Stars: 1 - Forks: 0
