Ecosyste.ms: Repos

An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: cross-modal-learning

whwu95/Text4Vis

【AAAI'2023 & IJCV】Transferring Vision-Language Models for Visual Recognition: A Classifier Perspective

Language: Python - Size: 8.66 MB - Last synced: 6 months ago - Pushed: 6 months ago - Stars: 149 - Forks: 13

KimMeen/Time-LLM

[ICLR 2024] Official implementation of "🦙 Time-LLM: Time Series Forecasting by Reprogramming Large Language Models"

Language: Python - Size: 1.04 MB - Last synced: 29 days ago - Pushed: 29 days ago - Stars: 765 - Forks: 121

RunpeiDong/ACT

[ICLR 2023] Autoencoders as Cross-Modal Teachers: Can Pretrained 2D Image Transformers Help 3D Representation Learning?

Language: Python - Size: 5.49 MB - Last synced: 2 months ago - Pushed: 2 months ago - Stars: 87 - Forks: 5

MohamedAfham/CrossPoint

Official implementation of "CrossPoint: Self-Supervised Cross-Modal Contrastive Learning for 3D Point Cloud Understanding" (CVPR, 2022)

Language: Python - Size: 1.99 MB - Last synced: 2 months ago - Pushed: about 1 year ago - Stars: 221 - Forks: 29

Toytiny/CMFlow

[CVPR 2023 Highlight] Hidden Gems: 4D Radar Scene Flow Learning Using Cross-Modal Supervision

Language: Python - Size: 289 MB - Last synced: 3 months ago - Pushed: 11 months ago - Stars: 101 - Forks: 12

Markin-Wang/CAMANet

[IJBHI 2023] Official implementation of CAMANet: Class Activation Map Guided Attention Network for Radiology Report Generation, accepted to the IEEE Journal of Biomedical and Health Informatics (J-BHI), 2023.

Language: Python - Size: 115 MB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 3 - Forks: 0

codiceSpaghetti/T4SA-2.0

This project creates the T4SA 2.0 dataset, i.e. a big set of data to train visual models for Sentiment Analysis in the Twitter domain using a cross-modal student-teacher approach.

Language: Jupyter Notebook - Size: 2.73 GB - Last synced: about 2 months ago - Pushed: over 1 year ago - Stars: 3 - Forks: 1

whwu95/Cap4Video

【CVPR'2023 Highlight】Cap4Video: What Can Auxiliary Captions Do for Text-Video Retrieval?

Language: Python - Size: 8.56 MB - Last synced: 4 months ago - Pushed: 4 months ago - Stars: 159 - Forks: 10

whwu95/BIKE

【CVPR'2023】Bidirectional Cross-Modal Knowledge Exploration for Video Recognition with Pre-trained Vision-Language Models

Language: Python - Size: 9.03 MB - Last synced: 4 months ago - Pushed: 4 months ago - Stars: 134 - Forks: 14

IGITUGraz/MemoryDependentComputation

Code for Limbacher, T., Özdenizci, O., & Legenstein, R. (2022). Memory-enriched computation and learning in spiking neural networks through Hebbian plasticity. arXiv preprint arXiv:2205.11276.

Language: Python - Size: 3.63 MB - Last synced: 8 months ago - Pushed: about 1 year ago - Stars: 6 - Forks: 3

verlab/StraightToThePoint_CVPR_2020

Original PyTorch implementation of the code for the paper "Straight to the Point: Fast-forwarding Videos via Reinforcement Learning Using Textual Data" at the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020

Language: Python - Size: 27.4 MB - Last synced: 10 months ago - Pushed: about 2 years ago - Stars: 8 - Forks: 1

frank-chris/ImageTextRetrieval

In this work, we implement different cross-modal learning schemes such as Siamese Network, Correlational Network and Deep Cross-Modal Projection Learning model and study their performance. We also propose a modified Deep Cross-Modal Projection Learning model that uses a different image feature extractor. We evaluate the model’s performance on image-text retrieval on a fashion clothing dataset.

Language: Jupyter Notebook - Size: 6.88 MB - Last synced: about 1 year ago - Pushed: almost 3 years ago - Stars: 9 - Forks: 2

mako443/Text2Pos-CVPR2022

Code, dataset and models for our CVPR 2022 publication "Text2Pos"

Language: Python - Size: 450 KB - Last synced: about 1 year ago - Pushed: almost 2 years ago - Stars: 30 - Forks: 3

choyingw/Cross-Modal-Perceptionist

CVPR 2022: Cross-Modal Perceptionist: Can Face Geometry be Gleaned from Voices?

Language: Python - Size: 7.81 MB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 96 - Forks: 11

Qwinpin/DanceBERT-Masked-Motion-Modeling

Language: Jupyter Notebook - Size: 3.91 MB - Last synced: over 1 year ago - Pushed: about 3 years ago - Stars: 2 - Forks: 0

PrithivirajDamodaran/WhatTheFood

An intentionally simple Image to Food cross-modal search. Created by Prithiviraj Damodaran.

Size: 1000 Bytes - Last synced: over 1 year ago - Pushed: over 2 years ago - Stars: 2 - Forks: 0

kjanjua26/Do_Cross_Modal_Systems_Leverage_Semantic_Relationships

This is the code for our ICCV'19 paper on cross-modal learning and retrieval.

Size: 3.21 MB - Last synced: about 1 year ago - Pushed: almost 4 years ago - Stars: 1 - Forks: 1
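Each entry above ends with a metadata line of the form "Key: value - Key: value". As a rough sketch for readers who want to work with this listing programmatically, the line can be split into a dictionary. The helper name `parse_metadata` is hypothetical and is not part of the Ecosyste.ms API; it only assumes the separator conventions visible in the entries above.

```python
# Hypothetical helper: the field names come from the listing above,
# but this parser is an illustration, not part of the Ecosyste.ms API.
def parse_metadata(line: str) -> dict[str, str]:
    """Split a 'Key: value - Key: value' metadata line into a dictionary."""
    fields = {}
    for part in line.split(" - "):
        # Split on the first ": " only, so the value is kept whole.
        key, _, value = part.partition(": ")
        fields[key.strip()] = value.strip()
    return fields


line = (
    "Language: Python - Size: 8.66 MB - Last synced: 6 months ago"
    " - Pushed: 6 months ago - Stars: 149 - Forks: 13"
)
record = parse_metadata(line)
stars = int(record["Stars"])  # numeric fields arrive as strings

# Some entries carry no "Language" field, so use .get() for optional keys.
no_language = parse_metadata(
    "Size: 1000 Bytes - Last synced: over 1 year ago"
    " - Pushed: over 2 years ago - Stars: 2 - Forks: 0"
)
language = no_language.get("Language")
```

Values such as sizes and relative dates are kept as plain strings here; converting them to numbers or timestamps would need further assumptions about the units used.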

Related Keywords
cross-modal-learning (17), deep-learning (5), computer-vision (4), pytorch (4), video-understanding (3), cvpr (3), cross-modal-retrieval (3), cvpr2022 (2), cross-modal (2), tensorflow (2), nlp (2), video-language-understanding (2), self-supervised-learning (2), 3d-point-clouds (2), action-recognition (2), transfer-learning (2), video-recognition (2), reinforcement-learning (2), machine-learning (2), multimodal-deep-learning (2), video-summarization (1), vision-and-language (1), fast-forward (1), hyperlapse (1), flask (1), image-text-retrieval (1), video-processing (1), multimodal-learning (1), agent (1), video-fast-forward (1), video-analysis (1), text-and-image (1), semantic-similarity (1), scene-understanding (1), retrieval-systems (1), retrieval (1), multi-modal-learning (1), iccv (1), caption-retreival (1), multimodal (1), motion-generation (1), dance-generation (1), bert (1), speech-to-face (1), speech-synthesis (1), speech (1), cognitive-science (1), biometrics (1), 3dmm (1), 3d-models (1), 3d (1), localization (1), language-processing (1), spiking-neural-networks (1), mobile-robotics (1), ego-motion-estimation (1), autonomous-driving (1), automotive-radar (1), 4d-radar (1), unsupervised-learning (1), point-cloud (1), object-classification (1), few-shot-learning (1), representation-learning (1), time-series-forecasting (1), time-series-forecast (1), time-series-analysis (1), time-series (1), prompt-tuning (1), multimodal-time-series (1), large-language-models (1), language-model (1), cross-modality (1), recurrent-neural-networks (1), question-answering (1), pythorch (1), python (1), one-shot-learning (1), neural-networks (1), memory-networks (1), hebbian-learning (1), babi-tasks (1), associations (1), video-text-retrieval (1), twitter-sentiment-analysis (1), student-teacher-learning (1), dataset-creation (1), radiology-report-generation (1), medical-report-generation (1), scene-flow (1), optical-flow (1), motion-segmentation (1)