multi-modal-learning | Topic | Ecosyste.ms: Repos

Topic: "multi-modal-learning"

mlfoundations/open_clip

An open source implementation of CLIP.

Language: Python - Size: 15 MB - Last synced at: 4 days ago - Pushed at: 23 days ago - Stars: 11,705 - Forks: 1,102

OFA-Sys/Chinese-CLIP

Chinese version of CLIP which achieves Chinese cross-modal retrieval and representation generation.

Language: Python - Size: 2.5 MB - Last synced at: 3 days ago - Pushed at: 9 months ago - Stars: 5,180 - Forks: 497

lyuchenyang/Macaw-LLM

Macaw-LLM: Multi-Modal Language Modeling with Image, Video, Audio, and Text Integration

Language: Python - Size: 35.9 MB - Last synced at: 2 days ago - Pushed at: 5 months ago - Stars: 1,564 - Forks: 126

NVlabs/prismer

The implementation of "Prismer: A Vision-Language Model with Multi-Task Experts".

Language: Python - Size: 4.25 MB - Last synced at: about 16 hours ago - Pushed at: over 1 year ago - Stars: 1,310 - Forks: 73

lucidrains/x-clip

A concise but complete implementation of CLIP with various experimental improvements from recent papers

Language: Python - Size: 1.46 MB - Last synced at: 1 day ago - Pushed at: over 1 year ago - Stars: 711 - Forks: 47

jokieleung/awesome-visual-question-answering

A curated list of Visual Question Answering (VQA) (Image/Video Question Answering), Visual Question Generation, Visual Dialog, Visual Commonsense Reasoning, and related areas.

Size: 179 KB - Last synced at: 9 days ago - Pushed at: almost 2 years ago - Stars: 662 - Forks: 95

OpenRobotLab/EmbodiedScan

[CVPR 2024 & NeurIPS 2024] EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI

Language: Python - Size: 23.3 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 551 - Forks: 40

kyegomez/zeta

Build high-performance AI models with modular building blocks

Language: Python - Size: 41.3 MB - Last synced at: 2 days ago - Pushed at: 4 days ago - Stars: 509 - Forks: 52

DmitryRyumin/CVPR-2023-24-Papers

CVPR 2023-2024 Papers: Dive into advanced research presented at the leading computer vision conference. Keep up to date with the latest developments in computer vision and deep learning. Code included. ⭐ support visual intelligence development!

Language: Python - Size: 10.3 MB - Last synced at: about 1 month ago - Pushed at: 10 months ago - Stars: 447 - Forks: 29

zjukg/KG-MM-Survey

Knowledge Graphs Meet Multi-Modal Learning: A Comprehensive Survey

Size: 82.3 MB - Last synced at: about 1 month ago - Pushed at: 5 months ago - Stars: 401 - Forks: 19

zhengli97/PromptKD

[CVPR 2024] Official PyTorch Code for "PromptKD: Unsupervised Prompt Distillation for Vision-Language Models"

Language: Python - Size: 11.2 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 281 - Forks: 3

Ysz2022/NeRCo

[ICCV 2023] Implicit Neural Representation for Cooperative Low-light Image Enhancement

Language: Python - Size: 1.87 MB - Last synced at: 2 months ago - Pushed at: about 1 year ago - Stars: 231 - Forks: 16

huggingface/chug

Minimal sharded dataset loaders, decoders, and utils for multi-modal document, image, and text datasets.

Language: Python - Size: 146 KB - Last synced at: about 23 hours ago - Pushed at: about 1 year ago - Stars: 157 - Forks: 11

moabarar/nemar

[CVPR2020] Unsupervised Multi-Modal Image Registration via Geometry Preserving Image-to-Image Translation

Language: Python - Size: 161 MB - Last synced at: over 1 year ago - Pushed at: almost 5 years ago - Stars: 153 - Forks: 25

qizekun/ReCon

[ICML 2023] Contrast with Reconstruct: Contrastive 3D Representation Learning Guided by Generative Pretraining

Language: Python - Size: 1.97 MB - Last synced at: about 2 months ago - Pushed at: 10 months ago - Stars: 142 - Forks: 13

GuanRunwei/Achelous

Achelous: A Fast Unified Water-surface Panoptic Perception Framework based on Fusion of Monocular Camera and 4D mmWave Radar

Language: Python - Size: 67.3 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 135 - Forks: 7

shikras/d-cube

A detection/segmentation dataset with labels characterized by intricate and flexible expressions. "Described Object Detection: Liberating Object Detection with Flexible Expressions" (NeurIPS 2023).

Language: Python - Size: 835 KB - Last synced at: 20 days ago - Pushed at: about 1 year ago - Stars: 119 - Forks: 7

wjun0830/CGDETR

Official PyTorch repository for CG-DETR "Correlation-guided Query-Dependency Calibration in Video Representation Learning for Temporal Grounding"

Language: Python - Size: 23.3 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 105 - Forks: 11

924973292/EDITOR

【CVPR2024】Magic Tokens: Select Diverse Tokens for Multi-modal Object Re-Identification

Language: Python - Size: 10.6 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 77 - Forks: 5

likyoo/Multimodal-Remote-Sensing-Toolkit

A Python tool to perform deep learning experiments on multimodal remote sensing data.

Language: Python - Size: 1.01 MB - Last synced at: 11 months ago - Pushed at: over 3 years ago - Stars: 74 - Forks: 12

josedolz/HyperDenseNet_pytorch

Pytorch version of the HyperDenseNet deep neural network for multi-modal image segmentation

Language: Python - Size: 396 KB - Last synced at: over 1 year ago - Pushed at: over 5 years ago - Stars: 74 - Forks: 12

rinnakk/japanese-clip

Japanese CLIP by rinna Co., Ltd.

Language: Python - Size: 574 KB - Last synced at: about 12 hours ago - Pushed at: over 1 year ago - Stars: 72 - Forks: 9

rentainhe/TRAR-VQA

[ICCV 2021] Official implementation of the paper "TRAR: Routing the Attention Spans in Transformers for Visual Question Answering"

Language: Python - Size: 927 KB - Last synced at: about 1 month ago - Pushed at: over 3 years ago - Stars: 66 - Forks: 18

vishalned/MMEarth-data

This repository contains code to download data for the preprint "MMEarth: Exploring Multi-Modal Pretext Tasks For Geospatial Representation Learning"

Language: Python - Size: 114 KB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 65 - Forks: 4

ttgeng233/UnAV

Dense-Localizing Audio-Visual Events in Untrimmed Videos: A Large-Scale Benchmark and Baseline (CVPR 2023)

Language: Python - Size: 19.9 MB - Last synced at: 7 days ago - Pushed at: over 1 year ago - Stars: 63 - Forks: 6

zhjohnchan/awesome-vision-and-language-pretraining

A curated list of vision-and-language pre-training (VLP). :-)

Size: 125 KB - Last synced at: 3 days ago - Pushed at: almost 3 years ago - Stars: 58 - Forks: 7

924973292/Awesome-Multi-Modal-Object-Re-Identification

Welcome to the Awesome Multi-Modal Object Re-Identification Repository! This repository is dedicated to curating and sharing the latest methods, datasets, and resources focused specifically on the domain of multi-modal object re-identification. It brings together cutting-edge research, tools, and papers aimed at advancing the study and application.

Size: 37.1 KB - Last synced at: 5 days ago - Pushed at: 25 days ago - Stars: 55 - Forks: 4

RAIVNLab/sugar-crepe

[NeurIPS 2023] A faithful benchmark for vision-language compositionality

Language: Python - Size: 3.33 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 54 - Forks: 7

3dlg-hcvc/DuoduoCLIP

[ICLR 2025] Duoduo CLIP: Efficient 3D Understanding with Multi-View Images

Language: Python - Size: 51.5 MB - Last synced at: about 1 month ago - Pushed at: about 2 months ago - Stars: 51 - Forks: 3

WillDreamer/Aurora

[NeurIPS2023] Parameter-efficient Tuning of Large-scale Multimodal Foundation Model

Language: Python - Size: 118 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 48 - Forks: 3

deep-symbolic-mathematics/Multimodal-Math-Pretraining

[ICLR 2024 Spotlight] This is the official code for the paper "SNIP: Bridging Mathematical Symbolic and Numeric Realms with Unified Pre-training"

Language: Python - Size: 989 KB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 42 - Forks: 5

richard-peng-xia/HGCLIP

[COLING'25] HGCLIP: Exploring Vision-Language Models with Graph Representations for Hierarchical Understanding

Language: Python - Size: 1.69 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 33 - Forks: 1

filipbasara0/simple-clip

A minimal, but effective implementation of CLIP (Contrastive Language-Image Pretraining) in PyTorch

Language: Jupyter Notebook - Size: 83 KB - Last synced at: 14 minutes ago - Pushed at: over 1 year ago - Stars: 31 - Forks: 5

liyichen-cly/MMEA

MMEA: Entity Alignment for Multi-Modal Knowledge Graphs

Language: Python - Size: 319 KB - Last synced at: over 1 year ago - Pushed at: almost 3 years ago - Stars: 31 - Forks: 4

YuanGongND/uavm

Code for the IEEE Signal Processing Letters 2022 paper "UAVM: Towards Unifying Audio and Visual Models".

Language: Python - Size: 3.28 MB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 29 - Forks: 0

kyegomez/MegaVIT

The open source implementation of the model from "Scaling Vision Transformers to 22 Billion Parameters"

Language: Python - Size: 211 KB - Last synced at: 10 days ago - Pushed at: about 1 month ago - Stars: 28 - Forks: 1

YunzeMan/Situation3D

[CVPR 2024] Situational Awareness Matters in 3D Vision Language Reasoning

Language: Python - Size: 63.3 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 28 - Forks: 2

RL4M/MRM-pytorch

An official implementation of Advancing Radiograph Representation Learning with Masked Record Modeling (ICLR'23)

Language: Python - Size: 224 MB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 28 - Forks: 0

depshad/Deep-Learning-Framework-for-Multi-modal-Product-Classification

Code repository for Rakuten Data Challenge: Multimodal Product Classification and Retrieval.

Language: Jupyter Notebook - Size: 174 KB - Last synced at: about 1 month ago - Pushed at: about 4 years ago - Stars: 26 - Forks: 9

peymanbateni/multimodal-emotion-analysis-in-conversations

Multi-modal analysis of sentiment and emotion in multi-speaker conversations.

Language: Jupyter Notebook - Size: 36.2 MB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 24 - Forks: 6

KnowledgeDiscovery/rca_baselines

Code for "LEMMA-RCA: A Large Multi-modal Multi-domain Dataset for Root Cause Analysis" paper

Language: Python - Size: 1.8 MB - Last synced at: 34 minutes ago - Pushed at: about 2 hours ago - Stars: 20 - Forks: 5

sandipan211/ZSD-SC-Resolver

Resolving semantic confusions for improved zero-shot detection (BMVC 2022)

Language: Python - Size: 77 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 20 - Forks: 4

gaurav104/WSS-CMER

Code for the paper "Weakly supervised segmentation with cross-modality equivariant constraints", available at https://arxiv.org/pdf/2104.02488.pdf

Language: Python - Size: 656 KB - Last synced at: 10 months ago - Pushed at: over 2 years ago - Stars: 20 - Forks: 3

jackyjsy/SAM-SLR-v2

SAM-SLR-v2 is an improved version of SAM-SLR for sign language recognition.

Language: Python - Size: 191 KB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 20 - Forks: 4

ivclab/NeuralMerger

Yi-Min Chou, Yi-Ming Chan, Jia-Hong Lee, Chih-Yi Chiu, Chu-Song Chen, "Unifying and Merging Well-trained Deep Neural Networks for Inference Stage," International Joint Conference on Artificial Intelligence (IJCAI), 2018

Language: Python - Size: 18.5 MB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 20 - Forks: 3

HackerHyper/ACMVH

Adaptive Confidence Multi-View Hashing

Language: Python - Size: 19.5 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 19 - Forks: 0

chenxi52/FrozenSeg

Open-Vocabulary Panoptic Segmentation

Language: Python - Size: 1.11 MB - Last synced at: 6 months ago - Pushed at: 8 months ago - Stars: 18 - Forks: 1

kyegomez/NeVA

The open source implementation of "NeVA: NeMo Vision and Language Assistant"

Language: Python - Size: 253 KB - Last synced at: 10 days ago - Pushed at: over 1 year ago - Stars: 18 - Forks: 1

924973292/IDEA

【CVPR2025】IDEA: Inverted Text with Cooperative Deformable Aggregation for Multi-modal Object Re-Identification

Language: Python - Size: 34.9 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 17 - Forks: 3

fmenat/MultiviewCropClassification

Public repository of our IGARSS 2023 submission

Language: Python - Size: 132 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 16 - Forks: 1

abhrac/xmodal-vit

Official implementation of "Cross-Modal Fusion Distillation for Fine-Grained Sketch-Based Image Retrieval", BMVC 2022.

Language: Python - Size: 330 KB - Last synced at: about 1 year ago - Pushed at: almost 2 years ago - Stars: 16 - Forks: 1

graphprojects/CM-GCL

Source code of NeurIPS 2022 paper “Co-Modality Graph Contrastive Learning for Imbalanced Node Classification”

Language: Python - Size: 3.58 MB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 16 - Forks: 2

zhengli97/ATPrompt

Official PyTorch Code for "ATPrompt: Textual Prompt Learning with Embedded Attributes"

Language: Python - Size: 11.2 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 12 - Forks: 0

deep-symbolic-mathematics/Multimodal-Symbolic-Regression

[ICLR 2024 Spotlight] SNIP on Symbolic Regression: Deep Symbolic Regression with Multimodal Pretraining

Language: Python - Size: 1.29 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 12 - Forks: 3

MMintLab/VIRDO

GitHub repository for Visio-tactile Implicit Representations of Deformable Objects (ICRA 2022)

Language: Jupyter Notebook - Size: 733 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 11 - Forks: 1

tudelft-iv/UniBEV

[IVS'24] UniBEV: the official implementation of UniBEV

Language: Python - Size: 12.3 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 10 - Forks: 1

sayakpaul/Multimodal-Entailment-Baseline

This repository shows how to implement a basic model for multimodal entailment.

Language: Jupyter Notebook - Size: 3.17 MB - Last synced at: 9 days ago - Pushed at: over 3 years ago - Stars: 10 - Forks: 4

mailcorahul/auto_labeler

auto_labeler - An all-in-one library to automatically label vision data

Language: Python - Size: 32.2 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 9 - Forks: 1

LooperXX/ManagerTower

Code for ACL 2023 Oral Paper: ManagerTower: Aggregating the Insights of Uni-Modal Experts for Vision-Language Representation Learning

Language: Python - Size: 6.71 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 9 - Forks: 1

Agora-Lab-AI/EKR

Elysium Knowledge Repository is an open source initiative to embed all of Humanity's multi-modal knowledge and wisdom.

Language: Python - Size: 2.15 MB - Last synced at: 21 days ago - Pushed at: almost 2 years ago - Stars: 9 - Forks: 1

raphaelmemmesheimer/gimme_signals_action_recognition

Multi-Modal action recognition for skeleton sequences, inertial measurements, motion capturing data and Wi-Fi CSI fingerprints.

Language: Python - Size: 521 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 9 - Forks: 2

liveseongho/DramaQA

DramaQA Starter Code (2021)

Language: Python - Size: 69.9 MB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 9 - Forks: 3

TianyiFranklinWang/MIRROR

MIRROR: Multi-Modal Pathological Self-Supervised Representation Learning via Modality Alignment and Retention

Language: Python - Size: 8.12 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 8 - Forks: 0

fullscreen-triangle/four-sided-triangle

A sophisticated multi-model optimization pipeline for domain-expert knowledge extraction RAG systems

Language: Python - Size: 6.56 MB - Last synced at: 3 days ago - Pushed at: 16 days ago - Stars: 8 - Forks: 0

v-iashin/CORSMAL

🏆 🏆 Top-1 Submission to CORSMAL Challenge 2020 (at ICPR). The winning solution for the CORSMAL Challenge (on Intelligent Sensing Summer School 2020)

Language: Jupyter Notebook - Size: 752 MB - Last synced at: 7 months ago - Pushed at: over 3 years ago - Stars: 8 - Forks: 4

fmenat/optimal-multiview-crop-classifier

Public repository of our work in the search for an optimal multi-view crop classifier (considering encoder architectures and fusion strategies)

Language: Python - Size: 9.35 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 6 - Forks: 0

fpsluozi/tofindwaldo

Official Repo for "To Find Waldo You Need Contextual Cues: Debiasing Who’s Waldo", ACL 2022 (Short)

Size: 205 KB - Last synced at: 7 months ago - Pushed at: about 2 years ago - Stars: 6 - Forks: 0

iCVTEAM/M3TR

M3TR: Multi-modal Multi-label Recognition with Transformer. ACM MM 2021

Language: Python - Size: 1.33 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 6 - Forks: 2

yihedeng9/STIC

Enhancing Large Vision Language Models with Self-Training on Image Comprehension.

Language: Python - Size: 5.98 MB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 5 - Forks: 0

ArsamAryandoust/UniversalGNNs

Universal graph neural networks for multi-task transfer learning

Language: Python - Size: 8.58 MB - Last synced at: 7 days ago - Pushed at: about 2 years ago - Stars: 5 - Forks: 2

peixinlei/M2HSE

PyTorch code for the paper "Complementarity is the king: A multi-modal and multi-grained hierarchical semantic enhancement network for cross-modal retrieval"

Language: Python - Size: 82 KB - Last synced at: 11 months ago - Pushed at: over 2 years ago - Stars: 5 - Forks: 0

murali1996/nlp-notes

A curated list of papers and experiments in the field of Natural Language Processing (NLP)

Size: 547 KB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 5 - Forks: 3

kdhht2334/Hidden_Emotion_Detection_using_MM_Signals

[CHI2021] Hidden emotion detection using multi-modal signals

Language: Python - Size: 48 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 5 - Forks: 1

JHKim-snu/PGA

[IROS 2024] PGA: Personalizing Grasping Agents with Single Human-Robot Interaction

Language: Python - Size: 34.8 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 4 - Forks: 0

lemma-rca/lemma-rca.github.io

Code for LEMMA-RCA website

Language: HTML - Size: 3.6 MB - Last synced at: about 7 hours ago - Pushed at: about 8 hours ago - Stars: 3 - Forks: 2

MunzerDw/Gen3DQA

My paper (BMVC23) on 3D visual question answering at the lab of Prof. Dr. Niessner at Technical University of Munich.

Language: Python - Size: 16.5 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 3 - Forks: 0

GuanRunwei/VehicleFinder-CTIM

Language: Python - Size: 7.13 MB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 3 - Forks: 0

lyuchenyang/Semantic-aware-VideoQA

Code for ACL SRW 2023 paper "Semantic-aware Dynamic Retrospective-Prospective Reasoning for Event-level Video Question Answering"

Language: Python - Size: 31.3 KB - Last synced at: 2 months ago - Pushed at: almost 2 years ago - Stars: 3 - Forks: 0

Hleephilip/MLVU-project

Modality Translation through Conditional Encoder-Decoder (2023-1 Machine Learning for Visual Understanding Team project)

Language: Python - Size: 1.64 MB - Last synced at: about 1 year ago - Pushed at: almost 2 years ago - Stars: 3 - Forks: 0

Bekyilma/VA_RecSys

Learning Latent Semantic Representations of Paintings for Personalized Recommendation

Language: PHP - Size: 10.7 MB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 3 - Forks: 0

fmenat/missingviews-study-EO

Public repository of our IGARSS 2024 work

Language: Python - Size: 455 KB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 2 - Forks: 0

ZINZINBIN/Disruption-Prediciton-based-on-Multimodal-Deep-Learning

Research-repository: Disruption Prediction and Analysis through Multimodal Deep Learning in KSTAR

Language: Jupyter Notebook - Size: 196 MB - Last synced at: about 1 month ago - Pushed at: 2 months ago - Stars: 2 - Forks: 0

lif314/NeAF

[AAAI 2025] Representing Sounds as Neural Amplitude Fields: A Benchmark of Coordinate-MLPs and A Fourier Kolmogorov-Arnold Framework

Language: Python - Size: 3.85 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 2 - Forks: 0

amazon-science/contrastive_emc2

Code for the ICML 2024 paper: "EMC^2: Efficient MCMC Negative Sampling for Contrastive Learning with Global Convergence"

Language: Python - Size: 7.77 MB - Last synced at: 3 months ago - Pushed at: 12 months ago - Stars: 2 - Forks: 0

WangJingyao07/ST-F2M

🌈 Official Code for "Spatio-Temporal Fuzzy-oriented Multi-modal Meta-learning for Fine-grained Emotion Recognition"

Language: Python - Size: 12.7 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 2 - Forks: 0

jianzhnie/MultimodalTransformers

lmmtoolkit is a toolkit for Multi-Modal Learning

Language: Python - Size: 22.5 KB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 0

lyuchenyang/Efficient-VideoQA

Code for ACL SustaiNLP 2023 paper "Is a Video worth n × n Images? A Highly Efficient Approach to Transformer-based Video Question Answering"

Language: Python - Size: 28.3 KB - Last synced at: 2 months ago - Pushed at: almost 2 years ago - Stars: 2 - Forks: 0

yookyungkho/Multimodal-Entailment-pytorch

PyTorch implementation of the Multimodal Entailment baseline

Language: Jupyter Notebook - Size: 801 KB - Last synced at: almost 2 years ago - Pushed at: almost 3 years ago - Stars: 2 - Forks: 0

talipucar/DomainTranslation

Pytorch implementation of "Multi-domain translation between single-cell imaging and sequencing data using autoencoders" (https://www.nature.com/articles/s41467-020-20249-2) with custom models.

Language: Python - Size: 2.17 MB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 2 - Forks: 0

STiFLeR7/Multi-Modal-Learning-for-Image-and-Text-Analysis

Develops approaches for jointly analyzing images and text using deep learning. Covers applications like image-text matching, visual question answering, image captioning, and sentiment analysis with visual context.

Language: Python - Size: 873 KB - Last synced at: about 2 months ago - Pushed at: 5 months ago - Stars: 1 - Forks: 0

LIU42/Contrastive

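This project is based on Problem B of the 2024 "Teddy Cup" Data Mining Challenge: a cross-modal image-text mutual retrieval model based on contrastive learning in a shared feature space.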

Language: Python - Size: 20.5 KB - Last synced at: 7 months ago - Pushed at: 8 months ago - Stars: 1 - Forks: 0

ammarlodhi255/metadata-augmented-neural-networks-for-wild-animal-classification

This repository contains the implementation code for the paper "Metadata Augmented Neural Networks For Wild Animal Classification": https://www.sciencedirect.com/science/article/pii/S1574954124003479.

Language: Jupyter Notebook - Size: 8.83 MB - Last synced at: 2 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

DFKI-Earth-And-Space-Applications/MVCC_IGARSS

Public repository of our IGARSS 2023 submission

Language: Python - Size: 132 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 1 - Forks: 1

outta-ai/2023_OUTTA_AIBootcamp_final_project

Language: Python - Size: 1.87 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

joannahong/Visagesyntalk

The video demo of ECCV2022 paper titled "Unseen Speaker Video-to-Speech Synthesis via Speech-Visage Feature Selection"

Size: 44.5 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

Karami-m/Deep-Probabilistic-Multi-View

The code of the paper: M. Karami, D. Schuurmans, "Deep Probabilistic Canonical Correlation Analysis" AAAI 2021

Language: Python - Size: 11.2 MB - Last synced at: almost 2 years ago - Pushed at: about 3 years ago - Stars: 1 - Forks: 0

itsShnik/allForOne

PyTorch implementation of the paper: All For One: Multi-modal Multi-Task Learning

Language: Python - Size: 230 KB - Last synced at: about 2 years ago - Pushed at: almost 5 years ago - Stars: 1 - Forks: 0

kjanjua26/Do_Cross_Modal_Systems_Leverage_Semantic_Relationships

This is the code for our ICCV'19 paper on cross-modal learning and retrieval.

Size: 3.21 MB - Last synced at: about 2 years ago - Pushed at: almost 5 years ago - Stars: 1 - Forks: 1

aneeuk/four-sided-triangle

A sophisticated multi-model optimization pipeline for domain-expert knowledge extraction RAG systems

Language: Python - Size: 6.57 MB - Last synced at: about 6 hours ago - Pushed at: about 7 hours ago - Stars: 0 - Forks: 0

eieye/BLN_ABC

A primer for basic literacy development | A "preliminary course" ("Vorkurs") for literacy in German as a second language

Language: JavaScript - Size: 1.5 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0