GitHub topics: multimodal-deep-learning
eftekhar-hossain/Multimodal-Sentiment-LREC2022
This repository contains the relevant materials of the LREC-22 paper.
Language: Jupyter Notebook - Size: 5.56 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 1

UofLBioinformatics/circDeep
End-to-End learning framework for circular RNA classification from other long non-coding RNA using multimodal deep learning
Language: Python - Size: 47.2 MB - Last synced at: over 1 year ago - Pushed at: over 5 years ago - Stars: 21 - Forks: 14

cap-ntu/Video-to-Retail-Platform
An intelligent multimodal-learning based system for video, product and ads analysis. Based on the system, people can build a lot of downstream applications such as product recommendation, video retrieval, etc.
Language: Python - Size: 65.7 MB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 138 - Forks: 43

XavierSpycy/Deep-Learning
Deep learning projects
Language: Jupyter Notebook - Size: 51.7 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

penghu-cs/MRL
Learning Cross-Modal Retrieval with Noisy Labels (CVPR 2021, PyTorch Code)
Language: Python - Size: 23.9 MB - Last synced at: over 1 year ago - Pushed at: about 2 years ago - Stars: 44 - Forks: 10

talipucar/DomainAdaptation
A model for Domain Adaptation, Alignment and Translation using multiple sources of data.
Language: Python - Size: 46.5 MB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 8 - Forks: 1

liuzwin98/DSCMT
code released
Language: Python - Size: 64.5 KB - Last synced at: over 1 year ago - Pushed at: about 2 years ago - Stars: 3 - Forks: 0

iamdanialkamali/MemotionAnalysis
Meme Sentiment Analysis SemEval 2020 Task 8
Language: Jupyter Notebook - Size: 1.93 MB - Last synced at: about 2 months ago - Pushed at: over 5 years ago - Stars: 4 - Forks: 0

IsaacRodgz/ConcatBERT
Baseline model for multimodal classification based on images and text. Text representation obtained from pretrained BERT base model and image representation obtained from VGG16 pretrained model.
Language: Jupyter Notebook - Size: 306 KB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 29 - Forks: 6

husseinmozannar/multimodal-deep-learning-for-disaster-response
Damage Identification in Social Media Posts using Multimodal Deep Learning: code and dataset
Language: Python - Size: 62.5 KB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 43 - Forks: 16

Merterm/COSMic
Public repo for the paper: "COSMic: A Coherence-Aware Generation Metric for Image Descriptions" by Mert İnan, Piyush Sharma, Baber Khalid, Radu Soricut, Matthew Stone, Malihe Alikhani
Language: Python - Size: 396 KB - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 4 - Forks: 0

HackerHyper/CLIPMH
CLIPMH: CLIP Multi-modal Hashing
Language: Python - Size: 1.12 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 13 - Forks: 0

Etienne-bobo/Skimlit-Nlp
The purpose of this project is to build an NLP model to make reading medical abstracts easier.
Language: Jupyter Notebook - Size: 1.95 MB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

shOh-ai/Personalized_Emotion-Analysis_using_Multi-modal_DL
2023 1st semester -BigDataProject Team Project Page
Language: Jupyter Notebook - Size: 233 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 2

SmithaUpadhyaya/fashion_image_caption
Automated fashion image captioning using BLIP-2. Automatically generates descriptions of clothes on shopping websites, which can help customers without fashion knowledge better understand the features (attributes, style, functionality, etc.) of the items and increase online sales by enticing more customers.
Language: Jupyter Notebook - Size: 26.6 MB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 12 - Forks: 1

marcomoldovan/multimodal-self-distillation
A generalized self-supervised training paradigm for unimodal and multimodal alignment and fusion.
Language: Python - Size: 526 KB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 4 - Forks: 2

abhishekpaul11/Disturbance-Detection-MIM
Instant Messaging App built on React Native with backend deployed on AWS DynamoDB and S3 using AWS Amplify (API Querying handled by GraphQL). The distracting content is filtered with the help of a Multi-modal Deep Learning architecture hosted on an AWS EC2 instance.
Language: Jupyter Notebook - Size: 40.4 MB - Last synced at: almost 2 years ago - Pushed at: almost 3 years ago - Stars: 1 - Forks: 1

bupt-mmai/S2TD
code for "S2TD: A Tree-Structured Decoder for Image Paragraph Captioning" accepted by MMAsia 2021
Language: Python - Size: 68.7 MB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 5 - Forks: 0

clairecyq/whos-waldo
Who's Waldo? Linking People Across Text and Images. ICCV 2021.
Language: Python - Size: 2.86 MB - Last synced at: about 1 year ago - Pushed at: almost 2 years ago - Stars: 12 - Forks: 4

benoriol/memes_processing
Language: Python - Size: 41 KB - Last synced at: almost 2 years ago - Pushed at: over 5 years ago - Stars: 0 - Forks: 6

gchochla/Deep-Representations-of-Visual-Descriptions
PyTorch implementation of CVPR'16 paper "Learning Deep Representations of Fine-Grained Visual Descriptions", by Reed et al.
Language: Python - Size: 6.83 MB - Last synced at: almost 2 years ago - Pushed at: over 4 years ago - Stars: 14 - Forks: 1

HWH-2000/Awesome-paper-for-multimodal
record some related papers on multimodality
Size: 14.6 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

vijayvee/text-to-image-synthesis Fork of artifacia/text-to-image-synthesis
Project to transform a natural language description into an image using Generative Adversarial Networks.
Language: Python - Size: 68.4 KB - Last synced at: almost 2 years ago - Pushed at: over 7 years ago - Stars: 0 - Forks: 1

mobled37/utils
Deep learning utils for multimodal research
Language: Python - Size: 2.93 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

soloist97/densecap-pytorch
A simplified PyTorch version of densecap
Language: Jupyter Notebook - Size: 5.1 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 34 - Forks: 8

zch42/BiFusion
Language: Python - Size: 2.08 MB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 32 - Forks: 9

AndreiMoraru123/ContextCollector
Mixed vision-language Attention Model that gets better by making mistakes
Language: Python - Size: 149 MB - Last synced at: about 10 hours ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 0

efthymisgeo/multimodal-masking
This repo contains source code for the MultiModal Masking (M^3) Interspeech 2021 paper.
Language: Python - Size: 1.91 MB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 6 - Forks: 0

nicolopinci/deepgravilens
Language: Python - Size: 2.97 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

selamgit/EfficientNet_for_Endoscopic_Images_Response_Prediction
PyTorch implementation of EfficientNet for response prediction
Language: Jupyter Notebook - Size: 11.8 MB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

vishaal27/Multimodal-Video-Emotion-Recognition-Pytorch
A PyTorch implementation of emotion recognition from videos
Language: Python - Size: 1.19 MB - Last synced at: almost 2 years ago - Pushed at: over 4 years ago - Stars: 14 - Forks: 1

TIBHannover/MM_Claims
Official code repository for the paper: Gullal Singh Cheema, Sherzod Hakimov, Abdul Sittar, Eric Müller-Budack, Christian Otto, and Ralph Ewerth. 2022. “MM-Claims: A Dataset for Multimodal Claim Detection in Social Media.” In Findings of the Association for Computational Linguistics: NAACL 2022, pages 962–979, Seattle, United States.
Language: Python - Size: 42.3 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 2 - Forks: 2

georgesterpu/Taris
Transformer-based online speech recognition system with TensorFlow 2
Language: Python - Size: 5.4 MB - Last synced at: about 1 year ago - Pushed at: over 4 years ago - Stars: 25 - Forks: 6

danadascalescu00/MultimodalOpinionAnalysis 📦
Bachelor Thesis: Opinion Polarity Classification - Given a tweet consisting of an image and text, classify the post on a three-point scale
Language: Python - Size: 101 KB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 3 - Forks: 0

AmbiTyga/MemSem
A Multi-modal Framework for Sentiment Analysis of Memes
Language: Python - Size: 4.59 MB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 16 - Forks: 5

annkamsk/mvae
Multimodal Variational Autoencoder dedicated to omics data integration
Language: Jupyter Notebook - Size: 544 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

HLTCHKUST/VG-GPLMs
The code repository for EMNLP 2021 paper "Vision Guided Generative Pre-trained Language Models for Multimodal Abstractive Summarization".
Language: Python - Size: 9.32 MB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 49 - Forks: 8

fabiopernisi/Visual-WSD
This repository contains the code for our solution to Task 1 of the 17th International Workshop on Semantic Evaluation (SemEval-2023)
Language: Jupyter Notebook - Size: 9.35 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 2 - Forks: 0

PrachiJainxD/AmbientAI_IMU2CLIP
COMPSCI 696DS Industry Mentorship Program with Meta Reality Labs: Ambient AI: Multimodal Wearable Sensor Understanding (Experiments in Distilling Knowledge in Cross-Modal Contrastive Learning.)
Language: Python - Size: 23.1 MB - Last synced at: about 1 year ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0

VityaVitalich/IMAD
IMAD: IMage Augmented multi-modal Dialogue
Language: Python - Size: 1.11 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

UsefGamal/Visual-Question-Answering-VQA
A multimodal project in which a vision model used to understand images is concatenated with an NLP model used to understand questions, in order to provide answers based on both questions and images
Language: Jupyter Notebook - Size: 2.19 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

theavicaster/featurehallucination-cgan
Uses C-GAN for feature hallucination of missing modalities for hyperspectral data. TensorFlow implementation of ICCV '19 paper
Language: Python - Size: 564 MB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 9 - Forks: 1

verlab/StraightToThePoint_CVPR_2020
Original PyTorch implementation of the code for the paper "Straight to the Point: Fast-forwarding Videos via Reinforcement Learning Using Textual Data" at the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020
Language: Python - Size: 27.4 MB - Last synced at: almost 2 years ago - Pushed at: about 3 years ago - Stars: 8 - Forks: 1

scotthlee/enriched-LSTMs
Classifying multimodal health data with LSTMs
Language: Python - Size: 36.1 KB - Last synced at: almost 2 years ago - Pushed at: about 2 years ago - Stars: 2 - Forks: 2

04mayukh/R2D2-at-SemEval-2022-Task-5-MAMI
This repository contains the code for the submission made to SemEval 2022 Task 5: MAMI
Language: Jupyter Notebook - Size: 562 KB - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 1 - Forks: 0

denizlab/MIMICCXR-MultiModal-SelfSupervision
Multi-Modal and Self-Supervised learning Benchmark for MIMIC-CXR
Language: Python - Size: 191 KB - Last synced at: almost 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

abs711/The-way-of-the-future
A dataset of egocentric vision, eye-tracking and full body kinematics from human locomotion in out-of-the-lab environments. Also, different use cases of the dataset along with example code.
Language: Python - Size: 48.3 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 6 - Forks: 0

sutdcv/multi-modal-video-reasoning
[ICCV2021 Workshop] Multi-Modal Video Reasoning and Analyzing Competition
Language: JavaScript - Size: 8.77 MB - Last synced at: 9 months ago - Pushed at: almost 3 years ago - Stars: 2 - Forks: 1

sk-aravind/3D-Bounding-Boxes-From-Monocular-Images
A two-stage multi-modal loss model along with rigid body transformations to regress 3D bounding boxes
Language: Python - Size: 9.46 MB - Last synced at: about 2 years ago - Pushed at: about 6 years ago - Stars: 43 - Forks: 18

Netherlands-Cancer-Institute/Multimodal_attention_DeepLearning
Multi-modal deep learning with attention mechanism
Language: Python - Size: 2.39 MB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 5 - Forks: 0

asnelt/mmae
Package for Multimodal Autoencoders in TensorFlow / Keras
Language: Python - Size: 28.3 KB - Last synced at: 13 days ago - Pushed at: almost 5 years ago - Stars: 18 - Forks: 12

sverma88/DeepCU-IJCAI19
DeepCU: Integrating Both Common and Unique Latent Information for Multimodal Sentiment Analysis, IJCAI-19
Language: Python - Size: 36.7 MB - Last synced at: almost 2 years ago - Pushed at: over 5 years ago - Stars: 19 - Forks: 8

SAIC-MONTREAL/multimodal-dynamics
Code for AAAI 2021 paper "Learning Intuitive Physics with Multimodal Generative Models"
Language: Python - Size: 192 KB - Last synced at: almost 2 years ago - Pushed at: about 4 years ago - Stars: 11 - Forks: 2

cfcooney/BiModNeuroCNN
Package for bimodal training of deep neural networks on neurological data. Pypi: https://pypi.org/project/BiModNeuroCNN/
Language: Python - Size: 137 KB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 5 - Forks: 1

basiralab/MultiGraphGAN
MultiGraphGAN for predicting multiple target graphs from a source graph using geometric deep learning.
Language: Python - Size: 21.8 MB - Last synced at: almost 2 years ago - Pushed at: about 2 years ago - Stars: 17 - Forks: 4

Fuzzytariy/CMF-DGCN
A Chinese Sentiment Analysis Model based on Transmembrane State Attention for Modal Fusion and Multimodal Dynamic Gradient Regulation.
Language: Python - Size: 4.04 MB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 3 - Forks: 0

nmagal/modality_drop_for_colearning
Repo containing code for Negative Co-learning to Positive Co-learning with Aggressive Modality Drop
Language: Jupyter Notebook - Size: 134 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 1 - Forks: 0

marialymperaiou/knowledge-enhanced-multimodal-learning
A list of research papers on knowledge-enhanced multimodal learning
Size: 20.5 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 5 - Forks: 0

RunyuFan/STNet
Code for JAG 2022 paper "Urban informal settlements classification via a transformer-based spatial-temporal fusion network using multimodal remote sensing and time-series human activity data"
Language: Python - Size: 71.3 KB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 1

04mayukh/Memebusters-at-SemEval-2020-Task-8-Memotion-Analysis
This repository contains the code for the submission made to SemEval 2020 Task 8: Memotion Analysis.
Language: Jupyter Notebook - Size: 55 MB - Last synced at: over 1 year ago - Pushed at: about 4 years ago - Stars: 4 - Forks: 0

marcomoldovan/cross-modal-speech-segment-retrieval
Learning a common representation space from speech and text for cross-modal retrieval given textual queries and speech files.
Language: Python - Size: 216 KB - Last synced at: over 1 year ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 1

marcomoldovan/3d-attention-video-understanding
Using a 3D Nearby Self-Attention Transformer to leverage the spatiotemporal nature of video for representation learning.
Language: Python - Size: 33.2 KB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

A2Zadeh/Social-IQ
[CVPR 2019 Oral] Social-IQ: A Question Answering Benchmark for Artificial Social Intelligence
Language: Python - Size: 2.71 MB - Last synced at: about 2 years ago - Pushed at: over 5 years ago - Stars: 41 - Forks: 5

JianqiangWan/VLPT-STD
Vision-Language Pre-Training for Boosting Scene Text Detectors (CVPR2022)
Size: 4.88 KB - Last synced at: about 2 years ago - Pushed at: about 3 years ago - Stars: 11 - Forks: 0

carlosholivan/AudioGenerationDiffusion
State-of-the-art of Audio Generation with Diffusion Models
Size: 179 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 3 - Forks: 1

Neerajj9/Stacked-Attention-Networks-for-Visual-Question-Answering
Implementation of the paper "Stacked Attention Networks for Image Question Answering" in TensorFlow
Language: Python - Size: 15.3 MB - Last synced at: about 2 years ago - Pushed at: over 5 years ago - Stars: 13 - Forks: 4

eslambakr/LAR-Look-Around-and-Refer
This is the official implementation for our paper: "LAR: Look Around and Refer".
Language: C++ - Size: 45 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 15 - Forks: 2

oskar-j/awesome-multimodal-ml
List of materials for the topic of multimodal models
Size: 3.91 KB - Last synced at: 7 days ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

ShowMeModel/transformers-multimodal-example
Example of a multimodal (end-to-end) deep learning model with transformers architecture
Size: 1.95 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

ikmb/PIA-inference
The peptide immune annotator pipeline (PIA-P): a collection of Bash and Python scripts used to run peptide HLA interaction from a variety of inputs
Language: Python - Size: 14.8 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

TIBHannover/multimodal-misogyny-detection-mami-2022
Multimodal Misogyny Detection - SemEval 2022 - MAMI Challenge
Language: Python - Size: 648 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 2

david-yoon/attentive-modality-hopping-for-SER
TensorFlow implementation of "Attentive Modality Hopping for Speech Emotion Recognition," ICASSP-20
Language: Python - Size: 53.7 KB - Last synced at: about 2 years ago - Pushed at: almost 5 years ago - Stars: 27 - Forks: 8

SAGNIKMJR/move2hear-active-AV-separation
Code and datasets for 'Move2Hear: Active Audio-Visual Source Separation' (ICCV 2021)
Language: Python - Size: 1.31 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 12 - Forks: 0

ag2307/ConVIRT-Federated
Language: Jupyter Notebook - Size: 600 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

kritika-gupta/multi-modal-music-genre-classification
Final project for CS 7643 : Deep Learning (Fall 2022, Georgia Tech)
Language: Jupyter Notebook - Size: 20.7 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

Damorgal/Multimodal-Research-experiments
All experiments were done to classify multimodal data.
Size: 161 MB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

Nithin-Holla/meme_challenge
Repository containing code from team Kingsterdam for the Hateful Memes Challenge
Language: Python - Size: 1.36 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 17 - Forks: 8

eftekhar-hossain/Multimodal-Disaster_IEEE-Access
This repository contains the related resources of a multimodal deep learning project.
Language: Jupyter Notebook - Size: 4.05 MB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

l-yohai/Look-Attend-and-Generate-Poem Fork of boostcampaitech2/final-project-level3-nlp-08
A web service with an AI poet that looks at images and writes poems.
Language: Python - Size: 24.8 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 0

IsaacRodgz/Multimodal-Transformer
Multimodal version of transformer for classification using text and image
Language: Python - Size: 2.93 MB - Last synced at: about 2 years ago - Pushed at: almost 5 years ago - Stars: 4 - Forks: 1

soloist97/region-hierarchical-pytorch
Implementation of a baseline method for image paragraph captioning
Language: Python - Size: 69.2 MB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 8 - Forks: 1

koushikvikram/multimodal-image-retrieval
📝🔍🖼️ A deep learning application for retrieving images by searching with text.
Language: Jupyter Notebook - Size: 382 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 3 - Forks: 0

koninik/multimodal_machine_translation
A PyTorch implementation of a Transformer Network for Machine Translation that incorporates image features to enhance the performance of the translation
Language: Python - Size: 59.1 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

prasoonvarshney/Multimodal-Transformer Fork of yaohungt/Multimodal-Transformer
Adding Bottlenecked Fusion to [ACL'19] Multimodal Transformer
Language: Python - Size: 351 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

El-Zag/Multimodal-Video-Captioning
Master's thesis on Multimodal Video Captioning, done at Huawei's Research Center in Amsterdam.
Language: Python - Size: 2.3 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 3 - Forks: 0

gorjanradevski/vsepp_tensorflow
Implementation of "VSE++: Improving Visual-Semantic Embeddings with Hard Negatives" in TensorFlow.
Language: Python - Size: 49.8 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 5 - Forks: 0

GUT-AI/automated-data-preprocessing
Automated Data Preprocessing
Size: 48.8 KB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 1

xiaoxiaoheimei/SeqDialN
Code for reproducing results in our paper SeqDialN: Sequential Visual Dialog Networks in Joint Visual-Linguistic Representation Space.
Language: Python - Size: 76.2 KB - Last synced at: about 2 years ago - Pushed at: almost 4 years ago - Stars: 5 - Forks: 1

IsaacRodgz/Multimodal-Adapters
Adapter modules with support for multimodal fusion of information (text, video, audio, etc.) using pre-trained BERT base model
Language: Jupyter Notebook - Size: 6.46 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 1

IsaacRodgz/multimodal-transformers-movies
Experiments with multimodal deep learning models based on transformers
Language: Jupyter Notebook - Size: 10.7 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 4 - Forks: 1

gtatiya/Deep-Multi-Sensory-Object-Categorization
Deep Multi-Sensory Object Category Recognition Using Interactive Behavioral Exploration
Language: Jupyter Notebook - Size: 2.65 MB - Last synced at: about 2 years ago - Pushed at: over 5 years ago - Stars: 15 - Forks: 8

celestialxevermore/CLIP2AE
AI-multimodal: Modeling a new text-video retrieval framework
Language: Jupyter Notebook - Size: 1.68 GB - Last synced at: almost 2 years ago - Pushed at: almost 3 years ago - Stars: 3 - Forks: 1

shbz80/fb_marketplace_reco
Facebook Marketplace is a platform for buying and selling products on Facebook. This project involves training a multimodal deep neural network model that predicts the category of a product based on its image and text description.
Language: Jupyter Notebook - Size: 4.17 MB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 1 - Forks: 0

gorjanradevski/cross_modal_full_transfer
PyTorch code for cross-modal retrieval on Flickr8k/30k using BERT and EfficientNet
Language: Python - Size: 72.3 KB - Last synced at: about 2 years ago - Pushed at: about 5 years ago - Stars: 3 - Forks: 1

aidotse/multimodal-skin-lesion-classification
Multimodality for skin lesion classification
Language: Python - Size: 10.7 KB - Last synced at: about 2 years ago - Pushed at: about 3 years ago - Stars: 2 - Forks: 1

yubin1219/deep_learning_music
Deep Learning for Music & Audio - Multi-modal project
Language: Jupyter Notebook - Size: 4.51 MB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

SRM-IST-KTR/disturbance-detection-in-messaging-apps-using-machine-learning-e5d7h9m7
A fully deployable React Native mobile app that classifies incoming messages in messaging apps as important or disturbing, using a multi-modal machine learning architecture for text classification, image classification, and YouTube video link classification.
Language: Jupyter Notebook - Size: 40.4 MB - Last synced at: 8 months ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

candacelax/bias-in-vision-and-language
Code for paper "Measuring Social Biases in Grounded Vision and Language Embeddings"
Language: Shell - Size: 11.7 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 8 - Forks: 1

dh1105/Multi-modal-movie-genre-prediction
A multi-modal deep learning model trained to predict a movie's genre given the movie poster and overview as an input.
Language: Jupyter Notebook - Size: 362 KB - Last synced at: about 2 years ago - Pushed at: almost 5 years ago - Stars: 12 - Forks: 10

library-of-code/deep-learning Fork of pclubiitk/model-zoo
Implementations of various Deep Learning models in PyTorch and TensorFlow.
Language: Jupyter Notebook - Size: 56.1 MB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 1 - Forks: 1
