GitHub topics: multimodal-transformer
Snehil-Shah/Multimodal-Image-Search-Engine
Text-to-image and reverse image search engine built on vector similarity search, using the CLIP vision-language transformer for semantic embeddings and Qdrant as the vector store
Language: Jupyter Notebook - Size: 10.9 MB - Last synced at: 22 days ago - Pushed at: over 1 year ago - Stars: 11 - Forks: 3

VachanVY/Transfusion.torch
PyTorch Implementation of Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model
Language: Python - Size: 2.07 MB - Last synced at: 28 days ago - Pushed at: 7 months ago - Stars: 17 - Forks: 4

pabloggarc/TFG
Image classification and text assignment using convolutional neural networks and multimodal transformers
Language: Jupyter Notebook - Size: 275 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

MILVLG/mt-captioning
A PyTorch implementation of the paper "Multimodal Transformer with Multiview Visual Representation for Image Captioning"
Language: Python - Size: 101 MB - Last synced at: 12 months ago - Pushed at: over 4 years ago - Stars: 24 - Forks: 7

yikaiw/TokenFusion
[CVPR 2022] Code release for "Multimodal Token Fusion for Vision Transformers"
Language: Python - Size: 3.98 MB - Last synced at: almost 2 years ago - Pushed at: almost 3 years ago - Stars: 98 - Forks: 9

Bachfischer/COMP90042-Rumour-Detection-on-Twitter
Source code for COMP90042 Project 2021
Language: Jupyter Notebook - Size: 8.46 MB - Last synced at: about 2 years ago - Pushed at: almost 4 years ago - Stars: 1 - Forks: 0
