An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: multimodal-transformer

Snehil-Shah/Multimodal-Image-Search-Engine

Text to Image & Reverse Image Search Engine built upon Vector Similarity Search utilizing CLIP VL-Transformer for Semantic Embeddings & Qdrant as the Vector-Store

Language: Jupyter Notebook - Size: 10.9 MB - Last synced at: 22 days ago - Pushed at: over 1 year ago - Stars: 11 - Forks: 3

VachanVY/Transfusion.torch

PyTorch Implementation of Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model

Language: Python - Size: 2.07 MB - Last synced at: 28 days ago - Pushed at: 7 months ago - Stars: 17 - Forks: 4

pabloggarc/TFG

Image classification and text assignment using convolutional neural networks and multimodal transformers

Language: Jupyter Notebook - Size: 275 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

MILVLG/mt-captioning

A PyTorch implementation of the paper Multimodal Transformer with Multiview Visual Representation for Image Captioning

Language: Python - Size: 101 MB - Last synced at: 12 months ago - Pushed at: over 4 years ago - Stars: 24 - Forks: 7

yikaiw/TokenFusion

[CVPR 2022] Code release for "Multimodal Token Fusion for Vision Transformers"

Language: Python - Size: 3.98 MB - Last synced at: almost 2 years ago - Pushed at: almost 3 years ago - Stars: 98 - Forks: 9

Bachfischer/COMP90042-Rumour-Detection-on-Twitter

Source code for COMP90042 Project 2021

Language: Jupyter Notebook - Size: 8.46 MB - Last synced at: about 2 years ago - Pushed at: almost 4 years ago - Stars: 1 - Forks: 0