multimodal-datasets | Topic | Ecosyste.ms: Repos

Topic: "multimodal-datasets"

salesforce/LAVIS

LAVIS - A One-stop Library for Language-Vision Intelligence

Language: Jupyter Notebook - Size: 79.3 MB - Last synced at: about 1 month ago - Pushed at: 7 months ago - Stars: 10,558 - Forks: 1,031

remyxai/VQASynth

Compose multimodal datasets 🎹

Language: Python - Size: 17.5 MB - Last synced at: 5 days ago - Pushed at: 17 days ago - Stars: 413 - Forks: 17

AnkurDeria/MFT

PyTorch implementation of Multimodal Fusion Transformer for Remote Sensing Image Classification.

Language: Jupyter Notebook - Size: 2.13 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 130 - Forks: 8

yuanxiaosc/Multimodal-short-video-dataset-and-baseline-classification-model

A dataset of 500,000 multimodal short videos and baseline models (TensorFlow 2.0).

Language: Jupyter Notebook - Size: 16.1 MB - Last synced at: about 2 months ago - Pushed at: almost 6 years ago - Stars: 128 - Forks: 36

wisdomikezogwo/quilt1m

[NeurIPS 2023 Oral] Quilt-1M: One Million Image-Text Pairs for Histopathology.

Language: Python - Size: 1.08 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 110 - Forks: 9

marslanm/Multimodality-Representation-Learning

This repository provides a comprehensive collection of research papers on multimodal representation learning, all of which are cited and discussed in the recently accepted survey at https://dl.acm.org/doi/abs/10.1145/3617833.

Size: 63.3 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 75 - Forks: 7

drmuskangarg/Multimodal-datasets

This repository was built in association with our position paper, "Multimodality for NLP-Centered Applications: Resources, Advances and Frontiers". As part of this release, we share information about recent multimodal datasets that are available for research purposes. We found that although 100+ multimodal language resources are available in the literature for various NLP tasks, publicly available multimodal datasets remain under-explored for reuse in subsequent problem domains.

Size: 243 KB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 68 - Forks: 4

roboflow/rf100-vl

Code from the paper "Roboflow100-VL: A Multi-Domain Object Detection Benchmark for Vision-Language Models"

Language: Python - Size: 8.57 MB - Last synced at: about 10 hours ago - Pushed at: 25 days ago - Stars: 67 - Forks: 3

Yuco-Z/Awesome-Multi-Modal-Dialog

[Paperlist] Awesome paper list of multimodal dialog, including methods, datasets and metrics

Size: 169 KB - Last synced at: 1 day ago - Pushed at: 5 months ago - Stars: 39 - Forks: 4

piresramon/gpt-4-enem

Code and data to evaluate LLMs on the ENEM, the main standardized Brazilian university admission exam.

Language: Python - Size: 34.3 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 37 - Forks: 10

JunweiLiang/FVTA_MemexQA

Real-world photo sequence question answering system (MemexQA). CVPR'18 and TPAMI'19

Language: Python - Size: 723 KB - Last synced at: 2 months ago - Pushed at: almost 6 years ago - Stars: 32 - Forks: 15

ddw2AIGROUP2CQUPT/Large-Scale-Multimodal-Face-Datasets

Millions-Level Face/Human-Scene Image-Text Datasets

Size: 19.5 KB - Last synced at: 23 days ago - Pushed at: 23 days ago - Stars: 15 - Forks: 0

lujiaying/MUG-Bench

Data and code for the Findings of EMNLP'23 paper "MuG: A Multimodal Classification Benchmark on Game Data with Tabular, Textual, and Visual Fields".

Language: Python - Size: 97.7 KB - Last synced at: 4 days ago - Pushed at: over 1 year ago - Stars: 9 - Forks: 1

OlehOnyshchak/pyWikiMM

Collects a multimodal dataset of Wikipedia articles and their images

Language: Python - Size: 7.78 MB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 9 - Forks: 1

gcunhase/AnnotatedMV-PreProcessing

Pre-Processing of Annotated Music Video Corpora (COGNIMUSE and DEAP)

Language: Python - Size: 1.1 MB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 5 - Forks: 0

clp-research/language-models-multimodal-tasks

Official Git repository for "Hakimov, S., and Schlangen, D., (2023). Images in Language Space: Exploring the Suitability of Large Language Models for Vision & Language Tasks. Findings of the Association for Computational Linguistics (ACL 2023 Findings)"

Language: Python - Size: 17.7 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 3 - Forks: 1
