cross-modal-pretraining | Topic | Ecosyste.ms: Repos

Topic: "cross-modal-pretraining"

DAMO-NLP-SG/Video-LLaMA

[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding

Language: Python - Size: 19.6 MB - Last synced at: 8 days ago - Pushed at: about 1 year ago - Stars: 3,006 - Forks: 272

JacobYuan7/RLIP

[NeurIPS 2022 Spotlight] RLIP: Relational Language-Image Pre-training and a series of other methods to solve HOI detection and Scene Graph Generation.

Language: Python - Size: 15.4 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 67 - Forks: 3

