GitHub topics: visual-understanding
HySonLab/Design2Code
A Large Language Model combined with a Large Vision Model for generating code from a design sketch.
Language: Python - Size: 270 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 3 - Forks: 0

bcmi/Causal-VidQA
[CVPR 2022] A large-scale public benchmark dataset for video question-answering, with a focus on evidence and commonsense reasoning. Includes the code used in our paper "From Representation to Reasoning: Towards both Evidence and Commonsense Reasoning for Video Question-Answering", CVPR 2022.
Language: Python - Size: 20.6 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 57 - Forks: 4

AInnovateLab/ViRED
ViRED: Prediction of Visual Relations in Engineering Drawings
Size: 2.68 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 1 - Forks: 0

jaleedkhan/neusire
NeuSyRE: A Neuro-Symbolic Visual Understanding and Reasoning Framework based on Scene Graph Enrichment
Language: Jupyter Notebook - Size: 46.6 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 7 - Forks: 3
