An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: visual-understanding

HySonLab/Design2Code

A large language model combined with a large vision model for generating code from a design sketch.

Language: Python - Size: 270 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 3 - Forks: 0

bcmi/Causal-VidQA

[CVPR 2022] A large-scale public benchmark dataset for video question-answering, with a focus on evidence and commonsense reasoning. Includes the code used in our paper "From Representation to Reasoning: Towards both Evidence and Commonsense Reasoning for Video Question-Answering".

Language: Python - Size: 20.6 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 57 - Forks: 4

AInnovateLab/ViRED

ViRED: Prediction of Visual Relations in Engineering Drawings

Size: 2.68 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 1 - Forks: 0

jaleedkhan/neusire

NeuSyRE: A Neuro-Symbolic Visual Understanding and Reasoning Framework based on Scene Graph Enrichment

Language: Jupyter Notebook - Size: 46.6 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 7 - Forks: 3