An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: scene-understanding

astra-vision/FAMix

[CVPR 2024] Domain generalization by interpolating original feature styles with styles obtained using random descriptions in natural language

Language: Python - Size: 54.3 MB - Last synced at: about 4 hours ago - Pushed at: about 5 hours ago - Stars: 51 - Forks: 3

runjtu/vpr-arxiv-daily Fork of Vincentqyw/cv-arxiv-daily

Automatically update Visual Place Recognition papers daily using GitHub Actions (updates every 12 hours)

Language: Python - Size: 24.7 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 1 - Forks: 0

isLinXu/paper-list

Auto-updated paper list

Language: Python - Size: 104 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 79 - Forks: 9

phi-wol/sparc

Official codebase of SpaRC

Language: TypeScript - Size: 20.7 MB - Last synced at: 1 day ago - Pushed at: 2 days ago - Stars: 1 - Forks: 0

Davidyao99/uni4d

[CVPR 2025] Uni4D: Unifying Visual Foundation Models for 4D Modeling from a Single Video

Language: Python - Size: 520 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 63 - Forks: 4

vevenom/pytorchgeonodes

PyTorchGeoNodes is a differentiable module for interpretable shape programs. It can be used to translate Blender programs directly to PyTorch code. Example applications include 3D object reconstruction.

Language: Python - Size: 83.6 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 27 - Forks: 1

zchoi/Awesome-Embodied-Robotics-and-Agent

This is a curated list of "Embodied AI or robot with Large Language Models" research. Watch this repository for the latest updates! 🔥

Size: 1.53 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 1,283 - Forks: 74

visinf/cups

Scene-Centric Unsupervised Panoptic Segmentation (CVPR 2025)

Language: Python - Size: 27 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 32 - Forks: 3

phi-wol/hydra

Official codebase of HyDRa.

Size: 24.4 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 43 - Forks: 2

fraunhoferhhi/spvloc

[ECCV 2024 Oral] SPVLoc: Semantic Panoramic Viewport Matching for 6D Camera Localization in Unseen Environments

Language: Python - Size: 2.99 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 31 - Forks: 2

NVlabs/FB-BEV

Official PyTorch implementation of FB-BEV & FB-OCC - Forward-backward view transformation for vision-centric autonomous driving perception

Language: Python - Size: 11.7 MB - Last synced at: 7 days ago - Pushed at: about 1 month ago - Stars: 721 - Forks: 55

Yangzhangcst/RGBD-semantic-segmentation

A paper list of RGBD semantic segmentation (processing)

Size: 338 KB - Last synced at: 10 days ago - Pushed at: over 1 year ago - Stars: 398 - Forks: 35

bertjiazheng/awesome-scene-understanding

😎 A list of awesome scene understanding papers.

Size: 285 KB - Last synced at: 9 days ago - Pushed at: about 2 months ago - Stars: 753 - Forks: 97

perseus784/Vehicle_Collision_Prediction_Using_CNN-LSTMs

Predict vehicle collisions moments before they happen in Carla! A hybrid CNN and LSTM architecture is used to understand a series of images.

Language: Python - Size: 55.1 MB - Last synced at: 8 days ago - Pushed at: about 1 year ago - Stars: 142 - Forks: 29

bertjiazheng/Structured3D

[ECCV'20] Structured3D: A Large Photo-realistic Dataset for Structured 3D Modeling

Language: Python - Size: 12.9 MB - Last synced at: 13 days ago - Pushed at: about 2 months ago - Stars: 570 - Forks: 66

SimonVandenhende/Multi-Task-Learning-PyTorch

PyTorch implementation of multi-task learning architectures, incl. MTI-Net (ECCV2020).

Language: Python - Size: 75.7 MB - Last synced at: 16 days ago - Pushed at: over 3 years ago - Stars: 802 - Forks: 111

stalhabukhari/comp-sdf-dyn-nav

Code for the ICRA'25 paper: "Differentiable Composite Neural Signed Distance Fields for Robot Navigation in Dynamic Indoor Environments"

Language: Python - Size: 62.5 KB - Last synced at: 18 days ago - Pushed at: 18 days ago - Stars: 2 - Forks: 0

srama2512/PONI

PONI: Potential Functions for ObjectGoal Navigation with Interaction-free Learning. CVPR 2022 (Oral).

Language: Python - Size: 9.56 MB - Last synced at: 11 days ago - Pushed at: over 2 years ago - Stars: 95 - Forks: 14

vinthony/ghost-free-shadow-removal

[AAAI 2020] Towards Ghost-free Shadow Removal via Dual Hierarchical Aggregation Network and Shadow Matting GAN

Language: Jupyter Notebook - Size: 3.8 MB - Last synced at: 14 days ago - Pushed at: over 1 year ago - Stars: 310 - Forks: 60

zubair-irshad/shapo

Pytorch code for ECCV'22 paper. ShAPO: Implicit Representations for Multi-Object Shape, Appearance and Pose Optimization

Language: Python - Size: 36.8 MB - Last synced at: 20 days ago - Pushed at: 10 months ago - Stars: 188 - Forks: 11

xiaoyufenfei/Efficient-Segmentation-Networks

Lightweight models for real-time semantic segmentation on PyTorch (includes SQNet, LinkNet, SegNet, UNet, ENet, ERFNet, EDANet, ESPNet, ESPNetv2, LEDNet, ESNet, FSSNet, CGNet, DABNet, Fast-SCNN, ContextNet, FPENet, etc.)

Language: Python - Size: 1.12 MB - Last synced at: 23 days ago - Pushed at: 9 months ago - Stars: 946 - Forks: 163

manycore-research/SpatialLM

SpatialLM: Large Language Model for Spatial Understanding

Language: Python - Size: 6.22 MB - Last synced at: 27 days ago - Pushed at: about 1 month ago - Stars: 2,182 - Forks: 145

nidhiyashwanth/SpatialLM

Trying out SpatialLM (SpatialLM: Large Language Model for Spatial Understanding). Impressed with results 💖

Language: Jupyter Notebook - Size: 41 KB - Last synced at: 17 days ago - Pushed at: 28 days ago - Stars: 0 - Forks: 0

hisanusman/Violent-activities-detection-and-scene-understanding

This project will use a multi-modal approach to detect violent activities in CCTV footage, along with scene understanding through CLIP and automatic AI report generation via LangChain (OpenAI & FLAN-T5). Criminal detection and recognition will also be implemented to identify criminals in CCTV feeds and alert authorities.

Language: Jupyter Notebook - Size: 1.01 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

ZhaoJ9014/Multi-Human-Parsing

🔥🔥Official Repository for Multi-Human-Parsing (MHP)🔥🔥

Language: JavaScript - Size: 30.9 MB - Last synced at: 9 days ago - Pushed at: about 1 month ago - Stars: 672 - Forks: 103

moatifbutt/r2s100k

We introduce R2S100K, a large-scale dataset and benchmark for training and evaluation of road segmentation in challenging unstructured roadways.

Language: Jupyter Notebook - Size: 11.1 MB - Last synced at: 21 days ago - Pushed at: 8 months ago - Stars: 7 - Forks: 0

HrishavBakulBarua/Social-Robots-F-formation

ACM THRI: Enabling Social Robots to Perceive and Join Socially Interacting Groups using F-formation: A Comprehensive Overview

Size: 13.9 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 1 - Forks: 0

kkaiwwana/MVPbev

[ACM MM24 Poster] Official implementation of paper "MVPbev: Multi-view Perspective Image Generation from BEV with Test-time Controllability and Generalizability"

Language: Python - Size: 13.4 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 17 - Forks: 3

scale-lab/MTLoRA

The official implementation for MTLoRA: A Low-Rank Adaptation Approach for Efficient Multi-Task Learning (CVPR '24)

Language: Python - Size: 137 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 41 - Forks: 2

bowen-upenn/Multi-Agent-VQA

[CVPR 2024 CVinW] Multi-Agent VQA: Exploring Multi-Agent Foundation Models on Zero-Shot Visual Question Answering

Language: Python - Size: 10.6 MB - Last synced at: 1 day ago - Pushed at: 7 months ago - Stars: 11 - Forks: 0

Gatedip/GDIP-Yolo

Gated Differentiable Image Processing (GDIP) for Object Detection in Adverse Conditions | Accepted at ICRA 2023

Language: Python - Size: 6.84 MB - Last synced at: about 1 month ago - Pushed at: over 2 years ago - Stars: 55 - Forks: 5

astra-vision/PODA

[ICCV 2023] Official implementation of "PØDA: Prompt-driven Zero-shot Domain Adaptation"

Language: Python - Size: 112 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 112 - Forks: 12

yanx27/CLEVR3D

CLEVR3D Dataset: Comprehensive Visual Question Answering on Point Clouds through Compositional Scene Manipulation

Language: Python - Size: 5.22 MB - Last synced at: 15 days ago - Pushed at: about 1 year ago - Stars: 17 - Forks: 1

jhcho99/CoFormer

[CVPR'22] Official PyTorch Implementation of "Collaborative Transformers for Grounded Situation Recognition"

Language: Python - Size: 12.9 MB - Last synced at: 29 days ago - Pushed at: about 2 years ago - Stars: 48 - Forks: 7

yinyunie/3D-Shape-Analysis-Paper-List

A list of recent papers, libraries and datasets about 3D shape/scene analysis (by topics, updating).

Language: Python - Size: 1.04 MB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 951 - Forks: 114

vLAR-group/OGC

🔥OGC in PyTorch (NeurIPS 2022 & TPAMI 2024)

Language: Python - Size: 106 MB - Last synced at: about 1 month ago - Pushed at: about 1 year ago - Stars: 105 - Forks: 8

ARResearch-1/DiverseAR-Dataset

Advancing the Understanding and Evaluation of AR-Generated Scenes: When Vision-Language Models Shine and Stumble

Size: 4.59 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 2 - Forks: 0

YunzeMan/Lexicon3D

[NeurIPS 2024] Lexicon3D: Probing Visual Foundation Models for Complex 3D Scene Understanding

Language: Python - Size: 61.2 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 69 - Forks: 4

hollydinkel/astrobeecd

[IAC 2023, AA 2024] This repository contains the code used in our paper, "AstrobeeCD: Change Detection in Microgravity with Free-Flying Robots." This method is useful for detecting 3D scene changes given a 3D model, a sequence of images, and a sequence of camera poses.

Language: Python - Size: 7.89 MB - Last synced at: 9 days ago - Pushed at: 3 months ago - Stars: 2 - Forks: 0

CVRP-SOLE/SOLE

[ICLR 2025] Official code of "Segment any 3D Object with Language"

Language: Python - Size: 178 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 39 - Forks: 0

mbhurtel/sceneUnderstanding

Efficient and Non-redundant Objects Allocation Using Object-Aware and Depth-Aware Clustering for Battlefield Scenario

Language: Python - Size: 66.6 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

Open3DA/LL3DA

[CVPR 2024] "LL3DA: Visual Interactive Instruction Tuning for Omni-3D Understanding, Reasoning, and Planning"; an interactive Large Language 3D Assistant.

Language: Python - Size: 72.7 MB - Last synced at: 3 months ago - Pushed at: 9 months ago - Stars: 258 - Forks: 10

OpenRobotLab/Grounded_3D-LLM

Code&Data for Grounded 3D-LLM with Referent Tokens

Language: Python - Size: 7.47 MB - Last synced at: 3 months ago - Pushed at: 4 months ago - Stars: 97 - Forks: 2

dilmerv/MetaAdvancedFeatures

A few demos of advanced Meta features such as scene understanding and shared anchors

Language: C++ - Size: 649 MB - Last synced at: 18 days ago - Pushed at: over 1 year ago - Stars: 39 - Forks: 8

aipixel/MVC-PSU

The official implementation of "Multi-view Consistent 3D Panoptic Scene Understanding". (Liu et al., AAAI 2025)

Size: 424 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

Gen-XR/TheiaEngine

All-in-one API to serve all vision AI tasks

Language: Python - Size: 11.4 MB - Last synced at: about 1 month ago - Pushed at: 4 months ago - Stars: 21 - Forks: 3

tii-racing/drone-racing-dataset

A fully-annotated, open-design dataset of autonomous and piloted high-speed flight

Language: Python - Size: 2.22 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 51 - Forks: 11

ivanDonadello/Visual-Relationship-Detection-LTN

This repository contains the dataset and the source code for the detection of visual relationships with the Logic Tensor Networks framework.

Language: Python - Size: 61.5 KB - Last synced at: 3 months ago - Pushed at: over 2 years ago - Stars: 27 - Forks: 6

jhcho99/GSRTR

[BMVC'21] Official PyTorch Implementation of "Grounded Situation Recognition with Transformers"

Language: Python - Size: 12.9 MB - Last synced at: 29 days ago - Pushed at: about 3 years ago - Stars: 26 - Forks: 11

angeladai/3DMV

[ECCV'18] 3DMV: Joint 3D-Multi-View Prediction for 3D Semantic Scene Segmentation

Language: Python - Size: 771 KB - Last synced at: 7 days ago - Pushed at: about 3 years ago - Stars: 210 - Forks: 40

ChirikjianLab/Marching-Primitives

[CVPR2023 Highlight] Marching-Primitives: Shape Abstraction from Signed Distance Function

Language: MATLAB - Size: 33.9 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 119 - Forks: 6

basilevh/gcd

Generative Camera Dolly: Extreme Monocular Dynamic Novel View Synthesis (ECCV 2024 Oral) - Official Implementation

Language: Python - Size: 22.9 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 176 - Forks: 3

bowen-upenn/scene_graph_commonsense

[WACV 2025] This is the official implementation of the paper "Enhancing Scene Graph Generation with Hierarchical Relationships and Commonsense Knowledge" in PyTorch.

Language: Python - Size: 72.9 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 18 - Forks: 2

TSTMotion/TSTMotion.github.io Fork of nerfies/nerfies.github.io

Official Page of paper "TSTMotion: Training-free Scene-aware Text-to-motion Generation"

Language: JavaScript - Size: 145 MB - Last synced at: 3 months ago - Pushed at: 6 months ago - Stars: 2 - Forks: 0

moatifbutt/CARL-dataset

CARL-D comprises large-scale stereo vision-based driving videos captured in more than 100 cities of Pakistan, including motorways and dense, unpatterned traffic scenarios in urban, rural, and hilly areas.

Language: Python - Size: 36.1 KB - Last synced at: 17 days ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

moatifbutt/Drivable-Road-Region-Detection-and-Steering-Angle-Estimation-Method

A practical implementation of pixel-level, segmentation-based road detection and steering angle estimation methods.

Language: Python - Size: 101 KB - Last synced at: 16 days ago - Pushed at: almost 3 years ago - Stars: 16 - Forks: 3

Jingkang50/OpenPSG

Benchmarking Panoptic Scene Graph Generation (PSG), ECCV'22

Language: Python - Size: 6.22 MB - Last synced at: 6 months ago - Pushed at: about 2 years ago - Stars: 414 - Forks: 68

SNU-VGILab/InstaOrder

Official repository for the paper "Instance-Wise Holistic Order Prediction in Natural Scenes".

Language: Python - Size: 16.1 MB - Last synced at: 6 days ago - Pushed at: over 1 year ago - Stars: 17 - Forks: 1

lyqun/FPConv

CVPR 2020, "FPConv: Learning Local Flattening for Point Convolution"

Language: Python - Size: 1.53 MB - Last synced at: about 1 month ago - Pushed at: over 3 years ago - Stars: 131 - Forks: 16

jena-shreyas/Efficient-VidQA

Part of my work for my Bachelor's Thesis Project on Counterfactual Reasoning for Videos.

Language: Python - Size: 11.5 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

ShunChengWu/3DSSG

Language: Python - Size: 1.67 MB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 127 - Forks: 19

Suraj6013/depth-estimation-model

This depth estimation model generates a depth map and a downloadable text file containing depth values for a given input image

Language: Jupyter Notebook - Size: 54.5 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 1 - Forks: 0

prismformore/Multi-Task-Transformer

Code of ICLR2023 paper "TaskPrompter: Spatial-Channel Multi-Task Prompting for Dense Scene Understanding" and ECCV2022 paper "Inverted Pyramid Multi-task Transformer for Dense Scene Understanding"

Language: Python - Size: 14.4 MB - Last synced at: 10 months ago - Pushed at: 12 months ago - Stars: 283 - Forks: 21

ymxlzgy/echoscene

EchoScene: Indoor Scene Generation via Information Echo over Scene Graph Diffusion.

Language: Python - Size: 5.92 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 15 - Forks: 0

tb2-sy/TSP-Transformer

[WACV 2024] TSP-Transformer: Task-Specific Prompts Boosted Transformer for Holistic Scene Understanding

Language: Python - Size: 84 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 6 - Forks: 0

oskarnatan/RGBDVS-fusion

Implementation code for: Semantic Segmentation and Depth Estimation with RGB and DVS Sensor Fusion for Multi-view Driving Perception, Proc. Asian Conf. Pattern Recognition (ACPR), 2021.

Language: Python - Size: 32.2 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 8 - Forks: 1

oskarnatan/compact-perception

Implementation code for: Towards Compact Autonomous Driving Perception with Balanced Learning and Multi-sensor Fusion, IEEE Trans. Intelligent Transportation Systems, 2022.

Language: Python - Size: 32.2 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 8 - Forks: 2

srama2512/EPC-SSL

Environment Predictive Coding for Visual Navigation. ICLR 2022.

Language: Python - Size: 3.23 MB - Last synced at: 11 days ago - Pushed at: over 2 years ago - Stars: 15 - Forks: 1

LilyDaytoy/OpenPVSG

Benchmarking Panoptic Video Scene Graph Generation (PVSG), CVPR'23

Language: Jupyter Notebook - Size: 3.92 MB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 53 - Forks: 3

JasonQSY/Articulation3D

[CVPR 2022] Understanding 3D Object Articulation in Internet Videos

Language: Python - Size: 1.36 MB - Last synced at: 12 months ago - Pushed at: about 1 year ago - Stars: 25 - Forks: 4

Nikronic/ObjectNet

PyTorch implementation of "Pyramid Scene Parsing Network".

Language: Python - Size: 2.05 MB - Last synced at: 12 months ago - Pushed at: over 3 years ago - Stars: 16 - Forks: 1

GAP-LAB-CUHK-SZ/RfDNet

Implementation of CVPR'21: RfD-Net: Point Scene Understanding by Semantic Instance Reconstruction

Language: Python - Size: 26.9 MB - Last synced at: 12 months ago - Pushed at: about 1 year ago - Stars: 197 - Forks: 29

GAP-LAB-CUHK-SZ/Total3DUnderstanding

Implementation of CVPR'20 Oral: Total3DUnderstanding: Joint Layout, Object Pose and Mesh Reconstruction for Indoor Scenes from a Single Image

Language: Python - Size: 4.21 MB - Last synced at: 12 months ago - Pushed at: about 1 year ago - Stars: 403 - Forks: 50

tangjiapeng/DiffuScene

[CVPR 2024] DiffuScene: Denoising Diffusion Models for Generative Indoor Scene Synthesis

Language: Python - Size: 1.48 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 141 - Forks: 9

ajzhai/NeRF2Physics

[CVPR 2024] Physical Property Understanding from Language-Embedded Feature Fields

Language: Python - Size: 27 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

ViLab-UCSD/OpenRooms

This is the dataset and code release of the OpenRooms Dataset. For more information, please refer to our webpage below. Thanks a lot for your interest in our research!

Size: 164 KB - Last synced at: 12 months ago - Pushed at: about 1 year ago - Stars: 121 - Forks: 8

yinyunie/Pose2Room

Implementation of ECCV'2022: Pose2Room: Understanding 3D Scenes from Human Activities

Language: Python - Size: 40.2 MB - Last synced at: 12 months ago - Pushed at: over 1 year ago - Stars: 80 - Forks: 4

coohom/MINERVAS

[CGF-PG 2022] MINERVAS: Massive INterior EnviRonments VirtuAl Synthesis

Language: CSS - Size: 61.8 MB - Last synced at: 9 months ago - Pushed at: over 1 year ago - Stars: 8 - Forks: 1

jayant1211/A-Multi-Modal-Approach-to-Improve-Scene-Context

This GitHub repository focuses on an integrated approach to scene classification and image caption generation, aiming to improve the accuracy of scene evaluation in computer vision applications.

Language: Jupyter Notebook - Size: 10.3 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

addtt/attend-infer-repeat-pytorch

Attend Infer Repeat (AIR) in PyTorch

Language: Python - Size: 19 MB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 20 - Forks: 3

lorjul/haystack

[SG2RL@ICCV 2023] Code for our paper "Haystack: A Panoptic Scene Graph Dataset to Evaluate Rare Predicate Classes"

Language: Python - Size: 302 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 0

yrcong/Learning_Similarity_between_Graphs_Images

PyTorch implementation of "Learning Similarity between Scene Graphs and Images with Transformers" (GICON)

Language: Python - Size: 65.4 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 5 - Forks: 0

ShunChengWu/SceneGraphFusion

Language: C++ - Size: 710 KB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 131 - Forks: 23

Sbrunoberenguel/FreDSNet

Code to test FreDSNet: Frequential Depth estimation and Semantic segmentation Network

Language: Python - Size: 55.5 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 23 - Forks: 4

apgeorg/Semantic-Segmentation

Semantic Segmentation

Language: Python - Size: 41.7 MB - Last synced at: over 1 year ago - Pushed at: about 7 years ago - Stars: 1 - Forks: 0

waterljwant/SSC

Semantic Scene Completion

Language: Python - Size: 33.2 MB - Last synced at: over 1 year ago - Pushed at: almost 5 years ago - Stars: 85 - Forks: 15

ghattab/user-study

Investigating the utility of VR for spatial understanding in surgical planning: evaluation of head-mounted to desktop display

Language: TeX - Size: 279 MB - Last synced at: over 1 year ago - Pushed at: almost 4 years ago - Stars: 0 - Forks: 0

usercontext/SceneTask

Towards Task Understanding in Visual Settings

Language: Jupyter Notebook - Size: 27.1 MB - Last synced at: over 1 year ago - Pushed at: about 6 years ago - Stars: 0 - Forks: 0

vevenom/RoomLayout3D_RandC

Language: Python - Size: 20.8 MB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 42 - Forks: 8

nalindas9/Scene-Understanding-for-Autonomous-Driving

Scene understanding is a vital aspect of safe and effective autonomous driving. With the increase in high-quality datasets in recent years, models have sufficient data to train on. However, the underlying models are important factors in determining the overall effect

Size: 7.81 KB - Last synced at: about 1 year ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0

CYang0515/NonCuboidRoom

Learning to Reconstruct 3D Non-Cuboid Room Layout from a Single RGB Image

Language: Python - Size: 64.9 MB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 85 - Forks: 10

xxm19/jacobinerf

[CVPR'23] JacobiNeRF: NeRF Shaping with Mutual Information Gradient

Language: Python - Size: 68.4 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 26 - Forks: 2

itailang/SCOOP

Self-Supervised Correspondence and Optimization-Based Scene Flow (CVPR 2023)

Language: Python - Size: 30.8 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 37 - Forks: 1

heng-hw/SpaCap3D

[IJCAI 2022] Spatiality-guided Transformer for 3D Dense Captioning on Point Clouds (official pytorch implementation)

Language: Python - Size: 91 MB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 15 - Forks: 6

chaoyivision/SGGpoint

[CVPR 2021] Exploiting Edge-Oriented Reasoning for 3D Point-based Scene Graph Analysis (official pytorch implementation)

Language: Jupyter Notebook - Size: 2.32 MB - Last synced at: almost 2 years ago - Pushed at: about 3 years ago - Stars: 43 - Forks: 8

jchibane/Box2Mask

Box2Mask: Weakly Supervised 3D Semantic Instance Segmentation Using Bounding Boxes

Language: Python - Size: 3.36 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 31 - Forks: 2

charlesCXK/TorchSSC

Implement some state-of-the-art methods of Semantic Scene Completion (SSC) task in PyTorch. [1] 3D Sketch-aware Semantic Scene Completion via Semi-supervised Structure Prior (CVPR 2020)

Language: Python - Size: 10 MB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 42 - Forks: 12

ASMIftekhar/VSGNet

VSGNet: Spatial Attention Network for Detecting Human Object Interactions Using Graph Convolutions.

Language: Python - Size: 5.66 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 98 - Forks: 21

parseh-ux/Scene40-Dataset

A visual scene dataset created based on the "VisualGenome" [https://visualgenome.org/] dataset.

Language: HTML - Size: 1.19 GB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 1

bertjiazheng/indoor-layout-evaluation

[ECCV Workshop'20] General Room Layout Estimation Track in Holistic 3D Vision Challenge

Language: Python - Size: 23.4 KB - Last synced at: 1 day ago - Pushed at: almost 3 years ago - Stars: 8 - Forks: 5

Related Keywords
scene-understanding (110), deep-learning (35), computer-vision (33), semantic-segmentation (19), pytorch (16), scene-graph (14), autonomous-driving (11), object-detection (9), dataset (8), scene-graph-generation (8), segmentation (7), robotics (7), autonomous-vehicles (6), llm (6), panoptic-segmentation (6), point-cloud (6), 3d (6), computer-graphics (6), 3d-reconstruction (6), 3d-vision (5), machine-learning (5), scene-recognition (5), room-layout (5), multi-task-learning (5), sensor-fusion (4), 3d-perception (4), scene-reconstruction (4), depth-estimation (4), instance-segmentation (4), action-recognition (4), image-segmentation (4), tensorflow (4), awesome (3), unsupervised-learning (3), 3d-computer-vision (3), transformer (3), cvpr2022 (3), pascal (3), nuscenes (3), 3d-object-detection (3), transfer-learning (3), multi-modal (3), cvpr (3), multimodal (3), nerf (3), diffusion-models (3), self-driving-cars (3), cvpr2024 (2), scannet (2), pytorch-implementation (2), visual-navigation (2), 3d-detection (2), fcn (2), self-driving-car (2), semantic-scene-completion (2), shape-reconstruction (2), signed-distance-functions (2), convolutional-neural-networks (2), gpt (2), research (2), generative-ai (2), house-designs (2), generative-model (2), vision (2), annotations (2), visual-relationship-detection (2), human-object-interaction (2), python (2), retrieval (2), domain-adaptation (2), zero-shot-learning (2), grounded-situation-recognition (2), large-language-models (2), foundation-models (2), cvpr2021 (2), generative-models (2), vision-and-language (2), transformers (2), scene-segmentation (2), scene-generation (2), caption-generation (2), eccv2022 (2), human-parsing (2), vision-transformer (2), cvpr2020 (2), 3d-instance-segmentation (2), spatial-intelligence (2), point-clouds (2), mllm (2), kitti-dataset (2), vgg16 (2), cityscapes (2), eccv (2), image-retrieval (2), eccv2024 (2), augmented-reality (2), 6dof-pose (2), image-generation (2), vision-language (2), localization (2)