awesome-scene-understanding
😎 A list of awesome scene understanding papers.
https://github.com/bertjiazheng/awesome-scene-understanding
-
Primitive
-
Line Segment and Wireframe
- Volumetric Wireframe Parsing from Neural Attraction Fields
- NEF: Neural Edge Fields for 3D Parametric Curve Reconstruction from Multi-view Images
- DeepLSD: Line Segment Detection and Refinement with Deep Image Gradients
- Holistically-Attracted Wireframe Parsing: From Supervised to Self-Supervised Learning
- Learning to Construct 3D Building Wireframes from 3D Line Clouds
- HoW-3D: Holistic 3D Wireframe Perception from a Single Image - M/HoW-3D) |
- Semantic Room Wireframe Detection from a Single View - Net) |
- Towards Real-time and Light-weight Line Segment Detection
- Hole-robust Wireframe Detection
- Fully Convolutional Line Parsing - Xili/F-Clip) |
- ELSD: Efficient Line Segment Detector and Descriptor
- SOLD<sup>2</sup>: Self-supervised Occlusion-aware Line Description and Detection
- Line Segment Detection Using Transformers without Edges - ucsd/LETR/) |
- PlueckerNet: Learn to Register 3D Line Reconstructions
- LGNN: A Context-aware Line Segment Detector
- TP-LSD: Tri-Points Based Line Segment Detector - LSD) |
- Deep Hough-Transform Line Priors - Hough-Transform-Line-Priors) |
- Deep Hough Transform for Semantic Line Detection - hough-transform) |
- Holistically-Attracted Wireframe Parsing
- Learning to Reconstruct 3D Manhattan Wireframes from a Single Image
- End-to-End Wireframe Parsing
- PPGNet: Learning Point-Pair Graph for Line Segment Detection - lab/PPGNet) |
- Learning Attraction Field Representation for Robust Line Segment Detection
- Novel Single View Constraints for Manhattan 3D Line Reconstruction
- Learning to Parse Wireframes in Images of Man-Made Environments
- A Novel Linelet-Based Representation for Line Segment Detection
- MCMLSD: A Dynamic Programming Approach to Line Segment Detection
- Lifting 3D Manhattan Lines from a Single Image
- LSD: A Fast Line Segment Detector with a False Detection Control
-
Junction
-
Outdoor Architecture
- HEAT: Holistic Edge Attention Transformer for Structured Reconstruction - structured-reconstruction.github.io/) |
- Structured Outdoor Architecture Reconstruction by Exploration and Classification
- Roof-GAN: Learning to Generate Roof Geometry and Relations for Residential Houses - ming-qian/roofgan) |
- Vectorizing World Buildings: Planar Graph Reconstruction by Primitive Detection and Relationship Inference
- Conv-MPN: Convolutional Message Passing Neural Network for Structured Outdoor Architecture Reconstruction
-
Plane
- PlaneRecTR: Unified Query learning for 3D Plane Recovery from a Single View
- NOPE-SAC: Neural One-Plane RANSAC for Sparse-View Planar 3D Reconstruction
- PlaneFormers: From Sparse View Planes to 3D Reconstruction
- PlanarRecon: Real-time 3D Plane Detection and Reconstruction from Posed Monocular Videos - vi.github.io/planarrecon/) |
- PlaneRecNet: Multi-Task Learning with Cross-Task Consistency for Piece-Wise Plane Detection and Reconstruction from a Single RGB Image
- PlaneTR: Structure-Guided Transformers for 3D Plane Recovery
- Planar Surface Reconstruction From Sparse Views
- Indoor Panorama Planar 3D Reconstruction via Divide and Conquer
- Learning Pairwise Inter-Plane Relations for Piecewise Planar Reconstruction - ming-qian/interplane) |
- Peek-a-Boo: Occlusion Reasoning in Indoor Scenes with Plane Representations - labs.com/~mas/peekaboo/) |
- Single-Image Piece-wise Planar 3D Reconstruction via Associative Embedding - lab/PlanarReconstruction) |
- PlaneRCNN: 3D Plane Detection and Reconstruction from a Single Image - 06_PlaneRCNN) [[code]](https://github.com/NVlabs/planercnn) |
- Recovering 3D Planes from a Single Image via Convolutional Neural Networks
- PlaneNet: Piece-wise Planar Reconstruction from a Single RGB Image - programmer/PlaneNet) |
- AirPlanes: Accurate Plane Estimation via 3D-Consistent Embeddings
- UniPlane: Unified Plane Detection and Reconstruction from Posed Monocular Videos
-
-
Workshops and Tutorials
-
Survey
- Advances in Data-Driven Analysis and Synthesis of 3D Indoor Scenes
- State-of-the-art in Automatic 3D Reconstruction of Structured Indoor Environments - bin/bib-page.cgi?id=%27Pintore:2020:SI3%27) |
- Indoor Scene Understanding in 2.5/3D for Autonomous Agents: A Survey
- RGBD Datasets: Past, Present and Future
- Neural Fields in Robotics: A Survey
-
Dataset
-
Realistic Dataset
- ScanNet++: A High-Fidelity Dataset of 3D Indoor Scenes
- ARKitScenes: A Diverse Real-World Dataset For 3D Indoor Scene Understanding Using Mobile RGB-D Data
- Zillow Indoor Dataset: Annotated Floor Plans With 360° Panoramas and 3D Room Layouts
- HoliCity: A City-Scale Data Platform for Learning Holistic 3D Structures
- OASIS: A Large-Scale Dataset for Single Image 3D in the Wild
- 3D Scene Graph: A Structure for Unified Semantics, 3D Space, and Camera
- The Replica Dataset: A Digital Replica of Indoor Spaces - Dataset) |
- Matterport3D: Learning from RGB-D Data in Indoor Environments
- Joint 2D-3D-Semantic Data for Indoor Scene Understanding
- ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes - net.org/) |
- SceneNN: a Scene Meshes Dataset with aNNotations - vgd.ust.hk/scenenn/home/) |
- SUN RGB-D: A RGB-D Scene Understanding Benchmark Suite
- SUN3D: A Database of Big Spaces Reconstructed using SfM and Object Labels
- Indoor Segmentation and Support Inference from RGBD Images
-
Synthetic Dataset
- GeoSynth: A Photorealistic Synthetic Indoor Dataset for Scene Understanding
- MINERVAS: Massive INterior EnviRonments VirtuAl Synthesis
- 3D-FRONT: 3D Furnished Rooms with layOuts and semaNTics - 3d-scene-dataset) |
- Hypersim: A Photorealistic Synthetic Dataset for Holistic Indoor Scene Understanding
- OpenRooms: An End-to-End Open Framework for Photorealistic Indoor Scene Datasets - openrooms.github.io/) |
- Structured3D: A Large Photo-realistic Dataset for Structured 3D Modeling - dataset.org) |
- InteriorNet: Mega-scale Multi-sensor Photo-realistic Indoor Scenes Dataset
- SceneNet RGB-D: Can 5M Synthetic Images Beat Generic ImageNet Pre-training on Indoor Segmentation? - rgbd.html) |
- Semantic Scene Completion from a Single Depth Image
- SceneNet: Understanding Real World Indoor Scenes With Synthetic Data
- The SYNTHIA Dataset: A Large Collection of Synthetic Images for Semantic Segmentation of Urban Scenes - dataset.net/) |
- R3DS: Reality-linked 3D Scenes for Panoramic Scene Understanding - hcvc.github.io/r3ds/) |
- FurniScene: A Large-scale 3D Room Dataset with Intricate Furnishing Scenes
- Infinigen Indoors: Photorealistic Indoor Scenes using Procedural Generation
-
-
Holistic Scene Understanding
-
Perspective Image
- Single-view 3D Scene Reconstruction with High-fidelity Shape and Texture - jack.github.io/SSR) |
- Towards High-Fidelity Single-view Holistic Reconstruction of Indoor Scenes
- Holistic 3D Scene Understanding from a Single Image with Implicit Representation
- Total3DUnderstanding: Joint Layout, Object Pose and Mesh Reconstruction for Indoor Scenes from a Single Image
- PerspectiveNet: 3D Object Detection from a Single RGB Image via Perspective Points
- Holistic++ Scene Understanding: Single-view 3D Holistic Scene Parsing and Human Pose Estimation with Human-Object Interaction and Physical Commonsense
- Complete 3D Scene Parsing from an RGBD Image
- Cooperative Holistic Scene Understanding: Unifying 3D Object, Layout, and Camera Pose Estimation
- Holistic 3D Scene Parsing and Reconstruction from a Single RGB Image
- Factoring Shape, Pose, and Layout from the 2D Image of a 3D Scene
- Im2CAD
- DeepContext: Context-Encoding Neural Pathways for 3D Holistic Scene Understanding
- Emptying, Refurnishing, and Relighting Indoor Spaces
- Scene Parsing by Integrating Function, Geometry and Appearance Models
- Understanding Indoor Scenes using 3D Geometric Phrases
- Recovering Free Space of Indoor Scenes from a Single Image
- Efficient Exact Inference for 3D Indoor Scene Understanding
- Efficient Structured Prediction for 3D Indoor Scene Understanding
- Estimating Spatial Layout of Rooms using Volumetric Reasoning about Objects and Surfaces
- Thinking Inside the Box: Using Appearance Models and Context Based on Room Geometry
-
Panoramic Image
- PanoContext-Former: Panoramic Total Scene Understanding with a Transformer
- PanelNet: Understanding 360 Indoor Environment via Panel Representation
- DeepPanoContext: Panoramic 3D Scene Understanding with Holistic Scene Context Graph and Relation-based Optimization
- HoHoNet: 360 Indoor Holistic Understanding with Latent Horizontal Features
- Automatic 3D Indoor Scene Modeling from Single Panorama
- Pano2CAD: Room Layout From A Single Panorama Image
- PanoContext: A Whole-room 3D Context Model for Panoramic Scene Understanding
-
-
Room Layout Estimation
-
Perspective Image
- Polygon Detection for Room Layout Estimation using Heterogeneous Graphs and Wireframes - HGT) |
- ST-RoomNet: Learning Room Layout Estimation From Single Image Through Unsupervised Spatial Transformations
- Learning to Reconstruct 3D Non-Cuboid Room Layout from a Single RGB Image
- RoomStructNet: Learning to Rank Non-Cuboidal Room Layouts From Single View
- GeoLayout: Geometry Driven Room Layout Estimation Based on Depth Maps of Planes - Layout]
- Structural Deep Metric Learning for Room Layout Estimation
- General 3D Room Layout from a Single View by Render-and-Compare - lepetit/research-projects/general-3d-room-layout-from-a-single-view-by-render-and-compare/) [[ScanNet-Layout Dataset]][ScanNet-Layout] [[code]](https://github.com/vevenom/RoomLayout3D_RandC) |
- Smart Hypothesis Generation for Efficient and Robust Room Layout Estimation
- Flat2Layout: Flat Representation for Estimating Layout of General Room Types
- Thinking Outside the Box: Generation of Unconstrained 3D Room Layouts
- RoomNet: End-to-End Room Layout Estimation
- Physics Inspired Optimization on Semantic Transfer Features: An Alternative Method for Room Layout Estimation - pio/) |
- A Coarse-to-Fine Indoor Layout Estimation (CFILE) Method
- DeLay: Robust Spatial Layout Estimation for Cluttered Indoor Scenes
- Learning Informative Edge Maps for Indoor Scene Layout Prediction
- Rent3D: Floor-Plan Priors for Monocular Layout Estimation
- Box In the Box: Joint 3D Layout and Object Reasoning from Single Images
- Estimating the 3D Layout of Indoor Scenes and its Clutter from Depth Sensors
- Recovering the Spatial Layout of Cluttered Rooms
-
Panoramic Image
- Seg2Reg: Differentiable 2D Segmentation to 1D Regression Rendering for 360 Room Layout Reconstruction
- iBARLE: imBalance-Aware Room Layout Estimation
- GPR-Net: Multi-view Layout Estimation via a Geometry-aware Panorama Registration Network
- Shape-Net: Room Layout Estimation from Panoramic Images Robust to Occlusion using Knowledge Distillation with 3D Shapes as Additional Inputs
- U2RLE: Uncertainty-Guided 2-Stage Room Layout Estimation
- Disentangling Orthogonal Planes for Indoor Panoramic Room Layout Estimation with Cross-Scale Distortion Awareness - bjtu/DOPNet)] |
- 360-MLC: Multi-view Layout Consistency for Self-training and Hyper-parameter Tuning - mlc/) |
- 3D Room Layout Estimation from a Cubemap of Panorama Image via Deep Manhattan Hough Transform - Net) |
- 3D Room Layout Recovery Generalizing across Manhattan and Non-Manhattan Worlds
- PSMNet: Position-aware Stereo Merging Network for Room Layout Estimation - layout) |
- LGT-Net: Indoor Panoramic Room Layout Estimation with Geometry-Aware Transformer Network
- Deep3DLayout: 3D Reconstruction of an Indoor Layout from a Spherical Panoramic Image - bin/bib-page.cgi?id=%27Pintore:2021:D3R%27) |
- Transferable End-to-end Room Layout Estimation via Implicit Encoding
- OmniLayout: Room Layout Reconstruction from Indoor Spherical Panoramas
- SSLayout360: Semi-Supervised Indoor Layout Estimation from 360 Panorama
- Single-Shot Cuboids: Geodesics-based End-to-end Manhattan Aligned Layout Estimation from Spherical Panoramas
- Manhattan Room Layout Reconstruction from a Single 360 image: A Comparative Study of State-of-the-art Methods
- Training and Post Processing 3D Room Layout Beyond the Manhattan World Assumption
- Joint 3D Layout and Depth Prediction from a Single Indoor Panorama Image
- AtlantaNet: Inferring the 3D Indoor Layout from a Single 360 Image Beyond the Manhattan World Assumption - bin/bib-page.cgi?id=%27Pintore:2020:AI3%27) [[code]](https://github.com/crs4/AtlantaNet)
- Corners for Layout: End-to-End Layout Recovery from 360 Images - End-to-End-Layout-Recovery-from-360-Images)
- DuLa-Net: A Dual-Projection Network for Estimating Room Layouts from a Single RGB Panorama
- HorizonNet: Learning Room Layout with 1D Representation and Pano Stretch Data Augmentation
- Layouts from Panoramic Images with Geometry and Deep Learning - and-Vanishing-Points-directly-on-Panoramas)
- LayoutNet: Reconstructing the 3D Room Layout from a Single RGB Image
- Efficient 3D Room Shape Recovery From a Single Panorama - H/Panoramix)
- No More Ambiguity in 360° Room Layout via Bi-Layout Estimation
- Self-supervised 360° Room Layout Estimation - 360-Layout)
- LED<sup>2</sup>-Net: Monocular 360° Layout Estimation via Differentiable Depth Rendering - Net)
-
-
Floorplan
-
Panoramic Image
- Connecting the Dots: Floorplan Reconstruction Using Two-Level Queries
- Floorplan Restoration by Structure Hallucinating Transformer Cascades
- MVLayoutNet: 3D Layout Reconstruction with Multi-View Panoramas
- Extreme Structure From Motion for Indoor Panoramas Without Visual Overlaps - indoor-sfm) |
- MonteFloor: Extending MCTS for Reconstructing Accurate Large-Scale Floor Plans
- Scan2Plan: Efficient Floorplan Generation from 3D Scans of Indoor Scenes
- Floor-SP: Inverse CAD for Floorplans by Sequential Room-wise Shortest Path - sp/) [[code]](https://github.com/woodfrog/floor-sp) |
- Floorplan-Jigsaw: Jointly Estimating Scene Layout and Aligning Partial Scans - li.github.io/projects/indoorRecons/floorplanJigsaw.html) |
- DeepPerimeter: Indoor Boundary Estimation from Posed Monocular Sequences
- FloorNet: A unified framework for floorplan reconstruction from 3D scans - programmer.github.io/floornet.html) [[code]](https://github.com/art-programmer/FloorNet) |
- PolyRoom: Room-aware Transformer for Floorplan Reconstruction - casia/PolyRoom/) |
- PolyDiffuse: Polygonal Shape Reconstruction via Guided Set Diffusion Models - diffuse.github.io/) |
- FRI-Net: Floorplan Reconstruction via Room-wise Implicit Representation - 1227/FRI-Net) |
-
Floorplan Vectorization
- VectorFloorSeg: Two-Stream Graph Attention Network for Vectorized Roughcast Floorplan Segmentation
- Parsing Line Segments of Floor Plan Images Using Graph Neural Networks
- Residential floor plan recognition and reconstruction
- Versailles-FP dataset: Wall Detection in Ancient Floor Plans
- Deep Floor Plan Recognition using a Multi-task Network with Room-boundary-Guided Attention
- CubiCasa5K: A Dataset and an Improved Multi-Task Model for Floorplan Image Analysis
- Raster-to-Vector: Revisiting Floorplan Transformation - programmer.github.io/floorplan-transformation.html) [[code]](https://github.com/art-programmer/FloorplanTransformation) |
-
Visual Localization
- LaLaLoc++: Global Floor Plan Comprehension for Layout Localisation in Unvisited Environments
- LASER: LAtent SpacE Rendering for 2D Visual Localization
- LaLaLoc: Latent Layout Localisation in Dynamic, Unvisited Environments
- SPVLoc: Semantic Panoramic Viewport Matching for 6D Camera Localization in Unseen Environments
-
-
Vanishing Point
-
Plane
- Vanishing Point Estimation in Uncalibrated Images with Prior Gravity Direction - Estimation-with-Prior-Gravity) |
- Transformer Based Line Segment Classifier with Image Context for Real-Time Vanishing Point Detection in Manhattan World
- Deep Vanishing Point Detection: Geometric Priors Make Dataset Variations Vanish
- VaPiD: A Rapid Vanishing Point Detector via Learned Optimizers
- NeurVPS: Neural Vanishing Point Scanning via Conic Convolution
-
-