Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/bertjiazheng/awesome-scene-understanding
😎 A list of awesome scene understanding papers.
List: awesome-scene-understanding
3d-scene awesome computer-vision deep-learning indoor-scenes scene-understanding
Last synced: 3 months ago
- Host: GitHub
- URL: https://github.com/bertjiazheng/awesome-scene-understanding
- Owner: bertjiazheng
- License: mit
- Created: 2019-03-11T08:48:17.000Z (over 5 years ago)
- Default Branch: master
- Last Pushed: 2024-04-07T08:43:14.000Z (7 months ago)
- Last Synced: 2024-05-23T06:29:45.878Z (5 months ago)
- Topics: 3d-scene, awesome, computer-vision, deep-learning, indoor-scenes, scene-understanding
- Homepage:
- Size: 493 KB
- Stars: 649
- Watchers: 47
- Forks: 89
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-lists-machine-learning - Scene understanding
- awesome-computer-vision - Awesome Scene Understanding
- ultimate-awesome - awesome-scene-understanding - 😎 A list of awesome scene understanding papers. (Other Lists / PowerShell Lists)
README
# Awesome Scene Understanding [![Awesome](https://cdn.rawgit.com/sindresorhus/awesome/d7305f38d29fed78fa85652e3a63e154dd8e8829/media/badge.svg)](https://github.com/sindresorhus/awesome)
A curated list of awesome scene understanding papers, inspired by [awesome-computer-vision](https://github.com/jbhuang0604/awesome-computer-vision).
* 📷 Multi-view images
* 🎲 Point cloud
## Related Resources
* [3D Machine Learning](https://github.com/timzhang642/3D-Machine-Learning)
* [Awesome Holistic 3D](https://github.com/holistic-3d/awesome-holistic-3d)
* [Awesome Planar Reconstruction](https://github.com/chenzhaiyu/awesome-planar-reconstruction)
* [Wireframe](https://github.com/Delay-Xili/Wireframe)
* [Line Segment](https://github.com/lh9171338/Line-Segment-Detection-Papers)
## Workshops and Tutorials
* [Holistic Structures for 3D Vision Workshop at ICCV 2021](https://holistic-3d.github.io/iccv21/)
* [Holistic Scene Structures for 3D Vision Workshop at ECCV 2020](https://holistic-3d.github.io/eccv20/)
* [Holistic 3D Reconstruction: Learning to Reconstruct Holistic 3D Structures from Sensorial Data at ICCV 2019](https://holistic-3d.github.io/iccv19/)
## Survey
| Papers | Venue | Links |
|--------|-------|-------|
| [Advances in Data-Driven Analysis and Synthesis of 3D Indoor Scenes](https://arxiv.org/abs/2304.03188) | CGF 2023 | - |
| [State-of-the-art in Automatic 3D Reconstruction of Structured Indoor Environments](http://vic.crs4.it/data/papers/eg2020-star-indoor.pdf) | CGF 2020 | [[project]](http://vic.crs4.it/vic/cgi-bin/bib-page.cgi?id=%27Pintore:2020:SI3%27) |
| [Indoor Scene Understanding in 2.5/3D for Autonomous Agents: A Survey](https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=8573760&tag=1) | IEEE Access 2019 | - |
| [RGBD Datasets: Past, Present and Future](https://arxiv.org/abs/1604.00999) | CVPR Workshop 2016 | [[project]](http://www.michaelfirman.co.uk/RGBDdatasets/) |
## Dataset
### Realistic Dataset
| Papers | Venue | Links |
|--------|-------|-------|
| [ScanNet++: A High-Fidelity Dataset of 3D Indoor Scenes](https://arxiv.org/abs/2308.11417) | ICCV 2023 | [[project]](https://kaldir.vc.in.tum.de/scannetpp/) |
| [ARKitScenes: A Diverse Real-World Dataset For 3D Indoor Scene Understanding Using Mobile RGB-D Data](https://openreview.net/forum?id=tjZjv_qh_CE) | NeurIPS 2021 Dataset Track | [[code]](https://github.com/apple/ARKitScenes) |
| [Zillow Indoor Dataset: Annotated Floor Plans With 360° Panoramas and 3D Room Layouts](https://openaccess.thecvf.com/content/CVPR2021/papers/Cruz_Zillow_Indoor_Dataset_Annotated_Floor_Plans_With_360deg_Panoramas_and_CVPR_2021_paper.pdf) | CVPR 2021 | [[code]](https://github.com/zillow/zind) |
| [HoliCity: A City-Scale Data Platform for Learning Holistic 3D Structures](https://arxiv.org/abs/2008.03286) | CoRR 2020 | [[project]](https://holicity.io/) |
| [OASIS: A Large-Scale Dataset for Single Image 3D in the Wild](https://arxiv.org/abs/2007.13215) | CVPR 2020 | [[project]](https://oasis.cs.princeton.edu/) |
| [3D Scene Graph: A Structure for Unified Semantics, 3D Space, and Camera](https://arxiv.org/abs/1910.02527) | ICCV 2019 | [[project]](https://3dscenegraph.stanford.edu/) |
| [The Replica Dataset: A Digital Replica of Indoor Spaces](https://arxiv.org/abs/1906.05797) | CoRR 2019 | [[code]](https://github.com/facebookresearch/Replica-Dataset) |
| [Matterport3D: Learning from RGB-D Data in Indoor Environments](https://arxiv.org/abs/1709.06158) | 3DV 2017 | [[project]](https://niessner.github.io/Matterport/) |
| [Joint 2D-3D-Semantic Data for Indoor Scene Understanding](https://arxiv.org/abs/1702.01105) | CoRR 2017 | [[project]](http://buildingparser.stanford.edu/dataset.html) |
| [ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes](https://arxiv.org/abs/1702.04405) | CVPR 2017 | [[project]](http://www.scan-net.org/) |
| [SceneNN: a Scene Meshes Dataset with aNNotations](http://hkust-vgd.ust.hk/scenenn/home/pdf/dataset_3dv16.pdf) | 3DV 2016 | [[project]](http://hkust-vgd.ust.hk/scenenn/home/) |
| [SUN RGB-D: A RGB-D Scene Understanding Benchmark Suite](https://rgbd.cs.princeton.edu/paper.pdf) | CVPR 2015 | [[project]](http://rgbd.cs.princeton.edu/) |
| [SUN3D: A Database of Big Spaces Reconstructed using SfM and Object Labels](http://3dvision.princeton.edu/projects/2013/SUN3D/paper.pdf) | ICCV 2013 | [[project]](http://sun3d.cs.princeton.edu/) |
| [Indoor Segmentation and Support Inference from RGBD Images](http://cs.nyu.edu/~silberman/papers/indoor_seg_support.pdf) | ECCV 2012 | [[project]](https://cs.nyu.edu/~silberman/datasets/nyu_depth_v2) |
### Synthetic Dataset
| Papers | Venue | Links |
|--------|-------|-------|
| [Infinigen Indoors: Photorealistic Indoor Scenes using Procedural Generation](https://arxiv.org/abs/2406.11824) | CVPR 2024 | [[project]](https://infinigen.org/) |
| [R3DS: Reality-linked 3D Scenes for Panoramic Scene Understanding](https://arxiv.org/abs/2403.12301) | CoRR 2024 | [[project]](https://3dlg-hcvc.github.io/r3ds/) |
| [FurniScene: A Large-scale 3D Room Dataset with Intricate Furnishing Scenes](https://arxiv.org/abs/2401.03470) | CoRR 2024 | - |
| [GeoSynth: A Photorealistic Synthetic Indoor Dataset for Scene Understanding](https://ieeexplore.ieee.org/document/10050341) | VR 2023 | [[code]](https://github.com/geomagical/GeoSynth) |
| [MINERVAS: Massive INterior EnviRonments VirtuAl Synthesis](https://arxiv.org/abs/2107.06149) | CGF 2022 | [[project]](https://coohom.github.io/MINERVAS/) |
| [3D-FRONT: 3D Furnished Rooms with layOuts and semaNTics](https://arxiv.org/abs/2011.09127) | ICCV 2021 | [[project]](https://tianchi.aliyun.com/specials/promotion/alibaba-3d-scene-dataset) |
| [Hypersim: A Photorealistic Synthetic Dataset for Holistic Indoor Scene Understanding](https://arxiv.org/abs/2011.02523) | ICCV 2021 | [[project]](https://mikeroberts3000.github.io/papers/hypersim) |
| [OpenRooms: An End-to-End Open Framework for Photorealistic Indoor Scene Datasets](https://arxiv.org/abs/2007.12868) | CVPR 2021 | [[project]](https://ucsd-openrooms.github.io/) |
| [Structured3D: A Large Photo-realistic Dataset for Structured 3D Modeling](https://arxiv.org/abs/1908.00222) | ECCV 2020 | [[project]](https://structured3d-dataset.org) |
| [InteriorNet: Mega-scale Multi-sensor Photo-realistic Indoor Scenes Dataset](https://arxiv.org/abs/1809.00716) | BMVC 2018 | [[project]](https://interiornet.org/) |
| [SceneNet RGB-D: Can 5M Synthetic Images Beat Generic ImageNet Pre-training on Indoor Segmentation?](https://arxiv.org/abs/1612.05079) | ICCV 2017 | [[project]](https://robotvault.bitbucket.io/scenenet-rgbd.html) |
| [Semantic Scene Completion from a Single Depth Image](https://arxiv.org/abs/1611.08974) | CVPR 2017 | - |
| [SceneNet: Understanding Real World Indoor Scenes With Synthetic Data](http://arxiv.org/abs/1511.07041) | CVPR 2016 | [[project]](https://robotvault.bitbucket.io/) |
| [The SYNTHIA Dataset: A Large Collection of Synthetic Images for Semantic Segmentation of Urban Scenes](https://openaccess.thecvf.com/content_cvpr_2016/html/Ros_The_SYNTHIA_Dataset_CVPR_2016_paper.html) | CVPR 2016 | [[project]](http://synthia-dataset.net/) |
## Holistic Scene Understanding
### Perspective Image
| Papers | Venue | Links |
|--------|-------|-------|
| [Single-view 3D Scene Reconstruction with High-fidelity Shape and Texture](https://arxiv.org/abs/2311.00457) | 3DV 2024 | [[project]](https://dali-jack.github.io/SSR) |
| [Towards High-Fidelity Single-view Holistic Reconstruction of Indoor Scenes](https://arxiv.org/abs/2207.08656) | ECCV 2022 | [[code]](https://github.com/UncleMEDM/InstPIFu) |
| [Holistic 3D Scene Understanding from a Single Image with Implicit Representation](https://arxiv.org/abs/2103.06422) | CVPR 2021 | [[project]](https://chengzhag.github.io/publication/im3d/) [[code]](https://github.com/chengzhag/Implicit3DUnderstanding) |
| [Total3DUnderstanding: Joint Layout, Object Pose and Mesh Reconstruction for Indoor Scenes from a Single Image](https://arxiv.org/abs/2002.12212) | CVPR 2020 | [[code]](https://github.com/yinyunie/Total3DUnderstanding) |
| [PerspectiveNet: 3D Object Detection from a Single RGB Image via Perspective Points](https://arxiv.org/abs/1912.07744) | NeurIPS 2019 | - |
| [Holistic++ Scene Understanding: Single-view 3D Holistic Scene Parsing and Human Pose Estimation with Human-Object Interaction and Physical Commonsense](https://arxiv.org/abs/1909.01507) | ICCV 2019 | [[project]](https://yixchen.github.io/holisticpp/) [[code]](https://github.com/yixchen/holistic_scene_human) |
| [Complete 3D Scene Parsing from an RGBD Image](https://arxiv.org/abs/1710.09490) | IJCV 2018 | - |
| [Cooperative Holistic Scene Understanding: Unifying 3D Object, Layout, and Camera Pose Estimation](https://arxiv.org/abs/1810.13049) | NeurIPS 2018 | [[project]](http://siyuanhuang.com/cooperative_parsing/main.html) [[code]](https://github.com/thusiyuan/cooperative_scene_parsing) |
| [Holistic 3D Scene Parsing and Reconstruction from a Single RGB Image](https://arxiv.org/abs/1808.02201) | ECCV 2018 | [[project]](http://siyuanhuang.com/holistic_parsing/main.html) [[code]](https://github.com/thusiyuan/holistic_scene_parsing) |
| [Factoring Shape, Pose, and Layout from the 2D Image of a 3D Scene](https://arxiv.org/abs/1712.01812) | CVPR 2018 | [[project]](https://shubhtuls.github.io/factored3d/) [[code]](https://github.com/shubhtuls/factored3d) |
| [Im2CAD](https://arxiv.org/abs/1608.05137) | CVPR 2017 | [[project]](https://homes.cs.washington.edu/~izadinia/im2cad.html) |
| [DeepContext: Context-Encoding Neural Pathways for 3D Holistic Scene Understanding](https://arxiv.org/abs/1603.04922) | ICCV 2017 | [[project]](http://deepcontext.cs.princeton.edu) |
| [Emptying, Refurnishing, and Relighting Indoor Spaces](https://grail.cs.washington.edu/projects/emptying/emptying.pdf) | SIGGRAPH Asia 2016 | [[project]](https://grail.cs.washington.edu/projects/emptying/) |
| [Scene Parsing by Integrating Function, Geometry and Appearance Models](http://openaccess.thecvf.com/content_cvpr_2013/papers/Zhao_Scene_Parsing_by_2013_CVPR_paper.pdf) | CVPR 2013 | - |
| [Understanding Indoor Scenes using 3D Geometric Phrases](http://openaccess.thecvf.com/content_cvpr_2013/papers/Choi_Understanding_Indoor_Scenes_2013_CVPR_paper.pdf) | CVPR 2013 | - |
| [Recovering Free Space of Indoor Scenes from a Single Image](http://vision.cs.uiuc.edu/~vhedau2/Publications/cvpr2012_freespace.pdf) | CVPR 2012 | - |
| [Efficient Exact Inference for 3D Indoor Scene Understanding](http://schwingag.de/papers/SchwingEtAl_ECCV2012.pdf) | ECCV 2012 | - |
| [Efficient Structured Prediction for 3D Indoor Scene Understanding](https://ttic.uchicago.edu/~rurtasun/publications/schwing_et_al_cvpr12.pdf) | CVPR 2012 | - |
| [Estimating Spatial Layout of Rooms using Volumetric Reasoning about Objects and Surfaces](https://papers.NeurIPS.cc/paper/4120-estimating-spatial-layout-of-rooms-using-volumetric-reasoning-about-objects-and-surfaces.pdf) | NeurIPS 2010 | - |
| [Thinking Inside the Box: Using Appearance Models and Context Based on Room Geometry](http://vision.cs.uiuc.edu/~vhedau2/Publications/eccv2010_think_inside.pdf) | ECCV 2010 | - |
### Panoramic Image
| Papers | Venue | Links |
|--------|-------|-------|
| [PanoContext-Former: Panoramic Total Scene Understanding with a Transformer](https://arxiv.org/abs/2305.12497) | CVPR 2024 | - |
| [PanelNet: Understanding 360 Indoor Environment via Panel Representation](https://arxiv.org/abs/2305.09078) | CVPR 2023 | - |
| [DeepPanoContext: Panoramic 3D Scene Understanding with Holistic Scene Context Graph and Relation-based Optimization](https://arxiv.org/abs/2108.10743) | ICCV 2021 | [[code]](https://github.com/chengzhag/DeepPanoContext) |
| [HoHoNet: 360 Indoor Holistic Understanding with Latent Horizontal Features](https://arxiv.org/abs/2011.11498) | CVPR 2021 | [[Code]](https://github.com/sunset1995/HoHoNet) |
| [Automatic 3D Indoor Scene Modeling from Single Panorama](http://openaccess.thecvf.com/content_cvpr_2018/papers/Yang_Automatic_3D_Indoor_CVPR_2018_paper.pdf) | CVPR 2018 | - |
| [Pano2CAD: Room Layout From A Single Panorama Image](http://bjornstenger.github.io/papers/xu_wacv2017.pdf) | WACV 2017 | - |
| [PanoContext: A Whole-room 3D Context Model for Panoramic Scene Understanding](http://panocontext.cs.princeton.edu/paper.pdf) | ECCV 2014 | [[project]](http://panocontext.cs.princeton.edu/) |
## Room Layout Estimation
### Perspective Image
(AW: Atlanta-world, SS: single-floor and single-ceiling, PP: Piece-wise Planarity.)
| Dataset | Year | Modality | #Frames | Prior | Source |
|---------|------|----------|---------|-------|--------|
| [CAD-Estate][cad-estate] | 2023 | RGB Video | - | Generic | RealEstate-10K |
| [Matterport3D-Layout][Matterport3D-Layout] | 2020 | RGB-D | 7360 | PP | Matterport |
| [ScanNet-Layout][ScanNet-Layout] | 2020 | RGB-D | 293 | PP | ScanNet |
| Structured3D | 2020 | RGB-D | 82027 | AW+SS | Structured3D |
| LSUN Room Layout | 2016 | RGB | 5394 | Cuboid | SUN |
| SUN RGB-D | 2015 | RGB-D | 10335 | AW+SS | NYUv2, Berkeley B3DO, and SUN3D |
| NYUv2 303 | 2013 | RGB-D | 303 | Cuboid | NYUv2 |
| Hedau | 2009 | RGB | 366 | Cuboid | - |

| Papers | Venue | Links |
|--------|-------|-------|
| [Polygon Detection for Room Layout Estimation using Heterogeneous Graphs and Wireframes](https://arxiv.org/abs/2306.12203) | ICCV Workshop 2023 | [[code]](https://github.com/DavidGillsjo/polygon-HGT) |
| [ST-RoomNet: Learning Room Layout Estimation From Single Image Through Unsupervised Spatial Transformations](https://openaccess.thecvf.com/content/CVPR2023W/VOCVALC/html/Ibrahem_ST-RoomNet_Learning_Room_Layout_Estimation_From_Single_Image_Through_Unsupervised_CVPRW_2023_paper) | CVPR Workshop 2023 | - |
| [Learning to Reconstruct 3D Non-Cuboid Room Layout from a Single RGB Image](https://arxiv.org/abs/2104.07986) | WACV 2022 | [[code]](https://github.com/CYang0515/NonCuboidRoom) |
| [RoomStructNet: Learning to Rank Non-Cuboidal Room Layouts From Single View](https://arxiv.org/abs/2110.00644) | CoRR 2021 | - |
| [GeoLayout: Geometry Driven Room Layout Estimation Based on Depth Maps of Planes](https://arxiv.org/abs/2008.06286) | ECCV 2020 | [[Matterport3D Layout Dataset]][Matterport3D-Layout] |
| [Structural Deep Metric Learning for Room Layout Estimation](https://www.ecva.net/papers/eccv_2020/papers_ECCV/papers/123630715.pdf) | ECCV 2020 | - |
| [General 3D Room Layout from a Single View by Render-and-Compare](http://arxiv.org/abs/2001.02149) | ECCV 2020 | [[project]](https://www.tugraz.at/institute/icg/research/team-lepetit/research-projects/general-3d-room-layout-from-a-single-view-by-render-and-compare/) [[ScanNet-Layout Dataset]][ScanNet-Layout] [[code]](https://github.com/vevenom/RoomLayout3D_RandC) |
| [Smart Hypothesis Generation for Efficient and Robust Room Layout Estimation](https://arxiv.org/abs/1910.12257) | WACV 2020 | - |
| [Flat2Layout: Flat Representation for Estimating Layout of General Room Types](https://arxiv.org/abs/1905.12571) | CoRR 2019 | - |
| [Thinking Outside the Box: Generation of Unconstrained 3D Room Layouts](https://arxiv.org/abs/1905.03105) | ACCV 2018 | - |
| [RoomNet: End-to-End Room Layout Estimation](https://arxiv.org/abs/1703.06241) | ICCV 2017 | - |
| [Physics Inspired Optimization on Semantic Transfer Features: An Alternative Method for Room Layout Estimation](https://arxiv.org/abs/1707.00383) | CVPR 2017 | [[project]](https://sites.google.com/view/st-pio/) |
| [A Coarse-to-Fine Indoor Layout Estimation (CFILE) Method](https://arxiv.org/abs/1607.00598) | ACCV 2016 | - |
| [DeLay: Robust Spatial Layout Estimation for Cluttered Indoor Scenes](http://openaccess.thecvf.com/content_cvpr_2016/papers/Dasgupta_DeLay_Robust_Spatial_CVPR_2016_paper.pdf) | CVPR 2016 | - |
| [Learning Informative Edge Maps for Indoor Scene Layout Prediction](http://openaccess.thecvf.com/content_iccv_2015/papers/Mallya_Learning_Informative_Edge_ICCV_2015_paper.pdf) | ICCV 2015 | - |
| [Rent3D: Floor-Plan Priors for Monocular Layout Estimation](http://openaccess.thecvf.com/content_cvpr_2015/papers/Liu_Rent3D_Floor-Plan_Priors_2015_CVPR_paper.pdf) | CVPR 2015 | [[project]](http://www.cs.toronto.edu/~fidler/projects/rent3D.html) |
| [Box In the Box: Joint 3D Layout and Object Reasoning from Single Images](http://openaccess.thecvf.com/content_iccv_2013/papers/Schwing_Box_in_the_2013_ICCV_paper.pdf) | ICCV 2013 | - |
| [Estimating the 3D Layout of Indoor Scenes and its Clutter from Depth Sensors](http://openaccess.thecvf.com/content_iccv_2013/papers/Zhang_Estimating_the_3D_2013_ICCV_paper.pdf) | ICCV 2013 | [[project]](https://cs.stanford.edu/people/zjian/project/ICCV13DepthLayout/ICCV13DepthLayout.html) |
| [Recovering the Spatial Layout of Cluttered Rooms](http://dhoiem.cs.illinois.edu/publications/iccv2009_hedau_indoor.pdf) | ICCV 2009 | - |
### Panoramic Image
(MW: Manhattan world, AW: Atlanta world, SS: single-floor and single-ceiling.)
| Dataset | Year | Modality | #Frames | Prior | Source |
|---------|------|----------|---------|-------|--------|
| [ZInD][ZInD] | 2021 | RGB | 71474 | AW+SS | ZInD |
| [MatterportLayout][MatterportLayout] | 2020 | RGB-D | 2295 | MW+SS | Matterport |
| Structured3D | 2020 | RGB-D | 196515 | AW+SS | Structured3D |
| [LayoutMP3D][LayoutMP3D] | 2020 | RGB-D | 2505 | MW+SS | Matterport |
| 2D-3D-S | 2018 | RGB-D | 571 | Cuboid | 2D-3D-S |
| PanoContext | 2014 | RGB | 500 | Cuboid | SUN360 |

| Papers | Venue | Links |
|--------|-------|-------|
| [No More Ambiguity in 360° Room Layout via Bi-Layout Estimation](https://www.amazon.science/publications/no-more-ambiguity-in-360-room-layout-via-bi-layout-estimation) | CVPR 2024 | - |
| [Seg2Reg: Differentiable 2D Segmentation to 1D Regression Rendering for 360 Room Layout Reconstruction](https://arxiv.org/abs/2311.18695) | CVPR 2024 | - |
| [iBARLE: imBalance-Aware Room Layout Estimation](https://arxiv.org/abs/2308.15050) | CoRR 2023 | - |
| 📷 [GPR-Net: Multi-view Layout Estimation via a Geometry-aware Panorama Registration Network](https://arxiv.org/abs/2210.11419) | CVPR Workshop 2023 | - |
| [Shape-Net: Room Layout Estimation from Panoramic Images Robust to Occlusion using Knowledge Distillation with 3D Shapes as Additional Inputs](https://arxiv.org/abs/2304.12624) | CVPR Workshop 2023 | - |
| [U2RLE: Uncertainty-Guided 2-Stage Room Layout Estimation](https://arxiv.org/abs/2304.08580) | CVPR 2023 | - |
| [Disentangling Orthogonal Planes for Indoor Panoramic Room Layout Estimation with Cross-Scale Distortion Awareness](https://arxiv.org/abs/2303.00971) | CVPR 2023 | [[Code]](https://github.com/zhijieshen-bjtu/DOPNet) |
| 📷 [360-MLC: Multi-view Layout Consistency for Self-training and Hyper-parameter Tuning](https://arxiv.org/abs/2210.12935) | NeurIPS 2022 | [[Project]](https://enriquesolarte.github.io/360-mlc/) |
| [3D Room Layout Estimation from a Cubemap of Panorama Image via Deep Manhattan Hough Transform](https://arxiv.org/abs/2207.09291) | ECCV 2022 | [[Code]](https://github.com/Starrah/DMH-Net) |
| [3D Room Layout Recovery Generalizing across Manhattan and Non-Manhattan Worlds](https://openaccess.thecvf.com/content/CVPR2022W/OmniCV/papers/Jia_3D_Room_Layout_Recovery_Generalizing_Across_Manhattan_and_Non-Manhattan_Worlds_CVPRW_2022_paper.pdf) | CVPR Workshop 2022 | - |
| 📷 [PSMNet: Position-aware Stereo Merging Network for Room Layout Estimation](https://arxiv.org/abs/2203.15965) | CVPR 2022 | [[code]](https://github.com/zillow/psmnet-layout) |
| [Self-supervised 360° Room Layout Estimation](https://arxiv.org/abs/2203.16057) | CoRR 2022 | [[code]](https://github.com/joshua049/Stereo-360-Layout) |
| [LGT-Net: Indoor Panoramic Room Layout Estimation with Geometry-Aware Transformer Network](https://arxiv.org/abs/2203.01824) | CVPR 2022 | - |
| [Deep3DLayout: 3D Reconstruction of an Indoor Layout from a Spherical Panoramic Image](http://publications.crs4.it/pubdocs/2021/PAAG21/sigasia2021-deep3dlayout.pdf) | SIGGRAPH Asia 2021 | [[project]](http://vic.crs4.it/vic/cgi-bin/bib-page.cgi?id=%27Pintore:2021:D3R%27) |
| [Transferable End-to-end Room Layout Estimation via Implicit Encoding](https://arxiv.org/abs/2112.11340) | CoRR 2021 | [[project]](https://sites.google.com/view/transferrl/) |
| [OmniLayout: Room Layout Reconstruction from Indoor Spherical Panoramas](https://arxiv.org/abs/2104.09403) | CVPR Workshop 2021 | [[code]](https://github.com/rshivansh/OmniLayout) |
| [LED2-Net: Monocular 360° Layout Estimation via Differentiable Depth Rendering](https://arxiv.org/abs/2104.00568) | CVPR 2021 | [[project]](https://fuenwang.ml/project/led2net/) [[code]](https://github.com/fuenwang/LED2-Net) |
| [SSLayout360: Semi-Supervised Indoor Layout Estimation from 360 Panorama](https://arxiv.org/abs/2103.13696) | CVPR 2021 | - |
| [Single-Shot Cuboids: Geodesics-based End-to-end Manhattan Aligned Layout Estimation from Spherical Panoramas](https://arxiv.org/abs/2102.03939) | Image and Vision Computing 2021 | [[project]](https://vcl3d.github.io/SingleShotCuboids/) [[code]](https://github.com/VCL3D/SingleShotCuboids) |
| [Manhattan Room Layout Reconstruction from a Single 360 image: A Comparative Study of State-of-the-art Methods](https://arxiv.org/abs/1910.04099) | IJCV 2021 | [[code]](https://github.com/zouchuhang/LayoutNetv2) [[MatterportLayout Dataset]][MatterportLayout] |
| [Training and Post Processing 3D Room Layout Beyond the Manhattan World Assumption](https://arxiv.org/abs/2009.02857) | ECCV Workshop 2020 | - |
| [Joint 3D Layout and Depth Prediction from a Single Indoor Panorama Image](https://www.ecva.net/papers/eccv_2020/papers_ECCV/papers/123610647.pdf) | ECCV 2020 | - |
| [AtlantaNet: Inferring the 3D Indoor Layout from a Single 360 Image Beyond the Manhattan World Assumption](http://vic.crs4.it/data/papers/eccv2020-atlantanet.pdf) | ECCV 2020 | [[project]](https://www.crs4.it/vic/cgi-bin/bib-page.cgi?id=%27Pintore:2020:AI3%27) [[code]](https://github.com/crs4/AtlantaNet) |
| [Corners for Layout: End-to-End Layout Recovery from 360 Images](https://arxiv.org/abs/1903.08094) | ICRA 2019 | [[project]](https://cfernandezlab.github.io/CFL/) [[code]](https://github.com/cfernandezlab/CFL-End-to-End-Layout-Recovery-from-360-Images) |
| [DuLa-Net: A Dual-Projection Network for Estimating Room Layouts from a Single RGB Panorama](https://arxiv.org/abs/1811.11977) | CVPR 2019 | [[project]](https://cgv.cs.nthu.edu.tw/projects/dulanet) |
| [HorizonNet: Learning Room Layout with 1D Representation and Pano Stretch Data Augmentation](https://arxiv.org/abs/1901.03861) | CVPR 2019 | [[code]](https://github.com/sunset1995/HorizonNet) |
| [Layouts from Panoramic Images with Geometry and Deep Learning](https://arxiv.org/abs/1806.08294) | IROS 2018 | [[code]](https://github.com/cfernandezlab/Lines-and-Vanishing-Points-directly-on-Panoramas) |
| [LayoutNet: Reconstructing the 3D Room Layout from a Single RGB Image](https://arxiv.org/abs/1803.08999) | CVPR 2018 | [[code]](https://github.com/zouchuhang/LayoutNet) |
| [Efficient 3D Room Shape Recovery From a Single Panorama](http://openaccess.thecvf.com/content_cvpr_2016/papers/Yang_Efficient_3D_Room_CVPR_2016_paper.pdf) | CVPR 2016 | [[code]](https://github.com/YANG-H/Panoramix) |
## Floorplan
| Papers | Venue | Links |
|--------|-------|-------|
| 🎲 [FRI-Net: Floorplan Reconstruction via Room-wise Implicit Representation](https://arxiv.org/abs/2407.10687) | ECCV 2024 | [[code]](https://github.com/Daisy-1227/FRI-Net) |
| 🎲 [PolyRoom: Room-aware Transformer for Floorplan Reconstruction](https://arxiv.org/abs/2407.10439) | ECCV 2024 | [[code]](https://github.com/3dv-casia/PolyRoom/) |
| 🎲 [PolyDiffuse: Polygonal Shape Reconstruction via Guided Set Diffusion Models](https://arxiv.org/abs/2306.01461) | NeurIPS 2023 | [[project]](https://poly-diffuse.github.io/) |
| 🎲 [Connecting the Dots: Floorplan Reconstruction Using Two-Level Queries](https://arxiv.org/abs/2211.15658) | CVPR 2023 | [[project]](https://ywyue.github.io/RoomFormer/) [[code]](https://github.com/ywyue/RoomFormer) |
| 📷 [Floorplan Restoration by Structure Hallucinating Transformer Cascades](https://arxiv.org/abs/2206.00645) | CoRR 2022 | - |
| 📷 [MVLayoutNet: 3D Layout Reconstruction with Multi-View Panoramas](https://arxiv.org/abs/2112.06133) | CoRR 2021 | - |
| 📷 [Extreme Structure From Motion for Indoor Panoramas Without Visual Overlaps](https://openaccess.thecvf.com/content/ICCV2021/papers/Shabani_Extreme_Structure_From_Motion_for_Indoor_Panoramas_Without_Visual_Overlaps_ICCV_2021_paper.pdf) | ICCV 2021 | [[code]](https://github.com/aminshabani/extreme-indoor-sfm) |
| 🎲 [MonteFloor: Extending MCTS for Reconstructing Accurate Large-Scale Floor Plans](https://arxiv.org/abs/2103.11161) | ICCV 2021 | - |
| 🎲 [Scan2Plan: Efficient Floorplan Generation from 3D Scans of Indoor Scenes](http://arxiv.org/abs/2003.07356) | CoRR 2020 | - |
| 🎲 [Floor-SP: Inverse CAD for Floorplans by Sequential Room-wise Shortest Path](https://arxiv.org/abs/1908.06702) | ICCV 2019 | [[project]](https://jcchen.me/floor-sp/) [[code]](https://github.com/woodfrog/floor-sp) |
| 📷 [Floorplan-Jigsaw: Jointly Estimating Scene Layout and Aligning Partial Scans](https://arxiv.org/abs/1812.06677) | ICCV 2019 | [[project]](https://enigma-li.github.io/projects/indoorRecons/floorplanJigsaw.html) |
| 🎲 [DeepPerimeter: Indoor Boundary Estimation from Posed Monocular Sequences](https://arxiv.org/abs/1904.11595) | CoRR 2019 | - |
| 📷 [FloorNet: A unified framework for floorplan reconstruction from 3D scans](https://arxiv.org/abs/1804.00090) | ECCV 2018 | [[project]](http://art-programmer.github.io/floornet.html) [[code]](https://github.com/art-programmer/FloorNet) |
### Floorplan Vectorization
| Papers | Venue | Links |
|--------|-------|-------|
| [VectorFloorSeg: Two-Stream Graph Attention Network for Vectorized Roughcast Floorplan Segmentation](https://openaccess.thecvf.com/content/CVPR2023/papers/Yang_VectorFloorSeg_Two-Stream_Graph_Attention_Network_for_Vectorized_Roughcast_Floorplan_Segmentation_CVPR_2023_paper.pdf) | CVPR 2023 | [[code]](https://github.com/DrZiji/VecFloorSeg) |
| [Parsing Line Segments of Floor Plan Images Using Graph Neural Networks](https://arxiv.org/abs/2303.03851) | CoRR 2023 | - |
| [Residential floor plan recognition and reconstruction](https://openaccess.thecvf.com/content/CVPR2021/papers/Lv_Residential_Floor_Plan_Recognition_and_Reconstruction_CVPR_2021_paper.pdf) | CVPR 2021 | - |
| [Versailles-FP dataset: Wall Detection in Ancient Floor Plans](https://arxiv.org/abs/2103.08064) | CoRR 2021 | - |
| [Deep Floor Plan Recognition using a Multi-task Network with Room-boundary-Guided Attention](https://arxiv.org/abs/1908.11025) | ICCV 2019 | [[project]](https://github.com/zlzeng/DeepFloorplan) |
| [CubiCasa5K: A Dataset and an Improved Multi-Task Model for Floorplan Image Analysis](https://arxiv.org/abs/1904.01920) | Scandinavian Conference on Image Analysis 2019 | [[code]](https://github.com/CubiCasa/CubiCasa5k) |
| [Raster-to-Vector: Revisiting Floorplan Transformation](http://openaccess.thecvf.com/content_ICCV_2017/papers/Liu_Raster-To-Vector_Revisiting_Floorplan_ICCV_2017_paper.pdf) | ICCV 2017 | [[project]](http://art-programmer.github.io/floorplan-transformation.html) [[code]](https://github.com/art-programmer/FloorplanTransformation) |
### Visual Localization
| Papers | Venue | Links |
|--------|-------|-------|
| [SPVLoc: Semantic Panoramic Viewport Matching for 6D Camera Localization in Unseen Environments](https://arxiv.org/abs/2404.10527) | ECCV 2024 | [[project]](https://fraunhoferhhi.github.io/spvloc/) [[code]](https://github.com/fraunhoferhhi/spvloc) |
| [LaLaLoc++: Global Floor Plan Comprehension for Layout Localisation in Unvisited Environments](https://www.ecva.net/papers/eccv_2022/papers_ECCV/papers/136870681.pdf) | ECCV 2022 | [[code]](https://github.com/ActiveVisionLab/LaLaLoc) |
| [LASER: LAtent SpacE Rendering for 2D Visual Localization](https://arxiv.org/abs/2204.00157) | CVPR 2022 | - |
| [LaLaLoc: Latent Layout Localisation in Dynamic, Unvisited Environments](https://arxiv.org/abs/2104.09169) | ICCV 2021 | - |
## Primitive
### Junction
| Papers | Venue | Links |
|--------|-------|-------|
| [Manhattan Junction Catalogue for Spatial Reasoning of Indoor Scenes](https://www.cv-foundation.org/openaccess/content_cvpr_2013/papers/Ramalingam_Manhattan_Junction_Catalogue_2013_CVPR_paper.pdf) | CVPR 2013 | - |
### Line Segment and Wireframe
| Papers | Venue | Links |
|--------|-------|-------|
| 📷 [Volumetric Wireframe Parsing from Neural Attraction Fields](https://arxiv.org/abs/2307.10206) | CoRR 2023 | [[code]](https://github.com/cherubicXN/neat) |
| 📷 [NEF: Neural Edge Fields for 3D Parametric Curve Reconstruction from Multi-view Images](https://arxiv.org/abs/2303.07653) | CVPR 2023 | [[project]](https://yunfan1202.github.io/NEF/) |
| [DeepLSD: Line Segment Detection and Refinement with Deep Image Gradients](https://arxiv.org/abs/2212.07766) | CoRR 2022 | [[Code]](https://github.com/cvg/DeepLSD) |
| [Holistically-Attracted Wireframe Parsing: From Supervised to Self-Supervised Learning](https://arxiv.org/abs/2210.12971) | CoRR 2022 | - |
| 🎲 [Learning to Construct 3D Building Wireframes from 3D Line Clouds](https://arxiv.org/abs/2208.11948) | BMVC 2022 | [[Code]](https://github.com/Luo1Cheng/LC2WF) |
| [HoW-3D: Holistic 3D Wireframe Perception from a Single Image](https://arxiv.org/abs/2208.06999) | 3DV 2022 | [[Code]](https://github.com/Wenchao-M/HoW-3D) |
| [Semantic Room Wireframe Detection from a Single View](https://arxiv.org/abs/2206.00491) | ICPR 2022 | [[code]](https://github.com/DavidGillsjo/SRW-Net) |
| [Towards Real-time and Light-weight Line Segment Detection](https://arxiv.org/abs/2106.00186) | AAAI 2022 | [[code]](https://github.com/navervision/mlsd) |
| [Hole-robust Wireframe Detection](https://arxiv.org/abs/2111.15064) | WACV 2022 | - |
| [Fully Convolutional Line Parsing](https://arxiv.org/abs/2104.11207) | Neurocomputing 2022 | [[code]](https://github.com/Delay-Xili/F-Clip) |
| [ELSD: Efficient Line Segment Detector and Descriptor](https://arxiv.org/abs/2104.14205) | ICCV 2021 | - |
| [SOLD2: Self-supervised Occlusion-aware Line Description and Detection](https://arxiv.org/abs/2104.03362) | CVPR 2021 | [[code]](https://github.com/cvg/SOLD2) |
| [Line Segment Detection Using Transformers without Edges](https://arxiv.org/abs/2101.01909) | CVPR 2021 | [[code]](https://github.com/mlpc-ucsd/LETR/) |
| [PlueckerNet: Learn to Register 3D Line Reconstructions](https://arxiv.org/abs/2012.01096) | CVPR 2021 | [[code]](https://github.com/Liumouliu/PlueckerNet) |
| [LGNN: A Context-aware Line Segment Detector](https://arxiv.org/abs/2008.05892) | ACM MM 2020 | - |
| [TP-LSD: Tri-Points Based Line Segment Detector](https://arxiv.org/abs/2009.05505) | ECCV 2020 | [[code]](https://github.com/Siyuada7/TP-LSD) |
| [Deep Hough-Transform Line Priors](https://arxiv.org/abs/2007.09493) | ECCV 2020 | [[code]](https://github.com/yanconglin/Deep-Hough-Transform-Line-Priors) |
| [Deep Hough Transform for Semantic Line Detection](https://arxiv.org/abs/2003.04676) | ECCV 2020 | [[code]](https://github.com/Hanqer/deep-hough-transform) |
| [Holistically-Attracted Wireframe Parsing](https://arxiv.org/abs/2003.01663) | CVPR 2020 | [[code]](https://github.com/cherubicXN/hawp) |
| [Learning to Reconstruct 3D Manhattan Wireframes from a Single Image](https://arxiv.org/abs/1905.07482) | ICCV 2019 | [[code]](https://github.com/zhou13/shapeunity) |
| [End-to-End Wireframe Parsing](https://arxiv.org/abs/1905.03246) | ICCV 2019 | [[code]](https://github.com/zhou13/lcnn) |
| [PPGNet: Learning Point-Pair Graph for Line Segment Detection](https://arxiv.org/abs/1905.03415) | CVPR 2019 | [[code]](https://github.com/svip-lab/PPGNet) |
| [Learning Attraction Field Representation for Robust Line Segment Detection](https://arxiv.org/abs/1812.02122) | CVPR 2019 | [[code]](https://github.com/cherubicXN/afm_cvpr2019) |
| [Novel Single View Constraints for Manhattan 3D Line Reconstruction](https://arxiv.org/abs/1810.03737) | 3DV 2018 | - |
| [Learning to Parse Wireframes in Images of Man-Made Environments](http://openaccess.thecvf.com/content_cvpr_2018/papers/Huang_Learning_to_Parse_CVPR_2018_paper.pdf) | CVPR 2018 | [[code]](https://github.com/huangkuns/wireframe) |
| [A Novel Linelet-Based Representation for Line Segment Detection](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=7926451) | TPAMI 2018 | - |
| [MCMLSD: A Dynamic Programming Approach to Line Segment Detection](http://openaccess.thecvf.com/content_cvpr_2017/papers/Almazan_MCMLSD_A_Dynamic_CVPR_2017_paper.pdf) | CVPR 2017 | - |
| [Lifting 3D Manhattan Lines from a Single Image](https://openaccess.thecvf.com/content_iccv_2013/papers/Ramalingam_Lifting_3D_Manhattan_2013_ICCV_paper.pdf) | ICCV 2013 | - |
| [LSD: A Fast Line Segment Detector with a False Detection Control](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=4731268) | TPAMI 2010 | - |

### Outdoor Architecture

| Papers | Venue | Links |
|--------|-------|-------|
| [HEAT: Holistic Edge Attention Transformer for Structured Reconstruction](https://arxiv.org/abs/2111.15143) | CVPR 2022 | [[Project]](https://heat-structured-reconstruction.github.io/) |
| [Structured Outdoor Architecture Reconstruction by Exploration and Classification](https://arxiv.org/abs/2108.07990) | ICCV 2021 | [[Project]](https://zhangfuyang.github.io/expcls/) |
| [Roof-GAN: Learning to Generate Roof Geometry and Relations for Residential Houses](https://arxiv.org/abs/2012.09340) | CVPR 2021 | [[Code]](https://github.com/yi-ming-qian/roofgan) |
| [Vectorizing World Buildings: Planar Graph Reconstruction by Primitive Detection and Relationship Inference](https://arxiv.org/abs/1912.05135) | ECCV 2020 | [[Project]](https://ennauata.github.io/buildings2vec/page.html) |
| [Conv-MPN: Convolutional Message Passing Neural Network for Structured Outdoor Architecture Reconstruction](https://arxiv.org/abs/1912.01756) | CVPR 2020 | [[Project]](https://zhangfuyang.github.io/convmpn/) |

### Plane

| Papers | Venue | Links |
|--------|-------|-------|
| 📷 [UniPlane: Unified Plane Detection and Reconstruction from Posed Monocular Videos](https://arxiv.org/abs/2407.03594) | CoRR 2024 | - |
| 📷 [AirPlanes: Accurate Plane Estimation via 3D-Consistent Embeddings](https://arxiv.org/abs/2406.08960) | CVPR 2024 | [[project]](https://nianticlabs.github.io/airplanes/) |
| [PlaneRecTR: Unified Query learning for 3D Plane Recovery from a Single View](https://arxiv.org/abs/2307.13756) | ICCV 2023 | [[Code]](https://github.com/SJingjia/PlaneRecTR) |
| 📷 [NOPE-SAC: Neural One-Plane RANSAC for Sparse-View Planar 3D Reconstruction](https://arxiv.org/abs/2211.16799) | CoRR 2022 | [[Code]](https://github.com/IceTTTb/NopeSAC) |
| 📷 [PlaneFormers: From Sparse View Planes to 3D Reconstruction](https://arxiv.org/abs/2208.04307) | ECCV 2022 | [[project]](https://samiragarwala.github.io/PlaneFormers) [[code]](https://github.com/samiragarwala/PlaneFormers) |
| 📷 [PlanarRecon: Real-time 3D Plane Detection and Reconstruction from Posed Monocular Videos](https://arxiv.org/abs/2206.07710) | CVPR 2022 | [[Project]](https://neu-vi.github.io/planarrecon/) |
| [PlaneRecNet: Multi-Task Learning with Cross-Task Consistency for Piece-Wise Plane Detection and Reconstruction from a Single RGB Image](https://arxiv.org/abs/2110.11219) | BMVC 2021 | [[code]](https://github.com/EryiXie/PlaneRecNet) |
| [PlaneTR: Structure-Guided Transformers for 3D Plane Recovery](https://arxiv.org/abs/2107.13108) | ICCV 2021 | [[code]](https://github.com/IceTTTb/PlaneTR3D) |
| 📷 [Planar Surface Reconstruction From Sparse Views](https://arxiv.org/abs/2103.14644) | ICCV 2021 | [[project]](https://jinlinyi.github.io/SparsePlanes/) [[code]](https://github.com/jinlinyi/SparsePlanes) |
| [Indoor Panorama Planar 3D Reconstruction via Divide and Conquer](https://arxiv.org/abs/2106.14166) | CVPR 2021 | [[code]](https://github.com/sunset1995/PanoPlane360) |
| [Learning Pairwise Inter-Plane Relations for Piecewise Planar Reconstruction](https://www.ecva.net/papers/eccv_2020/papers_ECCV/papers/123520324.pdf) | ECCV 2020 | [[code]](https://github.com/yi-ming-qian/interplane) |
| [Peek-a-Boo: Occlusion Reasoning in Indoor Scenes with Plane Representations](https://openaccess.thecvf.com/content_CVPR_2020/papers/Jiang_Peek-a-Boo_Occlusion_Reasoning_in_Indoor_Scenes_With_Plane_Representations_CVPR_2020_paper.pdf) | CVPR 2020 | [[project]](https://www.nec-labs.com/~mas/peekaboo/) |
| [Single-Image Piece-wise Planar 3D Reconstruction via Associative Embedding](https://arxiv.org/abs/1902.09777) | CVPR 2019 | [[code]](https://github.com/svip-lab/PlanarReconstruction) |
| [PlaneRCNN: 3D Plane Detection and Reconstruction from a Single Image](https://arxiv.org/abs/1812.04072) | CVPR 2019 | [[project]](https://research.nvidia.com/publication/2019-06_PlaneRCNN) [[code]](https://github.com/NVlabs/planercnn) |
| [Recovering 3D Planes from a Single Image via Convolutional Neural Networks](http://openaccess.thecvf.com/content_ECCV_2018/papers/Fengting_Yang_Recovering_3D_Planes_ECCV_2018_paper.pdf) | ECCV 2018 | [[code]](https://github.com/fuy34/planerecover) |
| [PlaneNet: Piece-wise Planar Reconstruction from a Single RGB Image](https://arxiv.org/abs/1804.06278) | CVPR 2018 | [[project]](https://www.cse.wustl.edu/~chenliu/planenet.html) [[code]](https://github.com/art-programmer/PlaneNet) |

## Vanishing Point

| Papers | Venue | Links |
|--------|-------|-------|
| [Vanishing Point Estimation in Uncalibrated Images with Prior Gravity Direction](https://arxiv.org/abs/2308.10694) | ICCV 2023 | [[code]](https://github.com/cvg/VP-Estimation-with-Prior-Gravity) |
| [Transformer Based Line Segment Classifier with Image Context for Real-Time Vanishing Point Detection in Manhattan World](https://openaccess.thecvf.com/content/CVPR2022/papers/Tong_Transformer_Based_Line_Segment_Classifier_With_Image_Context_for_Real-Time_CVPR_2022_paper.pdf) | CVPR 2022 | - |
| [Deep Vanishing Point Detection: Geometric Priors Make Dataset Variations Vanish](https://arxiv.org/abs/2203.08586) | CVPR 2022 | - |
| [VaPiD: A Rapid Vanishing Point Detector via Learned Optimizers](https://openaccess.thecvf.com/content/ICCV2021/papers/Liu_VaPiD_A_Rapid_Vanishing_Point_Detector_via_Learned_Optimizers_ICCV_2021_paper.pdf) | ICCV 2021 | - |
| [NeurVPS: Neural Vanishing Point Scanning via Conic Convolution](https://arxiv.org/abs/1910.06316) | NeurIPS 2019 | [[Code]](https://github.com/zhou13/neurvps) |

[ScanNet-Layout]: https://github.com/vevenom/ScanNet-Layout
[Matterport3D-Layout]: https://vsislab.github.io/Matterport3D-Layout/
[MatterportLayout]: https://github.com/ericsujw/Matterport3DLayoutAnnotation
[LayoutMP3D]: https://github.com/fuenwang/LayoutMP3D
[ZInD]: https://github.com/zillow/zind
[cad-estate]: https://github.com/google-research/cad-estate