{"id":19544136,"url":"https://github.com/sunset1995/pytorch-layoutnet","last_synced_at":"2025-04-26T17:32:55.055Z","repository":{"id":45340495,"uuid":"144100685","full_name":"sunset1995/pytorch-layoutnet","owner":"sunset1995","description":"Pytorch implementation of LayoutNet.","archived":false,"fork":false,"pushed_at":"2021-04-03T13:46:16.000Z","size":7050,"stargazers_count":172,"open_issues_count":18,"forks_count":39,"subscribers_count":12,"default_branch":"master","last_synced_at":"2025-04-04T17:02:08.623Z","etag":null,"topics":["computer-vision","layoutnet","pytorch","pytorch-layoutnet","room-layout"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/sunset1995.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2018-08-09T04:28:10.000Z","updated_at":"2025-03-25T03:43:54.000Z","dependencies_parsed_at":"2022-08-04T03:00:17.347Z","dependency_job_id":null,"html_url":"https://github.com/sunset1995/pytorch-layoutnet","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sunset1995%2Fpytorch-layoutnet","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sunset1995%2Fpytorch-layoutnet/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sunset1995%2Fpytorch-layoutnet/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sunset1995%2Fpytorch-layoutnet/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/sunset1995","download_url":"https://codeload.github.
com/sunset1995/pytorch-layoutnet/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":251026584,"owners_count":21525002,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["computer-vision","layoutnet","pytorch","pytorch-layoutnet","room-layout"],"created_at":"2024-11-11T03:24:46.809Z","updated_at":"2025-04-26T17:32:54.410Z","avatar_url":"https://github.com/sunset1995.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# pytorch-layoutnet\n\n**News: Check out our new project [HoHoNet](https://github.com/sunset1995/HoHoNet) on this task and more!**\\\n**News: Check out our new project [HorizonNet](https://github.com/sunset1995/HorizonNet) on this task.**\n\nThis is an unofficial implementation of the CVPR 2018 [paper](https://arxiv.org/abs/1803.08999) \"LayoutNet: Reconstructing the 3D Room Layout from a Single RGB Image\". The [official](https://github.com/zouchuhang/LayoutNet) layout datasets are all converted to `.png`, and the pretrained models are converted to PyTorch `state_dict`.  \nDifferences from the official implementation:\n- **Architecture**: Only the joint *boundary branch* and *corner branch* are implemented, as the paper states that \"Training with 3D regressor has a small impact\".\n- **Pre-processing**: The implementations of the *line segment detector* and *pano image alignment* are converted from MATLAB to Python in `pano.py` and `pano_lsd_align.py`.\n- **Post-processing**: No 3D layout optimization. Instead, this repo implements a gradient ascent optimizing a similar loss. 
(see below for more details)\n\nOverview of the pipeline:\n![](assert/pipeline.png)\n\nWith this repo, you can:\n- extract/visualize the layout of your own 360 images with my trained network\n- reproduce official experiments\n- train on your own dataset\n- run quantitative evaluation (3D IoU, Corner Error, Pixel Error)\n\n\n## Requirements\n- Python 3\n- pytorch\u003e=0.4.1\n- numpy\n- scipy\n- Pillow\n- torchfile\n- opencv-python\u003e=3.1 (for pre-processing)\n- open3d (for layout 3D viewer)\n- shapely (for layout 3D viewer)\n\n\n## Visualization\n\n### 1. Preparation\n- Get your fascinating 360 room images. I will use `assert/demo.png` as an example.\n    - ![](assert/demo.png)\n- Prepare the environment to run the Python scripts.\n- Download the trained model from [here (350M)](https://drive.google.com/file/d/1ZlXc4DLgOrkDLW6kQbCY-wOJ8bw4DYu2/view?usp=sharing). Put the 3 files extracted from the downloaded zip under the `ckpt/` folder.\n    - So you will get `ckpt/epoch_30_*.pth`\n\n### 2. Pre-processing (Align camera pose with floor)\n- Pre-process the above `assert/demo.png` by running the command below. See `python visual_preprocess.py -h` for a more detailed description of the script.\n    ```\n    python visual_preprocess.py --img_glob assert/demo.png --output_dir assert/output_preprocess/\n    ```\n- Arguments explanation:\n    - `--img_glob` path to your 360 room image(s).\n    - `--output_dir` path to the directory where the results are written.\n    - *Hint*: you can use quoted shell-style wildcards (e.g. 
\"my_fasinated_img_dir/\\*png\") to process multiple images in one shot.\n- Under the given `--output_dir`, you will get results like those below, prefixed with the source image basename.\n    - The aligned RGB image `[SOURCE BASENAME]_aligned_rgb.png` and line segment image `[SOURCE BASENAME]_aligned_line.png`\n        - `demo_aligned_rgb.png` | `demo_aligned_line.png`\n          :--------------------: | :---------------------:\n          ![](assert/output_preprocess/demo_aligned_rgb.png) | ![](assert/output_preprocess/demo_aligned_line.png)\n    - The detected vanishing points `[SOURCE BASENAME]_VP.txt` (here `demo_VP.txt`)\n        ```\n        -0.006676 -0.499807 0.866111\n        0.000622 0.866128 0.499821\n        0.999992 -0.002519 0.003119\n        ```\n\n### 3. Layout Prediction with LayoutNet\n- Predict the layout from the aligned image and line segments above by running the command below.\n    ```\n    python visual.py --path_prefix ckpt/epoch_30 --img_glob assert/output_preprocess/demo_aligned_rgb.png --line_glob assert/output_preprocess/demo_aligned_line.png --output_dir assert/output\n    ```\n- Arguments explanation:\n    - `--path_prefix` path prefix of the trained model.\n    - `--img_glob` path to the VP-aligned image.\n    - `--line_glob` path to the line segment image corresponding to the VP-aligned image.\n    - `--output_dir` path to the directory where the results are written.\n    - *Hint*: for the two globs, you can use quoted wildcards\n    - *Hint*: for better results, you can add `--flip`, `--rotate 0.25 0.5 0.75`, `--post_optimization`\n- You will get results like those below, prefixed with the source image basename.\n    - The model's output corner/edge probability maps `[SOURCE BASENAME]_[cor|edg].png`\n        - `demo_aligned_rgb_cor.png` | `demo_aligned_rgb_edg.png`\n          :------------------------: | :------------------------:\n          ![](assert/output/demo_aligned_rgb_cor.png) | ![](assert/output/demo_aligned_rgb_edg.png)\n    - The extracted layout and 
all in one image `[SOURCE BASENAME]_[bon|all].png`\n        - `demo_aligned_rgb_bon.png` | `demo_aligned_rgb_all.png`\n          :------------------------: | :------------------------:\n          ![](assert/output/demo_aligned_rgb_bon.png) | ![](assert/output/demo_aligned_rgb_all.png)\n    - The extracted corners of the layout `[SOURCE BASENAME]_cor_id.txt`\n        ```\n        104.928192 186.603119\n        104.928192 337.168579\n        378.994934 177.796646\n        378.994934 346.994629\n        649.976440 183.446518\n        649.976440 340.711731\n        898.234619 190.629089\n        898.234619 332.616364\n        ```\n\n### 4. Layout 3D Viewer\n- A pure Python script to visualize the predicted layout in 3D as a point cloud. The command below visualizes the result stored in `assert/`\n    ```\n    python visual_3d_layout.py --ignore_ceiling --img assert/output_preprocess/demo_aligned_rgb.png --layout  assert/output/demo_aligned_rgb_cor_id.txt\n    ```\n- Arguments explanation:\n    - `--img` path to the aligned 360 image\n    - `--layout` path to the txt file storing the `cor_id` (predicted or ground truth)\n    - `--ignore_ceiling` skip rendering the ceiling\n    - for more arguments, see `python visual_3d_layout.py -h`\n- ![](assert/demo_3d_layout.jpeg)\n    - In the window, you can use the mouse and scroll wheel to change the viewport\n\n\n## Preparation for Training\n- Download the official [data](https://github.com/zouchuhang/LayoutNet#data) and [pretrained model](https://github.com/zouchuhang/LayoutNet#pretrained-model) and arrange them as below\n```\n/pytorch-layoutnet \n  /data\n  | /origin\n  |   /data  (download and extract from official)\n  |   /gt    (download and extract from official)\n  /ckpt\n    /panofull_*_pretrained.t7  (download and extract from official)\n```\n- Execute `python torch2pytorch_data.py` to convert `data/origin/**/*` to `data/train`, `data/valid` and `data/test` for the PyTorch data loader. 
Under these folders, `img/` contains all raw RGB `.png` files, while `line/`, `edge/`, and `cor/` contain the preprocessed Manhattan line segments, ground truth boundaries, and ground truth corners, respectively.\n- [optional] Use `torch2pytorch_pretrained_weight.py` to convert the official pretrained pano model to `encoder`, `edg_decoder`, and `cor_decoder` PyTorch `state_dict` files (see `python torch2pytorch_pretrained_weight.py -h` for more details). Examples:\n  - to convert layout pretrained only\n    ```\n    python torch2pytorch_pretrained_weight.py --torch_pretrained ckpt/panofull_joint_box_pretrained.t7 --encoder ckpt/pre_full_encoder.pth --edg_decoder ckpt/pre_full_edg_decoder.pth --cor_decoder ckpt/pre_full_cor_decoder.pth\n    ```\n  - to convert full pretrained (layout regressor branch will be ignored)\n    ```\n    python torch2pytorch_pretrained_weight.py --torch_pretrained ckpt/panofull_joint_box_pretrained.t7 --encoder ckpt/pre_full_encoder.pth --edg_decoder ckpt/pre_full_edg_decoder.pth --cor_decoder ckpt/pre_full_cor_decoder.pth\n    ```\n\n\n## Training\nSee `python train.py -h` for a detailed explanation of the arguments.  \nThe default training strategy is the same as the official one. To launch an experiment with the official \"corner+boundary\" setting (`--id` identifies the experiment and can be any name you choose):\n```\npython train.py --id exp_default\n```\nTo train using only RGB channels as input (no Manhattan line segments):  \n```\npython train.py --id exp_rgb --input_cat img --input_channels 3\n```\n\n## Gradient Ascent Post Optimization\nInstead of the official 3D layout optimization with a sampling strategy, this repo implements a gradient ascent optimization algorithm that optimizes an objective similar to the official loss.  \nThe process is outlined below:\n1. 
greedily extract the cuboid parameters from the corner/edge probability maps\n    - The cuboid consists of 6 parameters (`cx`, `cy`, `dx`, `dy`, `theta`, `h`)\n    - `corner probability map`   | `edge probability map`\n      :------------------------: | :------------------------:\n      ![](assert/output/demo_aligned_rgb_cor.png) | ![](assert/output/demo_aligned_rgb_edg.png)\n2. sample points along the cuboid boundary and project them onto the equirectangular corner/edge probability maps\n    - The projected sample points are visualized as green dots\n    - \u003cimg src=\"assert/output/demo_aligned_rgb_all.png\" width=300\u003e\n3. for each projected sample point, get its value by bilinear interpolation from the 4 nearest neighboring pixels on the corner/edge probability map\n4. all the sampled values are reduced to a single scalar called the score\n5. compute the gradient for the 6 cuboid parameters to maximize the score\n6. iteratively apply gradient ascent (repeating steps 2 through 5)\n\nIt takes less than 2 seconds on CPU and yields slightly better results than those officially reported. \n\n\n## Quantitative Evaluation\nSee `python eval.py -h` for a more detailed explanation of the arguments. 
To get the result from my trained network (link above):\n```\npython eval.py --path_prefix ckpt/epoch_30 --flip --rotate 0.333 0.666\n```\nTo evaluate with gradient ascent post optimization:\n```\npython eval.py --path_prefix ckpt/epoch_30 --flip --rotate 0.333 0.666 --post_optimization\n```\n\n#### Dataset - PanoContext\n| exp | 3D IoU(%) | Corner error(%) | Pixel error(%) |\n| :-: | :------: | :------: | :--------------: |\n| Official best  | `75.12` | `1.02` | `3.18` |\n| ours rgb only  | `71.42` | `1.30` | `3.83` |\n| ours rgb only \u003cbr\u003e w/ gd opt | `72.52` | `1.50` | `3.66` | \n| ours           | `75.11` | `1.04` | `3.16` |\n| ours \u003cbr\u003e w/ gd opt | **`76.90`** | **`0.93`** | **`2.81`** |\n\n#### Dataset - Stanford 2D-3D\n| exp | 3D IoU(%) | Corner error(%) | Pixel error(%) |\n| :-: | :------: | :------: | :--------------: |\n| Official best  | `77.51` | `0.92` | **`2.42`** |\n| ours rgb only  | `70.39` | `1.50` | `4.28` |\n| ours rgb only \u003cbr\u003e w/ gd opt | `71.90` | `1.35` | `4.25` | \n| ours           | `75.49` | `0.96` | `3.07` |\n| ours \u003cbr\u003e w/ gd opt | **`78.90`** | **`0.88`** | `2.78` |\n\n\n## References\n- [LayoutNet: Reconstructing the 3D Room Layout from a Single RGB Image](https://arxiv.org/abs/1803.08999)\n  - Chuhang Zou, Alex Colburn, Qi Shan, Derek Hoiem\n  - CVPR2018\n  ```\n  @inproceedings{zou2018layoutnet,\n    title={LayoutNet: Reconstructing the 3D Room Layout from a Single RGB Image},\n    author={Zou, Chuhang and Colburn, Alex and Shan, Qi and Hoiem, Derek},\n    booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},\n    pages={2051--2059},\n    year={2018}\n  }\n  ```\n  - [Official torch 
implementation](https://github.com/zouchuhang/LayoutNet)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsunset1995%2Fpytorch-layoutnet","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsunset1995%2Fpytorch-layoutnet","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsunset1995%2Fpytorch-layoutnet/lists"}