{"id":13504324,"url":"https://github.com/likojack/bnv_fusion","last_synced_at":"2025-03-29T21:30:32.984Z","repository":{"id":39205322,"uuid":"475061493","full_name":"likojack/bnv_fusion","owner":"likojack","description":"This repository implements our CVPR2022 paper \"BNV-Fusion: Dense 3D Reconstruction using Bi-level Neural Volume Fusion\"","archived":false,"fork":false,"pushed_at":"2023-01-10T13:38:39.000Z","size":3178,"stargazers_count":135,"open_issues_count":4,"forks_count":18,"subscribers_count":5,"default_branch":"main","last_synced_at":"2024-11-01T01:33:30.246Z","etag":null,"topics":["3d-reconstruction","computer-vision"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/likojack.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2022-03-28T15:23:02.000Z","updated_at":"2024-09-20T15:47:10.000Z","dependencies_parsed_at":"2023-02-08T18:46:04.408Z","dependency_job_id":null,"html_url":"https://github.com/likojack/bnv_fusion","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/likojack%2Fbnv_fusion","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/likojack%2Fbnv_fusion/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/likojack%2Fbnv_fusion/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/likojack%2Fbnv_fusion/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/likojack","download_url":"https://codeload.github.com/likojack/bnv_fusion/
tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246249113,"owners_count":20747164,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["3d-reconstruction","computer-vision"],"created_at":"2024-08-01T00:00:29.933Z","updated_at":"2025-03-29T21:30:31.874Z","avatar_url":"https://github.com/likojack.png","language":"Python","funding_links":[],"categories":["Papers"],"sub_categories":[],"readme":"\u003cp align=\"center\"\u003e\n\n  \u003ch1 align=\"center\"\u003eBNV-Fusion: Dense 3D Reconstruction using Bi-level Neural Volume Fusion\u003c/h1\u003e\n  \u003cp align=\"center\"\u003e\n    \u003ca href=\"https://likojack.github.io/kejieli/#/home\"\u003e\u003cstrong\u003eKejie Li\u003c/strong\u003e\u003c/a\u003e\n    ~\n    \u003ca href=\"https://andytang15.github.io/\"\u003e\u003cstrong\u003eYansong Tang\u003c/strong\u003e\u003c/a\u003e\n    ~\n    \u003ca href=\"https://www.robots.ox.ac.uk/~victor/\"\u003e\u003cstrong\u003eVictor Adrian Prisacariu\u003c/strong\u003e\u003c/a\u003e\n    ~\n    \u003ca href=\"https://torrvision.com/\"\u003e\u003cstrong\u003ePhilip H.S. Torr\u003c/strong\u003e\u003c/a\u003e\n  \u003c/p\u003e\n\u003c/p\u003e\n\n## BNV-Fusion ([Video](https://www.youtube.com/watch?v=ptx5vtQ9SvM) | [Paper](https://arxiv.org/pdf/2204.01139.pdf))\n\nThis repo implements the CVPR 2022 paper [Bi-level Neural Volume Fusion (BNV-Fusion)](https://arxiv.org/abs/2204.01139). BNV-Fusion leverages recent advances in neural implicit representations and neural rendering for dense 3D reconstruction. 
The keys to BNV-Fusion are 1) a sparse voxel grid of local shape codes to model surface geometry; 2) a well-designed bi-level fusion mechanism to integrate raw depth observations into the implicit grid efficiently and effectively. As a result, BNV-Fusion can run at a relatively **high frame rate** (2-5 frames per second on a desktop GPU) and reconstruct the 3D environment with **high accuracy**, where fine details missed by recent neural implicit based methods or traditional TSDF-Fusion are captured by BNV-Fusion.\n\n## Requirements\n\nSet up the Anaconda environment using the following command:\n\n`\nconda env create -f environment.yml -p CONDA_DIR/envs/bnv_fusion (CONDA_DIR is the folder where anaconda is installed)\n`\n\nYou will also need to install [torch-scatter](https://github.com/rusty1s/pytorch_scatter) separately, since conda doesn't seem to handle this package particularly well.\n\n\nAlternatively, you can build a docker image using the DockerFile provided (Work in progress. We can't get Open3D working within the docker image. Any help is appreciated!).\n\n[IMPORTANT] Set up the PYTHONPATH before running the code:\n\n`\nexport PYTHONPATH=$PYTHONPATH:$PWD\n`\n\nIf you don't want to run this command every time you open a new terminal, you can also set up an alias in Bash that sets PYTHONPATH and activates the environment in one go, as follows:\n\n`\nalias bnv_fusion=\"export PYTHONPATH=$PYTHONPATH:PROJECT_DIR;conda activate bnv_fusion\"\n`\n\nPROJECT_DIR is the root directory of this repo.\n\n\n**New: Running with sequences captured by iPhone/iPad**\n------\nWe are happy to share that you can run BNV-Fusion reasonably easily on any sequence you capture using an iOS device with a LiDAR sensor (e.g., iPhone 12/13 Pro, iPad Pro). The instructions are as follows:\n1. Download the [3D scanner app](https://apps.apple.com/us/app/3d-scanner-app/id1419913995) to an iOS device.\n2. You can then capture a sequence using this app.\n3. 
After recording, you need to transfer the raw data (e.g., depth images, camera poses) to a desktop with a GPU. To do this, tap \"scans\" at the bottom left of the app. Select \"Share Model\" after clicking the \"...\" button. There are various formats you can use to share the data, but what we need is the raw data, so select \"All Data\". You can then choose your favorite way, such as Google Drive, for sharing. \n4. After you unpack the data on your desktop, run BNV-Fusion using the following command:\n```\npython src/run_e2e.py model=fusion_pointnet_model dataset=fusion_inference_dataset_arkit trainer.checkpoint=$PWD/pretrained/pointnet_tcnn.ckpt 'dataset.scan_id=\"xxxxx\"' dataset.data_dir=yyyyy model.tcnn_config=$PWD/src/models/tcnn_config.json\n```\nReplace `xxxxx` with your scan_id and `yyyyy` with the directory that holds the data. You should be able to see the reconstruction provided by BNV-Fusion after this step. Hope you have fun!\n\n\n## Datasets and pretrained models\nWe tested BNV-Fusion on three datasets: 3D scene, ICL-NUIM, and ScanNet. Please go to the respective dataset repos to download data.\nAfter downloading the data, run the corresponding preprocessing script:\n```\npython src/script/generate_fusion_data_{DATASET}.py\n```\n\nInstead of downloading those datasets, you can quickly try out BNV-Fusion with the preprocessed data we provide for one of the sequences in 3D scene. 
You can download the preprocessed data [here](https://drive.google.com/file/d/1nmdkK-mMpxebAO1MriCD_UbpLwbXYxah/view?usp=sharing).\n\nYou can also run the following commands from the project root directory:\n```\nmkdir -p data/fusion/scene3d\ncd data/fusion/scene3d/\npip install gdown  # only needed if gdown is not already installed\ngdown https://drive.google.com/uc?id=1nmdkK-mMpxebAO1MriCD_UbpLwbXYxah\nunzip lounge.zip \u0026\u0026 rm lounge.zip\n```\n\n\n## Running\n\u003c!-- The following script is an example of running the system on all sequences in the 3D scene dataset.\n```\nexport PYTHONPATH=$PYTHONPATH:$PWD\nconda activate bnv_fusion\npython src/script/run_inference_on_scene3d.py\n``` --\u003e\n\nTo process a sequence, use the following command:\n```\npython src/run_e2e.py model=fusion_pointnet_model dataset=fusion_inference_dataset dataset.scan_id=\"scene3d/lounge\" trainer.checkpoint=$PWD/pretrained/pointnet_tcnn.ckpt model.tcnn_config=$PWD/src/models/tcnn_config.json model.mode=\"demo\"\n```\n\n## Evaluation\nThe results and GT meshes are available here: https://drive.google.com/drive/folders/1gzsOIuCrj7ydX2-XXULQ61KjtITipYb5?usp=sharing\n\nAfter downloading the data, you can run the evaluation using ```evaluate_bnvf.py```.\n\n## Training the embedding (optional)\nInstead of using the pretrained model provided, you can also train the local embedding yourself by running the following command:\n```\npython src/train.py model=fusion_pointnet_model dataset=fusion_pointnet_dataset model.voxel_size=0.01 model.min_pts_in_grid=8 model.train_ray_splits=1000 model.tcnn_config=$PWD/src/models/tcnn_config.json\n```\n\n## FAQ\n- **Do I have to have a rough mesh, as requested [here](https://github.com/likojack/bnv_fusion/blob/9178e8c36743d6bf9a7828087553d365f50a6d7f/src/datasets/fusion_inference_dataset.py#L253), when running with my own data?**\n\nNo. We only use the mesh to determine the dimensions of the scene to be reconstructed. 
You can manually set the boundary if you know the dimensions.\n\n- **How to set an appropriate voxel size?**\n\nThe reconstruction quality depends on the voxel size. If the voxel size is too small, there won't be enough points within each local region for the local embedding. If it is too large, the system fails to recover fine details. Therefore, we select the voxel size based on the number of 3D points in a voxel. After running the system, you will get statistics on the 3D points used in the local embedding (see [here](https://github.com/likojack/bnv_fusion/blob/9178e8c36743d6bf9a7828087553d365f50a6d7f/src/models/sparse_volume.py#L515)). Empirically, we found that a voxel size satisfying the following requirements gives better results: 1) the ```min``` is larger than 4, and 2) the ```mean``` is ideally larger than 8.  \n\n## Citation\nIf you find our code or paper useful, please cite\n```bibtex\n@inproceedings{li2022bnv,\n  author    = {Li, Kejie and Tang, Yansong and Prisacariu, Victor Adrian and Torr, Philip HS},\n  title     = {BNV-Fusion: Dense 3D Reconstruction using Bi-level Neural Volume Fusion},\n  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},\n  year      = {2022}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flikojack%2Fbnv_fusion","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flikojack%2Fbnv_fusion","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flikojack%2Fbnv_fusion/lists"}