https://github.com/vita-group/gnt-move
[ICCV2023] "Enhancing NeRF akin to Enhancing LLMs: Generalizable NeRF Transformer with Mixture-of-View-Experts" by Wenyan Cong, Hanxue Liang, Peihao Wang, Zhiwen Fan, Tianlong Chen, Mukund Varma, Yi Wang, Zhangyang Wang
- Host: GitHub
- URL: https://github.com/vita-group/gnt-move
- Owner: VITA-Group
- Created: 2023-08-24T14:40:34.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2023-08-25T05:49:56.000Z (over 1 year ago)
- Last Synced: 2024-01-10T08:25:58.236Z (about 1 year ago)
- Topics: 3d, generalizable-nerf, nerf
- Language: Python
- Homepage:
- Size: 40.7 MB
- Stars: 36
- Watchers: 13
- Forks: 3
- Open Issues: 2
Metadata Files:
- Readme: README.md
README
# Enhancing NeRF akin to Enhancing LLMs: Generalizable NeRF Transformer with Mixture-of-View-Experts [ArXiv](https://arxiv.org/abs/2308.11793)
[Wenyan Cong]()<sup>1*</sup>,
[Hanxue Liang]()<sup>2,1*</sup>,
[Peihao Wang](https://peihaowang.github.io/)<sup>1</sup>,
[Zhiwen Fan]()<sup>1</sup>,
[Tianlong Chen](https://tianlong-chen.github.io/)<sup>1</sup>,
[Mukund Varma T](https://mukundvarmat.github.io/)<sup>3,1</sup>,
[Yi Wang]()<sup>1</sup>,
[Zhangyang Wang](https://vita-group.github.io/)<sup>1</sup>

<sup>1</sup>University of Texas at Austin, <sup>2</sup>University of Cambridge, <sup>3</sup>Indian Institute of Technology Madras

<sup>*</sup> denotes equal contribution.
This repository builds on GNT's [official repository](https://github.com/VITA-Group/GNT).
## Introduction
Cross-scene generalizable NeRF models, which can directly synthesize novel views of unseen scenes, have become a new spotlight of the NeRF field. Several existing attempts rely on increasingly end-to-end "neuralized" architectures, i.e., replacing scene representation and/or rendering modules with performant neural networks such as transformers, and turning novel view synthesis into a feedforward inference pipeline. Since those feedforward "neuralized" architectures still do not fit diverse scenes well out of the box, we propose to bridge them with the powerful Mixture-of-Experts (MoE) idea from large language models (LLMs), which has demonstrated superior generalization ability by balancing between larger overall model capacity and flexible per-instance specialization. Starting from a recent generalizable NeRF architecture called GNT, we first demonstrate that MoE can be neatly plugged in to enhance the model. We further customize a shared permanent expert and a geometry-aware consistency loss to enforce cross-scene consistency and spatial smoothness, respectively, which are essential for generalizable view synthesis. Our proposed model, dubbed GNT with Mixture-of-View-Experts (GNT-MOVE), has experimentally shown state-of-the-art results when transferring to unseen scenes, indicating remarkably better cross-scene generalization in both zero-shot and few-shot settings.
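The core MoE idea described above can be illustrated with a minimal, self-contained sketch. This is not the paper's implementation: the layer sizes, top-1 routing, and gate-weighted combination below are illustrative assumptions, but the structure mirrors the abstract's description of routed view experts plus an always-active shared permanent expert.

```python
# Illustrative MoE sketch (not the paper's code): top-1 routed experts
# plus a shared "permanent" expert that every token always passes through.
import numpy as np

rng = np.random.default_rng(0)

D, E = 16, 4                                          # feature dim, routed experts
experts = [rng.standard_normal((D, D)) / np.sqrt(D) for _ in range(E)]
permanent = rng.standard_normal((D, D)) / np.sqrt(D)  # shared across scenes
gate_w = rng.standard_normal((D, E)) / np.sqrt(D)     # routing weights

def moe_layer(x):
    """x: (N, D) tokens -> (N, D); top-1 routed expert + permanent expert."""
    logits = x @ gate_w                          # (N, E) routing scores
    probs = np.exp(logits - logits.max(axis=1, keepdims=True))
    probs /= probs.sum(axis=1, keepdims=True)    # softmax over experts
    top = probs.argmax(axis=1)                   # top-1 expert per token
    out = np.empty_like(x)
    for i, e in enumerate(top):
        # routed expert output, scaled by its gate probability, plus the
        # always-on permanent expert that promotes cross-scene consistency
        out[i] = probs[i, e] * (x[i] @ experts[e]) + x[i] @ permanent
    return out

tokens = rng.standard_normal((8, D))
print(moe_layer(tokens).shape)  # (8, 16)
```

The permanent expert is the key departure from a vanilla MoE layer: it is applied to every token regardless of routing, so scenes that activate different routed experts still share one common pathway.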
## Installation
Clone this repository:
```bash
git clone https://github.com/VITA-Group/GNT-MOVE.git
cd GNT-MOVE/
```

The code is tested with Python 3.8, CUDA 11.1, and PyTorch 1.10.1. Additional dependencies include:
```bash
torchvision
ConfigArgParse
imageio
matplotlib
numpy
opencv_contrib_python
Pillow
scipy
imageio-ffmpeg
lpips
scikit-image
```

## Datasets
We reuse the training and evaluation datasets from [IBRNet](https://github.com/googleinterns/IBRNet). All datasets must be downloaded to a directory `data/` within the project folder and must follow the organization below.
```bash
├──data/
├──ibrnet_collected_1/
├──ibrnet_collected_2/
├──real_iconic_noface/
├──spaces_dataset/
├──RealEstate10K-subset/
├──google_scanned_objects/
├──nerf_synthetic/
├──nerf_llff_data/
```
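Before training, it can help to confirm the layout above is in place. The helper below is hypothetical (not part of this repository); it simply reports which of the expected dataset folders are missing under `data/`.

```python
# Hypothetical helper (not in the repo): check the expected data/ layout.
import os

EXPECTED = [
    "ibrnet_collected_1", "ibrnet_collected_2", "real_iconic_noface",
    "spaces_dataset", "RealEstate10K-subset", "google_scanned_objects",
    "nerf_synthetic", "nerf_llff_data",
]

def missing_datasets(root="data"):
    """Return the expected dataset directories not present under `root`."""
    return [d for d in EXPECTED if not os.path.isdir(os.path.join(root, d))]

print(missing_datasets())  # [] once all datasets are downloaded
```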
Please refer to [IBRNet's](https://github.com/googleinterns/IBRNet) repository for downloading and preparing the data. For convenience, we consolidate the instructions below:
```bash
mkdir data
cd data/

# IBRNet captures
gdown https://drive.google.com/uc?id=1rkzl3ecL3H0Xxf5WTyc2Swv30RIyr1R_
unzip ibrnet_collected.zip

# LLFF
gdown https://drive.google.com/uc?id=1ThgjloNt58ZdnEuiCeRf9tATJ-HI0b01
unzip real_iconic_noface.zip

## [IMPORTANT] remove scenes that appear in the test set
cd real_iconic_noface/
rm -rf data2_fernvlsb data2_hugetrike data2_trexsanta data3_orchid data5_leafscene data5_lotr data5_redflower
cd ../

# Spaces dataset
git clone https://github.com/augmentedperception/spaces_dataset

# RealEstate10K
## make sure to install ffmpeg - sudo apt-get install ffmpeg
git clone https://github.com/qianqianwang68/RealEstate10K_Downloader
cd RealEstate10K_Downloader
python3 generate_dataset.py train
cd ../

# Google Scanned Objects
gdown https://drive.google.com/uc?id=1w1Cs0yztH6kE3JIz7mdggvPGCwIKkVi2
unzip google_scanned_objects_renderings.zip

# Blender dataset
gdown https://drive.google.com/uc?id=18JxhpWD-4ZmuFKLzKlAw-w5PpzZxXOcG
unzip nerf_synthetic.zip

# LLFF dataset (eval)
gdown https://drive.google.com/uc?id=16VnMcF1KJYxN9QId6TClMsZRahHNMW5g
unzip nerf_llff_data.zip
```

## Usage
## Cite this work
If you find our work or code implementation useful for your own research, please cite our paper.
```bibtex
@inproceedings{
gntmove2023,
title={Enhancing Ne{RF} akin to Enhancing {LLM}s: Generalizable Ne{RF} Transformer with Mixture-of-View-Experts},
author={Wenyan Cong and Hanxue Liang and Peihao Wang and Zhiwen Fan and Tianlong Chen and Mukund Varma and Yi Wang and Zhangyang Wang},
booktitle={ICCV},
year={2023}
}
```