{"id":13474507,"url":"https://tobias-kirschstein.github.io/nersemble/","last_synced_at":"2025-03-26T21:31:40.283Z","repository":{"id":161800764,"uuid":"635732533","full_name":"tobias-kirschstein/nersemble","owner":"tobias-kirschstein","description":"[Siggraph '23] NeRSemble: Neural Radiance Field Reconstruction of Human Heads","archived":false,"fork":false,"pushed_at":"2025-03-25T11:26:43.000Z","size":142654,"stargazers_count":222,"open_issues_count":5,"forks_count":11,"subscribers_count":9,"default_branch":"master","last_synced_at":"2025-03-25T12:29:59.778Z","etag":null,"topics":["3d-deep-learning","3d-face-reconstruction","avatars","digital-humans","dynamic-nerf","nerf","neural-fields","novel-view-synthesis","siggraph2023"],"latest_commit_sha":null,"homepage":"https://tobias-kirschstein.github.io/nersemble/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/tobias-kirschstein.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-05-03T10:42:57.000Z","updated_at":"2025-03-21T05:09:30.000Z","dependencies_parsed_at":"2024-01-13T18:24:26.616Z","dependency_job_id":"a645a46d-d1d1-4ab6-ad8d-0f53ceedc8b5","html_url":"https://github.com/tobias-kirschstein/nersemble","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tobias-kirschstein%2Fnersemble","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tobias-kirschstein%2Fnersemble/tags","releases_url":"https://repos.ecosyste.ms/api/v1/host
s/GitHub/repositories/tobias-kirschstein%2Fnersemble/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tobias-kirschstein%2Fnersemble/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/tobias-kirschstein","download_url":"https://codeload.github.com/tobias-kirschstein/nersemble/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":245738688,"owners_count":20664328,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["3d-deep-learning","3d-face-reconstruction","avatars","digital-humans","dynamic-nerf","nerf","neural-fields","novel-view-synthesis","siggraph2023"],"created_at":"2024-07-31T16:01:12.820Z","updated_at":"2025-03-26T21:31:38.577Z","avatar_url":"https://github.com/tobias-kirschstein.png","language":"Python","funding_links":[],"categories":["Uncategorized"],"sub_categories":["Uncategorized"],"readme":"# NeRSemble: Multi-view Radiance Field Reconstruction of Human Heads\n\n[Paper](https://arxiv.org/pdf/2305.03027.pdf) | [Video](https://youtu.be/a-OAWqBzldU) | [Project Page](https://tobias-kirschstein.github.io/nersemble/)\n\n![](static/nersemble_teaser.gif)\n\n[Tobias Kirschstein](https://tobias-kirschstein.github.io/), [Shenhan Qian](https://shenhanqian.github.io), [Simon Giebenhain](https://simongiebenhain.github.io/), [Tim Walter](https://www.linkedin.com/in/tim-walter-7203aa20b/?originalSubdomain=de) and [Matthias Nießner](https://niessnerlab.org/)  \n**Siggraph 2023**\n\n# 1. Installation\n### 1.1. 
Dependencies\n- PyTorch 2.0\n- nerfstudio\n- tinycudann\n\n\n 1. Set up the environment:\n    ```\n    conda env create -f environment.yml\n    conda activate nersemble\n    ```\n    This creates a new conda environment `nersemble` (installation may take a while).\n\n\n 2. Manually install `tinycudann`:\n    ```\n    pip install git+https://github.com/NVlabs/tiny-cuda-nn/#subdirectory=bindings/torch\n    ```\n    (This also helps if you get an error like `ImportError: DLL load failed while importing _86_C: The specified procedure could not be found.` later on.)\n\n\n 3. Install the `nersemble` package itself by running \n    ```shell\n    pip install -e .\n    ```\n    inside the cloned repository folder.\n    \n### 1.2. Environment Paths\n\nAll paths to data / models / renderings are defined by _environment variables_.  \nPlease create the file `~/.config/nersemble/.env` in your home directory with the following content:\n```shell\nNERSEMBLE_DATA_PATH=\"...\"\nNERSEMBLE_MODELS_PATH=\"...\"\nNERSEMBLE_RENDERS_PATH=\"...\"\n```\nReplace the `...` with the locations where data / models / renderings should be located on your machine.\n - `NERSEMBLE_DATA_PATH`: Location of the multi-view video dataset (see [section 2](#2-dataset) for how to obtain the dataset)\n - `NERSEMBLE_MODELS_PATH`: During training, model checkpoints and configs will be saved here\n - `NERSEMBLE_RENDERS_PATH`: Video renderings of trained models will be stored here\n\nIf you prefer not to create a config file in your home directory, you can instead hard-code the paths in [env.py](src/nersemble/env.py).\n\n### 1.3. 
Troubleshooting\n\nYou may run into this error at the beginning of training:\n```shell\n\\lib\\site-packages\\torch\\include\\pybind11\\cast.h(624): error: too few arguments for template template parameter \"Tuple\"\n          detected during instantiation of class \"pybind11::detail::tuple_caster\u003cTuple, Ts...\u003e [with Tuple=std::pair, Ts=\u003cT1, T2\u003e]\"\n(721): here\n\n\\lib\\site-packages\\torch\\include\\pybind11\\cast.h(717): error: too few arguments for template template parameter \"Tuple\"\n          detected during instantiation of class \"pybind11::detail::tuple_caster\u003cTuple, Ts...\u003e [with Tuple=std::pair, Ts=\u003cT1, T2\u003e]\"\n(721): here\n```\nThis occurs during compilation of `torch_efficient_distloss` and can be solved by either training without \ndistortion loss (`--lambda_dist_loss 0`) or by changing one line in the `torch_efficient_distloss` library (see [https://github.com/sunset1995/torch_efficient_distloss/issues/8](https://github.com/sunset1995/torch_efficient_distloss/issues/8)).\n\n# 2. Dataset\n\nAccess to the dataset can be requested [here](https://forms.gle/rYRoGNh2ed51TDWX9).  \nTo reproduce the experiments from the paper, only download the `nersemble_XXX_YYY.zip` files (there are 10 in total, one for each of the 10 sequences), as well as the `camera_params.zip`.\nExtract these .zip files into `NERSEMBLE_DATA_PATH`.  \nAlso, see [src/nersemble/data_manager/multi_view_data.py](src/nersemble/data_manager/multi_view_data.py) for an explanation of the folder layout.\n\n# 3. Usage\n\n### 3.1. Training\n\n```shell\npython scripts/train/train_nersemble.py $ID $SEQUENCE_NAME --name $NAME\n```\n\nwhere `$ID` is the id of the participant in the dataset (e.g., `030`) and `$SEQUENCE_NAME` is the name of the expression / emotion / sentence (e.g., `EXP-2-eyes`).\n`$NAME` may optionally be used to annotate the checkpoint folder and the wandb experiment with some descriptive experiment name. 
\n\nThe training script will place model checkpoints and configuration in `${NERSEMBLE_MODELS_PATH}/nersemble/NERS-XXX-${NAME}/`. The incremental run id `XXX` will be automatically determined.\n\n#### GPU Requirements\nTraining takes roughly 1 day and requires at least an RTX A6000 GPU (**48GB VRAM**). GPU memory requirements may be lowered by tweaking some of these hyperparameters:\n - `--max_n_samples_per_batch`: Restricts how many ray samples are fed through the model at once (default 20 for 2^20 samples)\n - `--n_hash_encodings`: Number of hash encodings in the ensemble (default 32). Using 16 should give comparable quality (`--latent_dim_time` needs to be set to the same value)\n - `--cone_angle`: Use larger steps between ray samples for points that are further away. The default value of `0` (no step size increase) provides the best quality. Try values up to `0.004`\n - `--n_train_rays`: Number of rays per batch (default 4096). Lower values can affect convergence\n - `--mlp_num_layers` / `--mlp_layer_width`: Making the deformation field smaller should still provide reasonable performance\n\n#### RAM Requirements\nBy default, the training script will cache loaded images in RAM, which can cause RAM usage of up to 200 GB. 
RAM usage can be lowered by:\n - `--max_cached_images` (default 10k): Set to `0` to completely disable caching\n\n#### Special config for sequences 97 and 124\n\nWe disable the occupancy grid acceleration structure from Instant NGP as well as the use of distortion loss due to complex hair motion in **sequence 97**:\n```shell\npython scripts/train/train_nersemble.py 97 HAIR --name $NAME --disable_occupancy_grid --lambda_dist_loss 0\n```\n\nWe only train on a subset of **sequence 124** (timesteps 95-570) and slightly prolong the warmup phase due to the complexity of the sequence:\n```shell\npython scripts/train/train_nersemble.py 124 FREE --name $NAME --start_timestep 95 --n_timesteps 475 --window_hash_encodings_begin 50000 --window_hash_encodings_end 100000\n```\n\n### 3.2. Evaluation\n\nIn the paper, all experiments are conducted by training on only 12 cameras and evaluating rendered images on 4 hold-out views (cameras `222200040`, `220700191`, `222200043` and `221501007`).\n\n - For obtaining the reported **PSNR**, **SSIM** and **LPIPS** metrics (evaluated at 15 evenly spaced timesteps):\n    ```shell\n    python scripts/evaluate/evaluate_nersemble.py NERS-XXX\n    ```\n    where `NERS-XXX` is the run name obtained from running the training script above.\n\n - For obtaining the **JOD video metric** (evaluated at 24fps, takes much longer):\n    ```shell\n    python scripts/evaluate/evaluate_nersemble.py NERS-XXX --skip_timesteps 3 --max_eval_timesteps -1\n    ```\n\nThe evaluation results will be printed in the terminal and persisted as a `.json` file in the model folder `${NERSEMBLE_MODELS_PATH}/NERS-XXX-${NAME}/evaluation`. \n\n### 3.3. Rendering\nFrom a trained model `NERS-XXX`, a circular trajectory (4s) may be rendered via:\n```shell\npython scripts/render/render_nersemble.py NERS-XXX\n```\nThe resulting `.mp4` file is stored in `NERSEMBLE_RENDERS_PATH`.\n\n# 4. 
Trained Models\n\nWe provide one trained NeRSemble for each of the 10 sequences used in the paper:\n\n| Participant ID | Sequence                  | Model                                                                            |\n|----------------|---------------------------|----------------------------------------------------------------------------------|\n| 18             | EMO-1-shout+laugh         | [NERS-9018](https://nextcloud.tobias-kirschstein.de/index.php/s/gQoLTHjQkNNHN2j) |\n| 30             | EXP-2-eyes                | [NERS-9030](https://nextcloud.tobias-kirschstein.de/index.php/s/gQoLTHjQkNNHN2j) |\n| 38             | EXP-1-head                | [NERS-9038](https://nextcloud.tobias-kirschstein.de/index.php/s/gQoLTHjQkNNHN2j) |\n| 85             | SEN-01-port_strong_smokey | [NERS-9085](https://nextcloud.tobias-kirschstein.de/index.php/s/gQoLTHjQkNNHN2j) |\n| 97             | HAIR                      | [NERS-9097](https://nextcloud.tobias-kirschstein.de/index.php/s/gQoLTHjQkNNHN2j) |\n| 124            | FREE                      | [NERS-9124](https://nextcloud.tobias-kirschstein.de/index.php/s/gQoLTHjQkNNHN2j) |\n| 175            | EXP-6-tongue-1            | [NERS-9175](https://nextcloud.tobias-kirschstein.de/index.php/s/gQoLTHjQkNNHN2j) |\n| 226            | EXP-3-cheeks+nose         | [NERS-9226](https://nextcloud.tobias-kirschstein.de/index.php/s/gQoLTHjQkNNHN2j) |\n| 227            | EXP-5-mouth               | [NERS-9227](https://nextcloud.tobias-kirschstein.de/index.php/s/gQoLTHjQkNNHN2j) |\n| 240            | EXP-4-lips                | [NERS-9240](https://nextcloud.tobias-kirschstein.de/index.php/s/gQoLTHjQkNNHN2j) |\n\nSimply put the downloaded model folders into `${NERSEMBLE_MODELS_PATH}/nersemble`.  \nYou can then use the `evaluate_nersemble.py` and `render_nersemble.py` scripts to obtain renderings or reproduce the official metrics below. \n\n# 5. 
Official metrics\n\nMetrics averaged over all 10 sequences from the NVS benchmark (same 10 sequences as in the paper):\n\n| Model     | PSNR  | SSIM  | LPIPS | JOD  |\n|-----------|-------|-------|-------|------|\n| NeRSemble | 31.48 | 0.872 | 0.217 | 7.85 |\n\nNote the following:\n - The metrics are slightly different from the paper due to the newer version of nerfstudio used in this repository\n - PSNR, SSIM and LPIPS are computed on only 15 evenly spaced timesteps (to make comparisons cheaper)\n - JOD is computed on every 3rd timestep (using ` --skip_timesteps 3 --max_eval_timesteps -1`)\n - Metrics for sequence 97 were computed with `--no_use_occupancy_grid_filtering`\n\n\u003chr\u003e\n\nIf you find our code, dataset or paper useful, please consider citing\n```bibtex\n@article{kirschstein2023nersemble,\n    author = {Kirschstein, Tobias and Qian, Shenhan and Giebenhain, Simon and Walter, Tim and Nie\\ss{}ner, Matthias},\n    title = {NeRSemble: Multi-View Radiance Field Reconstruction of Human Heads},\n    year = {2023},\n    issue_date = {August 2023},\n    publisher = {Association for Computing Machinery},\n    address = {New York, NY, USA},\n    volume = {42},\n    number = {4},\n    issn = {0730-0301},\n    url = {https://doi.org/10.1145/3592455},\n    doi = {10.1145/3592455},\n    journal = {ACM Trans. Graph.},\n    month = {jul},\n    articleno = {161},\n    numpages = {14},\n}\n```\n\nContact [Tobias Kirschstein](mailto:tobias.kirschstein@tum.de) for questions, comments and reporting bugs, or open a GitHub issue.","project_url":"https://awesome.ecosyste.ms/api/v1/projects/tobias-kirschstein.github.io%2Fnersemble%2F","html_url":"https://awesome.ecosyste.ms/projects/tobias-kirschstein.github.io%2Fnersemble%2F","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/tobias-kirschstein.github.io%2Fnersemble%2F/lists"}