{"id":13761492,"url":"https://github.com/google-research/multinerf","last_synced_at":"2025-10-29T10:31:12.404Z","repository":{"id":52076650,"uuid":"516904877","full_name":"google-research/multinerf","owner":"google-research","description":"A Code Release for Mip-NeRF 360, Ref-NeRF, and RawNeRF","archived":false,"fork":false,"pushed_at":"2023-12-08T20:14:39.000Z","size":121,"stargazers_count":3688,"open_issues_count":105,"forks_count":344,"subscribers_count":47,"default_branch":"main","last_synced_at":"2025-02-07T07:09:51.550Z","etag":null,"topics":["nerf","neural-radiance-fields"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/google-research.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2022-07-22T22:47:15.000Z","updated_at":"2025-02-06T18:46:02.000Z","dependencies_parsed_at":"2024-01-15T00:11:44.475Z","dependency_job_id":"aaa0fea3-63e4-40ac-bf79-0b04e2ce3eb4","html_url":"https://github.com/google-research/multinerf","commit_stats":{"total_commits":54,"total_committers":11,"mean_commits":4.909090909090909,"dds":"0.33333333333333337","last_synced_commit":"5b4d4f64608ec8077222c52fdf814d40acc10bc1"},"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/google-research%2Fmultinerf","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/google-research%2Fmultinerf/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/google-research%2Fmultinerf/releases","ma
nifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/google-research%2Fmultinerf/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/google-research","download_url":"https://codeload.github.com/google-research/multinerf/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":238805874,"owners_count":19533618,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["nerf","neural-radiance-fields"],"created_at":"2024-08-03T13:01:57.721Z","updated_at":"2025-10-29T10:31:07.085Z","avatar_url":"https://github.com/google-research.png","language":"Python","funding_links":[],"categories":["Python","NeRFs"],"sub_categories":[],"readme":"# MultiNeRF: A Code Release for Mip-NeRF 360, Ref-NeRF, and RawNeRF\n\n*This is not an officially supported Google product.*\n\nThis repository contains the code release for three CVPR 2022 papers:\n[Mip-NeRF 360](https://jonbarron.info/mipnerf360/),\n[Ref-NeRF](https://dorverbin.github.io/refnerf/), and\n[RawNeRF](https://bmild.github.io/rawnerf/).\nThis codebase was written by\nintegrating our internal implementations of Ref-NeRF and RawNeRF into our\nmip-NeRF 360 implementation. 
As such, this codebase should exactly\nreproduce the results shown in mip-NeRF 360, but may differ slightly when\nreproducing Ref-NeRF or RawNeRF results.\n\nThis implementation is written in [JAX](https://github.com/google/jax), and\nis a fork of [mip-NeRF](https://github.com/google/mipnerf).\nThis is research code, and should be treated accordingly.\n\n## Setup\n\n```\n# Clone the repo.\ngit clone https://github.com/google-research/multinerf.git\ncd multinerf\n\n# Make a conda environment.\nconda create --name multinerf python=3.9\nconda activate multinerf\n\n# Prepare pip.\nconda install pip\npip install --upgrade pip\n\n# Install requirements.\npip install -r requirements.txt\n\n# Manually install rmbrualla's `pycolmap` (don't use pip's! It's different).\ngit clone https://github.com/rmbrualla/pycolmap.git ./internal/pycolmap\n\n# Confirm that all the unit tests pass.\n./scripts/run_all_unit_tests.sh\n```\nYou'll probably also need to update your JAX installation to support GPUs or TPUs.\n\n## Running\n\nExample scripts for training, evaluating, and rendering can be found in\n`scripts/`. You'll need to change the paths to point to wherever the datasets\nare located. [Gin](https://github.com/google/gin-config) configuration files\nfor our model and some ablations can be found in `configs/`.\nAfter evaluating on the test set of each scene in one of the datasets, you can\nuse `scripts/generate_tables.ipynb` to produce error metrics across all scenes\nin the same format as was used in tables in the paper.\n\n### OOM errors\n\nYou may need to reduce the batch size (`Config.batch_size`) to avoid out of memory\nerrors. If you do this, but want to preserve quality, be sure to increase the number\nof training iterations and decrease the learning rate by whatever scale factor you\ndecrease batch size by.\n\n## Using your own data\n\nSummary: first, calculate poses. Second, train MultiNeRF. Third, render a result video from the trained NeRF model.\n\n1. 
Calculating poses (using COLMAP):\n```\nDATA_DIR=my_dataset_dir\nbash scripts/local_colmap_and_resize.sh ${DATA_DIR}\n```\n2. Training MultiNeRF:\n```\npython -m train \\\n  --gin_configs=configs/360.gin \\\n  --gin_bindings=\"Config.data_dir = '${DATA_DIR}'\" \\\n  --gin_bindings=\"Config.checkpoint_dir = '${DATA_DIR}/checkpoints'\" \\\n  --logtostderr\n```\n3. Rendering MultiNeRF:\n```\npython -m render \\\n  --gin_configs=configs/360.gin \\\n  --gin_bindings=\"Config.data_dir = '${DATA_DIR}'\" \\\n  --gin_bindings=\"Config.checkpoint_dir = '${DATA_DIR}/checkpoints'\" \\\n  --gin_bindings=\"Config.render_dir = '${DATA_DIR}/render'\" \\\n  --gin_bindings=\"Config.render_path = True\" \\\n  --gin_bindings=\"Config.render_path_frames = 480\" \\\n  --gin_bindings=\"Config.render_video_fps = 60\" \\\n  --logtostderr\n```\nYour output video should now exist in the directory `my_dataset_dir/render/`.\n\nSee below for more detailed instructions on either using COLMAP to calculate poses or writing your own dataset loader (if you already have pose data from another source, like SLAM or RealityCapture).\n\n### Running COLMAP to get camera poses\n\nIn order to run MultiNeRF on your own captured images of a scene, you must first run [COLMAP](https://colmap.github.io/install.html) to calculate camera poses. You can do this using our provided script `scripts/local_colmap_and_resize.sh`. Just make a directory `my_dataset_dir/` and copy your input images into a folder `my_dataset_dir/images/`, then run:\n```\nbash scripts/local_colmap_and_resize.sh my_dataset_dir\n```\nThis will run COLMAP and create 2x, 4x, and 8x downsampled versions of your images. These lower resolution images can be used in NeRF by setting, e.g., the `Config.factor = 4` gin flag.\n\nBy default, `local_colmap_and_resize.sh` uses the OPENCV camera model, which is a perspective pinhole camera with k1, k2 radial and t1, t2 tangential distortion coefficients. 
To switch to another COLMAP camera model, for example OPENCV_FISHEYE, you can run\n```\nbash scripts/local_colmap_and_resize.sh my_dataset_dir OPENCV_FISHEYE\n```\n\nIf you have a very large capture of more than around 500 images, we recommend switching from the exhaustive matcher to the vocabulary tree matcher in COLMAP (see the script for a commented-out example).\n\nOur script is simply a thin wrapper for COLMAP--if you have run COLMAP yourself, all you need to do to load your scene in NeRF is ensure it has the following format:\n```\nmy_dataset_dir/images/    \u003c--- all input images\nmy_dataset_dir/sparse/0/  \u003c--- COLMAP sparse reconstruction files (cameras, images, points)\n```\n\n### Writing a custom dataloader\n\nIf you already have poses for your own data, you may prefer to write your own custom dataloader.\n\nMultiNeRF includes a variety of dataloaders, all of which inherit from the\nbase\n[Dataset class](https://github.com/google-research/multinerf/blob/main/internal/datasets.py#L152).\n\nThe job of this class is to load all image and pose information from disk, then\ncreate batches of ray and color data for training or rendering a NeRF model.\n\nAny inherited subclass is responsible for loading images and camera poses from\ndisk by implementing the `_load_renderings` method (which is marked as\nabstract by the decorator `@abc.abstractmethod`). This data is then used to\ngenerate train and test batches of ray + color data for feeding through the NeRF\nmodel. 
The ray parameters are calculated in `_make_ray_batch`.\n\n#### Existing data loaders\n\nTo work from an example, you can see how this method is overridden in the\ndifferent dataloaders we have already implemented:\n\n-   [Blender](https://github.com/google-research/multinerf/blob/main/internal/datasets.py#L470)\n-   [DTU dataset](https://github.com/google-research/multinerf/blob/main/internal/datasets.py#L793)\n-   [Tanks and Temples](https://github.com/google-research/multinerf/blob/main/internal/datasets.py#L680),\n    as processed by the NeRF++ paper\n-   [Tanks and Temples](https://github.com/google-research/multinerf/blob/main/internal/datasets.py#L728),\n    as processed by the Free View Synthesis paper\n\nThe main data loader we rely on is\n[LLFF](https://github.com/google-research/multinerf/blob/main/internal/datasets.py#L526)\n(named for historical reasons), which is the loader for a dataset that has been\nposed by COLMAP.\n\n#### Making your own loader by implementing `_load_renderings`\n\nTo make a new dataset, make a class inheriting from `Dataset` and override the\n`_load_renderings` method:\n\n```\nclass MyNewDataset(Dataset):\n  def _load_renderings(self, config):\n    ...\n```\n\nIn this method, you **must** set the following public attributes:\n\n-   images\n-   camtoworlds\n-   pixtocams\n-   height, width\n\nMany of our dataset loaders also set other useful attributes, but these are the\ncritical ones for generating rays. You can see how they are used (along with a batch of pixel coordinates) to create rays in [`camera_utils.pixels_to_rays`](https://github.com/google-research/multinerf/blob/main/internal/camera_utils.py#L520).\n\n**Images**\n\n`images` = [N, height, width, 3] numpy array of RGB images. 
Currently we\nrequire all images to have the same resolution.\n\n**Extrinsic camera poses**\n\n`camtoworlds` = [N, 3, 4] numpy array of extrinsic pose matrices.\n`camtoworlds[i]` should be in **camera-to-world** format, such that we can run\n\n```\npose = camtoworlds[i]\nx_world = pose[:3, :3] @ x_camera + pose[:3, 3:4]\n```\n\nto convert a 3D camera space point `x_camera` into a world space point `x_world`.\n\nThese matrices must be stored in the **OpenGL** coordinate system convention for camera rotation:\nx-axis to the right, y-axis upward, and z-axis backward along the camera's focal\naxis.\n\nThe most common conventions are\n\n-   `[right, up, backwards]`: OpenGL, NeRF, most graphics code.\n-   `[right, down, forwards]`: OpenCV, COLMAP, most computer vision code.\n\nFortunately switching from OpenCV/COLMAP to NeRF is\n[simple](https://github.com/google-research/multinerf/blob/main/internal/datasets.py#L108):\nyou just need to right-multiply the OpenCV pose matrices by `np.diag([1, -1, -1, 1])`,\nwhich will flip the sign of the y-axis (from down to up) and z-axis (from\nforwards to backwards):\n```\ncamtoworlds_opengl = camtoworlds_opencv @ np.diag([1, -1, -1, 1])\n```\n\nYou may also want to **scale** your camera pose translations such that they all\nlie within the `[-1, 1]^3` cube for best performance with the default mipnerf360\nconfig files.\n\nWe provide a useful helper function [`camera_utils.transform_poses_pca`](https://github.com/google-research/multinerf/blob/main/internal/camera_utils.py#L191) that computes a translation/rotation/scaling transform for the input poses that aligns the world space x-y plane with the ground (based on PCA) and scales the scene so that all input pose positions lie within `[-1, 1]^3`. (This function is applied by default when loading mip-NeRF 360 scenes with the LLFF data loader.) 
For a scene where this transformation has been applied, [`camera_utils.generate_ellipse_path`](https://github.com/google-research/multinerf/blob/main/internal/camera_utils.py#L230) can be used to generate a nice elliptical camera path for rendering videos.\n\n**Intrinsic camera poses**\n\n`pixtocams` = [N, 3, 3] numpy array of inverse intrinsic matrices, OR [3, 3]\nnumpy array of a single shared inverse intrinsic matrix. These should be in\n**OpenCV** format, e.g.\n\n```\ncamtopix = np.array([\n  [focal,     0,  width/2],\n  [    0, focal, height/2],\n  [    0,     0,        1],\n])\npixtocam = np.linalg.inv(camtopix)\n```\n\nGiven a focal length and image size (and assuming a centered principal point),\nthis matrix can be created using\n[`camera_utils.get_pixtocam`](https://github.com/google-research/multinerf/blob/main/internal/camera_utils.py#L411).\n\nAlternatively, it can be created by using\n[`camera_utils.intrinsic_matrix`](https://github.com/google-research/multinerf/blob/main/internal/camera_utils.py#L398)\nand inverting the resulting matrix.\n\n**Resolution**\n\n`height` = int, height of images.\n\n`width` = int, width of images.\n\n**Distortion parameters (optional)**\n\n`distortion_params` = dict, camera lens distortion model parameters. This\ndictionary must map from strings -\u003e floats, and the allowed keys are `['k1',\n'k2', 'k3', 'k4', 'p1', 'p2']` (up to four radial coefficients and up to two\ntangential coefficients). By default, this is set to the empty dictionary `{}`,\nin which case undistortion is not run.\n\n### Details of the inner workings of Dataset\n\nThe public interface mimics the behavior of a standard machine learning pipeline\ndataset provider that can provide infinite batches of data to the\ntraining/testing pipelines without exposing any details of how the batches are\nloaded/created or how this is parallelized. 
Therefore, the initializer runs all\nsetup, including data loading from disk using `_load_renderings`, and begins\nthe thread using its parent start() method. After the initializer returns, the\ncaller can request batches of data straight away.\n\nThe internal `self._queue` is initialized as `queue.Queue(3)`, so the infinite\nloop in `run()` will block on the call `self._queue.put(self._next_fn())` once\nthere are 3 elements. The main thread training job runs in a loop that pops 1\nelement at a time off the front of the queue. The Dataset thread's `run()` loop\nwill populate the queue with 3 elements, then wait until a batch has been\nremoved and push one more onto the end.\n\nThis repeats indefinitely until the main thread's training loop completes\n(typically hundreds of thousands of iterations), then the main thread will exit\nand the Dataset thread will automatically be killed since it is a daemon.\n\n\n## Citation\nIf you use this software package, please cite whichever constituent paper(s)\nyou build upon, or feel free to cite this entire codebase as:\n\n```\n@misc{multinerf2022,\n      title={{MultiNeRF}: {A} {Code} {Release} for {Mip-NeRF} 360, {Ref-NeRF}, and {RawNeRF}},\n      author={Ben Mildenhall and Dor Verbin and Pratul P. Srinivasan and Peter Hedman and Ricardo Martin-Brualla and Jonathan T. Barron},\n      year={2022},\n      url={https://github.com/google-research/multinerf},\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgoogle-research%2Fmultinerf","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgoogle-research%2Fmultinerf","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgoogle-research%2Fmultinerf/lists"}