{"id":9688923,"url":"https://github.com/matajoh/fourier_feature_nets","last_synced_at":"2025-10-28T16:18:52.476Z","repository":{"id":38894888,"uuid":"409644364","full_name":"matajoh/fourier_feature_nets","owner":"matajoh","description":"Supplemental learning materials for \"Fourier Feature Networks and Neural Volume Rendering\"","archived":false,"fork":false,"pushed_at":"2023-05-12T12:27:59.000Z","size":6463,"stargazers_count":173,"open_issues_count":0,"forks_count":24,"subscribers_count":7,"default_branch":"main","last_synced_at":"2024-12-10T00:42:05.633Z","etag":null,"topics":["computer-graphics","computer-vision","deep-learning","pytorch"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/matajoh.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-09-23T15:21:46.000Z","updated_at":"2024-12-02T02:53:59.000Z","dependencies_parsed_at":"2024-08-30T15:31:24.542Z","dependency_job_id":"f738dbc8-6ec4-497f-95ca-920e3bf90d35","html_url":"https://github.com/matajoh/fourier_feature_nets","commit_stats":null,"previous_names":[],"tags_count":3,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/matajoh%2Ffourier_feature_nets","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/matajoh%2Ffourier_feature_nets/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/matajoh%2Ffourier_feature_nets/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositorie
s/matajoh%2Ffourier_feature_nets/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/matajoh","download_url":"https://codeload.github.com/matajoh/fourier_feature_nets/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":230423559,"owners_count":18223435,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["computer-graphics","computer-vision","deep-learning","pytorch"],"created_at":"2024-05-15T06:13:38.323Z","updated_at":"2025-10-28T16:18:47.423Z","avatar_url":"https://github.com/matajoh.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"# Fourier Feature Networks and Neural Volume Rendering\n\nThis repository is a companion to a lecture given at the University of\nCambridge Engineering Department, which is available for\nviewing [here](https://youtu.be/cXoaCw796Do).\nIn it you will find the code to reproduce all\nof the visualizations and experiments shared in the lecture, as well as a\n[Jupyter Notebook](lecture_notes.ipynb) providing interactive lecture notes\ncovering the following topics:\n\n1. 1D Signal Reconstruction\n2. 2D Image Regression\n3. Volume Raycasting\n4. 3D Volume Rendering with NeRF\n\n# Getting Started\n\nIn this section I will outline how to run the various experiments. 
Before I\nbegin, it is worth noting that while the defaults are all reasonable and will\nproduce the results you see in the lecture, it can be very educational to\nplay around with different hyperparameter values and observe the results.\n\nIn order to run the various experiments, you will first need to install\nthe requirements for the repository, ideally in a virtual environment. We\nrecommend using a version of Python \u003e= 3.7. As this code heavily relies upon\nPyTorch, you should install the correct version for your platform. The guide\n[here](https://pytorch.org/get-started/locally/)\nis very useful and I suggest you follow it closely. Once that is done,\nyou can run the following:\n\n```\npip install wheel\npip install -r requirements.txt\n```\n\nYou should now be ready to run any of the experiment scripts in this\nrepository.\n\n# Fourier Feature Networks\n\nThis repository contains implementations of the research presented\nin [Fourier Features Let Networks Learn High Frequency Functions in Low Dimensional Domains](https://bmild.github.io/fourfeat/)\nand [NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis](https://www.matthewtancik.com/nerf).\nThose who use this code should be sure to cite them, and to also take a look\nat our own work in this space,\n[FastNeRF: High-Fidelity Neural Rendering at 200FPS](https://microsoft.github.io/FastNeRF/).\n\nFourier Feature Networks address the inherent problems with teaching neural nets\nto model complex signals from low frequency information. They do this by\nintroducing Fourier features as a preprocessing step, used to encode the\nlow-frequency inputs in such a way as to introduce higher-frequency information\nas seen below for the 1D case:\n\n![1D Fourier Feature Network](docs/fourier_feature_1d_diagram.png)\n\nUltimately the Fourier features replace the featurizer, or kernel, that the\nneural net would otherwise need to learn. 
As shown above, Fourier Feature\nNetworks can be used to predict a 1-D signal\nfrom a single floating point value indicating time. They can also be used to\npredict image pixel values from their position and, most intriguingly,\npredict color and opacity from 3D position and view direction,\n*i.e.* to model a radiance field.  The ability to do that allows the\ncreation of rendered neural avatars, like the one below:\n\nhttps://user-images.githubusercontent.com/6894931/142743800-737ae051-c605-4ced-8f99-cbab5b426e7c.mp4\n\nIt also allows the creation of neurally rendered objects which have believable\nmaterial properties and view-dependent effects.\n\nThe code contained in this repository is intended for use as supplemental\nlearning materials to the lecture. The [Lecture Notes](lecture_notes.ipynb) in\nparticular will provide a walkthrough of the technical content. This README\nis focused more on how to run these scripts to reproduce experimental results\nand/or run your own experiments using this code.\n\n# Data\n\nAs in the lecture, you can access any of a variety of datasets for use in\nrunning these (or your own) experiments:\n\n## 1D Datasets\n\nThe [`SignalDataset`](nerf/signal_dataset.py) class can take any function\nmapping a single input to a single output. Feel free to experiment.\nHere is an example of how to create one:\n\n```python\ndef _multifreq(x):\n    return np.sin(x) + 0.5*np.sin(2*x) - 0.2*np.cos(5*x) + 2\n\nnum_samples = 32\nsample_rate = 8\ndataset = ffn.SignalDataset.create(_multifreq, num_samples, sample_rate)\n```\n\n## 2D Datasets\n\nThe [`PixelDataset`](nerf/pixel_dataset.py) class can take any path to an\nimage. Create one like this:\n\n```python\ndataset = ffn.PixelDataset.create(path_to_image_file, color_space=\"RGB\",\n                                   size=512)\n```\n\n## 3D Datasets\n\nThis is where the library becomes a bit picky about input data. 
The\n[`ImageDataset`](nerf/image_dataset.py) supports a set format for data,\nand we provide several datasets in this format to play\nwith. These datasets are not stored in the repo, but the library will\nautomatically download them to the `data` folder when you first request them,\nwhich you can do like so:\n\n```python\ndataset = ffn.ImageDataset.load(\"antinous_400.npz\", split=\"train\", num_samples=64)\n```\n\nWe recommend you use one of the following (all datasets are provided in 400 and 800 versions):\n\n| Name         |  Image Size | # Train | # Val | # Test | Description | Sample image |\n|--------------|-------------|---------|-------|--------|-------------|--------------|\n| `antinous_(size)` | (size)x(size) | 100 | 7 | 13 | Renders of a [sculpture](https://sketchfab.com/3d-models/antinous-12aad55d55e1480da4811c3a4aa42f5f) kindly provided by the Fitzwilliam Museum. Does not include view-dependent effects.| ![Antinous](docs/antinous.jpg)|\n| `rubik_(size)` | (size)x(size) | 100 | 7 | 13 | This work is based on \"Rubik's Cube\" (https://sketchfab.com/3d-models/rubiks-cube-d7d8aa83720246c782bca30dbadebb98) by BeyondDigital (https://sketchfab.com/BeyondDigital) licensed under CC-BY-4.0 (http://creativecommons.org/licenses/by/4.0/). Does not include view-dependent effects. | ![Rubik](docs/rubik.jpg) |\n| `lego_(size)` | (size)x(size) | 100 | 7 | 13 | Physically based renders of a lego build, provided by the NeRF authors. | ![Lego](docs/lego.jpg) |\n| `trex_(size)` | (size)x(size) | 100 | 7 | 13 | This work is based on \"The revenge of the traditional toys\" (https://sketchfab.com/3d-models/the-revenge-of-the-traditional-toys-d2dd1ee7948343308cd732c665ef1337) by Bastien Genbrugge (https://sketchfab.com/bastienBGR) licensed under CC-BY-4.0 (http://creativecommons.org/licenses/by/4.0/). Rendered with PBR and thus includes multiple view-dependent effects. 
| ![T-Rex](docs/trex.jpg) |\n| `benin_(size)` | (size)x(1.5 *size) | 74 | 10 | 0 | Free moving, hand-held photographs of a bronze statue of a rooster from Benin, kindly provided by Jesus College, Cambridge. | ![Benin](docs/benin.jpg) |\n| `matthew_(size)` | (size)x(size) | 26 | 5 | 0 | Photographs of me, taken by a 31 camera fixed rig. | ![Matthew](docs/matthew.jpg)\n\nIf you want to bring your own data, the format we support is an NPZ with the\nfollowing tensors:\n\n| Name         |     Shape    |  dtype  | description |\n|--------------|:------------:|:-------:|-------------|\n| images       | (C, D, D, 4) |  uint8  | Tensor of camera images with RGBA pixel values. Alpha value indicates a mask around the object (where appropriate).\n| intrinsics   |   (C, 3, 3)  | float32 | Tensor of camera intrinsics (i.e. projection) matrices\n| extrinsics   |   (C, 4, 4)  | float32 | Tensor of camera extrinsics (i.e. camera to world) matrices\n| bounds       |    (4, 4)    | float32 | Rigid transform indicating the bounds of the volume to be rendered. Will be used to transform a unit cube.\n| split_counts |      (3)     |  int32  | Number of cameras (in order) for train, val and test data splits.\n\nwhere `C` is the number of cameras and `D` is the image resolution. You may find\nit helpful to use the provided datasets as a reference.\n\n# Experiments\n\nThese experiments form the basis of the results that you may have already\nseen in the lecture. With a sufficiently powerful GPU (or access to one\nin Azure or another cloud service) you should be able to reproduce all the\nanimations and videos you have seen. 
In this section, I will provide a brief\nguide to how to use the different scripts that you will find in the root\ndirectory of the repo.\n\n## 1D Signal Regression\n\nThe 1D Signal Regression script can be invoked like so:\n\n    python train_signal_regression.py multifreq outputs/multifreq\n\nYou should see a window pop up that looks like the image below:\n\n![1D Signal Training](docs/multifreq.png)\n\n## 2D Image Regression\n\nTo get started with 2D Image Regression, run the following command:\n\n    python train_image_regression.py cat.jpg mlp outputs/cat_mlp\n\nA window should pop up as the system trains that looks like this:\n\n![Image Regression](docs/image_regression.jpg)\n\nAt the end it will show you the result, which, as you will have come to\nexpect from the lecture, is severely lacking in detail due to the lack\nof high-frequency gradients. Try running the same script with\n`positional` or `gaussian` in place of `mlp` to see how using\nFourier features dramatically improves the quality. Your results should\nlook like what you see below:\n\nhttps://user-images.githubusercontent.com/6894931/142743854-fafe15ef-e445-4b36-a02e-1096192e09fb.mp4\n\nFeel free to pass the script your own images and see what happens!\n\n## Ray Sampling\n\nAs a preparation for working with volume rendering, it can be useful to get a\nfeel for the training data. If you run:\n\n    python test_ray_sampling.py lego_400.npz lego_400_rays.html\n\nThis should download the dataset into the `data` directory and then create\na scenepic showing what the ray sampling data looks like. Notice how the rays\npass from the camera through the pixels and into the volume. Try running\nthis script again with `--stratified` to see what happens when we add some\nuniform noise to the samples. 
Here is an example of what this can look like:\n\nhttps://user-images.githubusercontent.com/6894931/142744680-f808c0b8-6313-4dcf-aa39-9ebd32ed52df.mp4\n\n## Voxel-based Volume Rendering\n\nJust like in the lecture, we'll start with voxel-based rendering. If you run\nthe following command:\n\n    python train_voxels.py lego_400.npz 128 outputs/lego_400_vox128\n\nYou should be able to train a voxel representation of a radiance field.\n\n\u003e **Note**\n\u003e You may have trouble running this script (and the ones that follow) if\n\u003e your computer does not have a GPU with enough memory. See\n\u003e [Running on Azure ML](#running-on-azure-ml) for information on how to run\n\u003e these experiments in the cloud.\n\nIf you look in the `train` and `val` folders in the output\ndirectory you can see images produced during training showing how\nthe model improves over time. There is also a visualization of the\nmodel provided in the `voxels.html` scenepic. Here is an example\nof an image produced by the [Ray Caster](nerf/ray_caster.py):\n\n![Raycaster Training Image](docs/raycaster_training.png)\n\nAll of the 3D methods will produce these images when in default training\nmode. They show (in row major order): rendered image, depth, training/val image,\nand per-pixel error. You can also ask the script to make a video of the training\nprocess. For example, if you run this script:\n\n     python train_voxels.py lego_400.npz 128 outputs/lego_400_vox128 --make-video\n\nIt will produce the frames of the following video:\n\nhttps://user-images.githubusercontent.com/6894931/142744837-382e13b1-d1cf-4305-870a-b64763c73e54.mp4\n\nAnother way to visualize what the model has learned is to produce a\nvoxelization of the model. This is different from the voxel-based volume\nrendering, in which multiple voxels contribute to a single sample. 
Rather, it\nis a sparse octree containing voxels at the places the model has determined are\nsolid, thus providing a rough sense of how the model is producing the rendered\nimages. You can produce a scenepic showing this via the following command:\n\n    python voxelize_model.py outputs/lego_400_vox128/voxels.pt lego_400.npz lego_400_voxels.html\n\nThis will work for any of the volumetric rendering models.\n\n## Tiny NeRF\n\nThe first neural rendering technique we looked at was so-called \"Tiny\" NeRF, in\nwhich the view direction is not incorporated and we focus only on the 3D\nposition within the volume. You can train Tiny NeRF models using the following\ncommand:\n\n    python train_tiny_nerf.py lego_400.npz mlp outputs/lego_400_mlp/\n\nSubstitute `positional` or `gaussian` for `mlp`, as before, to try out different\nmodes of Fourier encoding. You'll notice again the same low-resolution results for\nMLP and similarly improved results when Fourier features are introduced. Here\nis a side-by-side comparison of `mlp` and `positional` training for our\ndatasets (top row is nearest training image to the orbit camera).\nYour results should be similar.\n\nhttps://user-images.githubusercontent.com/6894931/143578326-e786db33-f7e8-4ced-9a21-f9be80611088.mp4\n\n# NeRF\n\nIn the results above you possibly noticed that specularities and transparency\nwere not quite right. This is because those effects require the incorporation\nof the *view direction*, that is, where the camera is located in relation to\nthe position. NeRF introduces this via a novel structure in the fairly simple\nmodel we've used so far:\n\n![NeRF Diagram](docs/nerf_diagram.png)\n\nFirst, the model is deeper, allowing it to encode more information about the\nradiance field (note the skip connection to address signal attenuation with\ndepth). However, the key structural difference is that the\nray direction is added before the final layer. 
A subtle but important point\nis that the opacity is predicted without the view direction, to encourage\nstructural consistency.\n\nThe other major difference from what has come before is that NeRF samples\nthe volume in a different way. The technique performs two tiers of sampling.\nFirst, it samples a *coarse* network, which determines which parts of the space\nare opaque, and then it uses that to create a second set of samples which\nare used to train a *fine* network. For the purpose of this lecture, we do\nsomething very similar in spirit, which is to use the voxel model we trained\nabove as the `coarse` model. You can see how this changes the sampling of the\nvolume by running the `test_ray_sampling.py` script again:\n\n    python test_ray_sampling.py lego_400.npz lego_400_fine.html --opacity-model lego_400_vox128.pt\n\nYou should now be able to see how additional samples are clustering near\nthe location of the model, as opposed to being only evenly distributed over\nthe volume. This helps the NeRF model to learn detail. Try passing in\n`--stratified` again to see the effects for random sampling as well. The video\nbelow displays the results of different kinds of sampling, but you should\nexplore it for yourself as well:\n\nhttps://user-images.githubusercontent.com/6894931/142744739-94e2fd8e-aff9-473c-aa63-5533ef9b0f92.mp4\n\n\u003e **Note**\n\u003e The Tiny NeRF model can also take advantage of fine sampling using an\n\u003e opacity model. Try it out!\n\nYou can train the NeRF model with the following command:\n\n    python train_nerf.py lego_400.npz outputs/lego_400_nerf --opacity-model lego_400_vox128.pt\n\nWhile this model can train for many more steps than 50000 and continue to\nimprove, you should already be able to see the increase in quality over the\nother models from adding in view direction. 
Here are some sample render orbits\nfrom the NeRF model:\n\nhttps://user-images.githubusercontent.com/6894931/142744753-cd155af5-f247-4854-b19b-32471eee80a8.mp4\n\nhttps://user-images.githubusercontent.com/6894931/142744750-638ceed9-5158-49b3-be9f-3e301730407d.mp4\n\nhttps://user-images.githubusercontent.com/6894931/143578140-65860c14-1b5d-458b-b2bb-7d814a3ef1e2.mp4\n\nYou can produce these orbit videos yourself by calling, for example:\n\n    python orbit_video.py antinous_800_nerf.pt 800 outputs/antinous_render --opacity-model antinous_800_vox128.pt\n\nGive it a try! That's it for the main experimental scripts. All of them have\ndescriptive help statements, so be sure to explore your options and see what\nyou can learn.\n\n# Running on Azure ML\n\nIt is outside of the scope of this lecture (or repository) to describe in detail\nhow to get access to cloud computing resources for machine learning via\nAzure ML. However, there are some amazing resources out there already.\nFor the purpose of this repository, all you need to do is complete\n[this Quickstart Tutorial](https://docs.microsoft.com/en-gb/azure/machine-learning/quickstart-create-resources)\nand download the `config.json` associated with your workspace into the root\nof the repository. You can then run any of the training scripts in Azure ML\nusing the `submit_aml_run.py` script, like so:\n\n    python submit_aml_run.py cat \u003ccompute\u003e train_image_regression.py \"cat.jpg mlp outputs\"\n\nWhere `cat` is the experiment name (you can choose anything here) that will\ngroup different runs together, and where you replace `\u003ccompute\u003e` with the\nname of the compute target you want to use to run the experiment (which\nwill need to have a GPU available). Finally you provide the script name\n(in this case, `train_image_regression.py`, which I suggest you use while you\nare getting your workspace up and running) and the arguments to the script as\na string. 
If you get an error, make certain you've run:\n\n    pip install -r requirements-azureml.txt\n\nIf everything is working, you should receive a link that lets you monitor\nthe experiment and view the output images and results in your browser.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmatajoh%2Ffourier_feature_nets","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmatajoh%2Ffourier_feature_nets","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmatajoh%2Ffourier_feature_nets/lists"}