{"id":13712563,"url":"https://github.com/mitsuba-renderer/enoki","last_synced_at":"2025-10-23T17:31:31.667Z","repository":{"id":44428112,"uuid":"78343485","full_name":"mitsuba-renderer/enoki","owner":"mitsuba-renderer","description":"Enoki: structured vectorization and differentiation on modern processor architectures","archived":false,"fork":false,"pushed_at":"2024-04-19T06:57:31.000Z","size":2659,"stargazers_count":1271,"open_issues_count":22,"forks_count":94,"subscribers_count":45,"default_branch":"master","last_synced_at":"2025-02-01T19:06:11.267Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mitsuba-renderer.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2017-01-08T13:11:23.000Z","updated_at":"2025-01-26T19:49:33.000Z","dependencies_parsed_at":"2024-04-19T07:46:16.892Z","dependency_job_id":"1da3edeb-f900-44b0-a39d-4635195e5321","html_url":"https://github.com/mitsuba-renderer/enoki","commit_stats":{"total_commits":632,"total_committers":14,"mean_commits":"45.142857142857146","dds":"0.10601265822784811","last_synced_commit":"2a18afa2402e0677887c8439fa3d6a270ea15726"},"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mitsuba-renderer%2Fenoki","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mitsuba-renderer%2Fenoki/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mitsuba-renderer%2Fenoki/r
eleases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mitsuba-renderer%2Fenoki/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mitsuba-renderer","download_url":"https://codeload.github.com/mitsuba-renderer/enoki/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":237863958,"owners_count":19378260,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-02T23:01:19.870Z","updated_at":"2025-10-23T17:31:25.859Z","avatar_url":"https://github.com/mitsuba-renderer.png","language":"C++","readme":"\u003cp align=\"center\"\u003e\u003cimg src=\"https://github.com/mitsuba-renderer/enoki/raw/master/docs/enoki-logo.png\" alt=\"Enoki logo\" width=\"300\"/\u003e\u003c/p\u003e\n\n# Enoki — structured vectorization and differentiation on modern processor architectures\n\n| Documentation   | Linux             | Windows             |\n|      :---:      |       :---:       |        :---:        |\n| [![docs][1]][2] | [![rgl-ci][3]][4] | [![appveyor][5]][6] |\n\n\n[1]: https://readthedocs.org/projects/enoki/badge/?version=master\n[2]: http://enoki.readthedocs.org/en/master\n[3]: https://rgl-ci.epfl.ch/app/rest/builds/buildType(id:Enoki_Build)/statusIcon.svg\n[4]: https://rgl-ci.epfl.ch/viewType.html?buildTypeId=Enoki_Build\u0026guest=1\n[5]: https://ci.appveyor.com/api/projects/status/68db7e5es7el1btd/branch/master?svg=true\n[6]: https://ci.appveyor.com/project/wjakob/enoki/branch/master\n\n# Archived project\n\nPlease be advised that Enoki is no longer being 
maintained. It is superseded by\n[Dr.Jit](https://github.com/mitsuba-renderer/drjit).\n\n## Introduction\n\n**Enoki** is a C++ template library that enables automatic transformations of\nnumerical code, for instance to create a \"wide\" vectorized variant of an\nalgorithm that runs on the CPU or GPU, or to compute gradients via transparent\nforward/reverse-mode automatic differentiation.\n\nThe core parts of the library are implemented as a set of header files with no\ndependencies other than a sufficiently C++17-capable compiler (GCC \u003e= 8.2,\nClang \u003e= 7.0, Visual Studio \u003e= 2017). Enoki code reduces to efficient SIMD\ninstructions available on modern CPUs and GPUs. In particular, Enoki supports:\n\n* **Intel**: AVX512, AVX2, AVX, and SSE4.2,\n* **ARM**: NEON/VFPV4 on armv7-a, Advanced SIMD on 64-bit armv8-a,\n* **NVIDIA**: CUDA via a *Parallel Thread Execution* (PTX) just-in-time compiler,\n* **Fallback**: a scalar fallback mode ensures that programs still run even\n  if none of the above are available.\n\nDeploying a program on top of Enoki usually serves three goals:\n\n1. Enoki ships with a convenient library of special functions and data\n   structures that facilitate implementation of numerical code (vectors,\n   matrices, complex numbers, quaternions, etc.).\n\n2. Programs built using these can be instantiated as *wide* versions that\n   process many arguments at once (either on the CPU or the GPU).\n\n   Enoki is also *structured* in the sense that it handles complex programs\n   with custom data structures, lambda functions, loadable modules, virtual\n   method calls, and many other modern C++ features.\n\n3. If derivatives are desired (e.g. 
for stochastic gradient descent), Enoki\n   performs transparent forward or reverse-mode automatic differentiation of\n   the entire program.\n\nFinally, Enoki can do all of the above simultaneously: if desired, it can\ncompile the same source code to multiple different implementations (e.g.\nscalar, AVX512, and CUDA+autodiff).\n\n### Motivation\n\nThe development of this library was prompted by the author's frustration\nwith the current vectorization landscape:\n\n1. Auto-vectorization in state-of-the-art compilers is inherently local. A\n   computation whose call graph spans separate compilation units (e.g. multiple\n   shared libraries) simply can't be vectorized.\n\n2. Data structures must be converted into a *Structure of Arrays* (SoA) layout\n   to be eligible for vectorization.\n\n   \u003cp align=\"center\"\u003e\n       \u003cimg src=\"https://github.com/mitsuba-renderer/enoki/raw/master/docs/intro-01.png\" alt=\"SoA layout\" width=\"400\"/\u003e\n   \u003c/p\u003e\n\n   This is analogous to performing a matrix transpose of an application's\n   entire memory layout, an intrusive change that is likely to touch almost\n   every line of code.\n\n3. Parts of the application likely have to be rewritten using [intrinsic\n   instructions](https://software.intel.com/sites/landingpage/IntrinsicsGuide),\n   which tends to look something like this:\n\n   \u003cp align=\"center\"\u003e\n       \u003cimg src=\"https://github.com/mitsuba-renderer/enoki/raw/master/docs/intro-02.png\" alt=\"intrinsics\" width=\"400\"/\u003e\n   \u003c/p\u003e\n\n\n   Intrinsics-heavy code is challenging to read and modify once written, and it\n   is inherently non-portable. CUDA provides a nice language environment\n   for programming GPUs but does nothing to help with the other requirements\n   (vectorization on CPUs, automatic differentiation).\n\n4. Vectorized transcendental functions (*exp*, *cos*, *erf*, ...) are not widely\n   available. 
Intel, AMD, and CUDA provide proprietary implementations, but many\n   compilers don't include them by default.\n\n5. It is desirable to retain both scalar and vector versions of an algorithm,\n   but ensuring their consistency throughout the development cycle becomes a\n   maintenance nightmare.\n\n6. *Domain-specific languages* (DSLs) for vectorization such as\n   [ISPC](https://ispc.github.io) address many of the above issues but assume\n   that the main computation underlying an application can be condensed into a\n   compact kernel that is implementable using the limited language subset of\n   the DSL (e.g. plain C in the case of ISPC).\n\n   This is not the case for complex applications, where the \"kernel\" may be\n   spread out over many separate modules involving high-level language features\n   such as functional or object-oriented programming.\n\n### What Enoki does differently\n\nEnoki addresses these issues and provides a *complete* solution for vectorizing\nand differentiating modern C++ applications with nontrivial control flow and\ndata structures, dynamic memory allocation, virtual method calls, and vector\ncalls across module boundaries. It has the following design goals:\n\n1. **Unobtrusive**. Only minor modifications are necessary to convert existing\n   C++ code into its Enoki-vectorized equivalent, which remains readable and\n   maintainable.\n\n2. **No code duplication**. It is generally desirable to provide both scalar\n   and vectorized versions of an API, e.g. for debugging, and to preserve\n   compatibility with legacy code. Enoki code extensively relies on class and\n   function templates to achieve this goal without any code duplication—the\n   same code template can be leveraged to create scalar, CPU SIMD, and GPU\n   implementations, and each variant can provide gradients via automatic\n   differentiation if desired.\n\n3. **Custom data structures**. Enoki can also vectorize custom data\n   structures. All the hard work (e.g. 
conversion to SoA format) is handled by\n   the C++17 type system.\n\n4. **Function calls**. Vectorized calls to functions in other compilation units\n   (e.g. a dynamically loaded plugin) are possible. Enoki can even vectorize\n   method or virtual method calls (e.g. ``instance-\u003emy_function(arg1, arg2,\n   ...);`` when ``instance`` turns out to be an array containing many different\n   instances).\n\n5. **Mathematical library**. Enoki includes an extensive mathematical support\n   library with complex numbers, matrices, quaternions, and related operations\n   (determinants, matrix inversion, etc.). A set of transcendental and special\n   functions supports real, complex, and quaternion-valued arguments in single\n   and double precision using polynomial or rational polynomial approximations,\n   generally with an average error of \u003c1/2 ULP on their full domain.\n   These include exponentials, logarithms, and trigonometric and hyperbolic\n   functions, as well as their inverses. Enoki also provides real-valued\n   versions of error function variants, Bessel functions, the Gamma function,\n   and various elliptic integrals.\n\n   \u003cp align=\"center\"\u003e\n       \u003cimg src=\"https://github.com/mitsuba-renderer/enoki/raw/master/docs/intro-03.png\" alt=\"Transcendentals\" width=\"720\"/\u003e\n   \u003c/p\u003e\n\n   Importantly, all of this functionality is realized using the abstractions of\n   Enoki, which means that it transparently composes with vectorization,\n   the JIT compiler for generating CUDA kernels, automatic differentiation, etc.\n\n6. **Portability**. When creating vectorized CPU code, Enoki supports arbitrary\n   array sizes that don't necessarily match what is supported by the underlying\n   hardware (e.g. 16 x single precision on a machine whose SSE vector unit only\n   supports 4 x single precision operands). 
The library uses\n   template metaprogramming techniques to efficiently map array expressions\n   onto the available hardware resources. This greatly simplifies development\n   because it's enough to write a single implementation of a numerical\n   algorithm that can then be deployed on any target architecture. There are\n   non-vectorized fallbacks for everything, so programs will run even on\n   unsupported architectures (albeit without the performance benefits of\n   vectorization).\n\n7. **Modular architecture**. Enoki is split into two major components: the\n   front-end provides various high-level array operations, while the back-end\n   provides the basic ingredients that are needed to realize these operations\n   using the SIMD instruction set(s) supported by the target architecture.\n\n   The CPU vector back-ends, for example, make heavy use of SIMD intrinsics to\n   ensure that compilers generate efficient machine code. The\n   intrinsics are contained in separate back-end header files (e.g.\n   ``array_avx.h`` for AVX intrinsics), which provide rudimentary\n   arithmetic and bit-level operations. Fancier operations (e.g.\n   *atan2*) use the back-ends as an abstract interface to the hardware,\n   which means that it's simple to support other instruction sets such\n   as a hypothetical future AVX1024 or even an entirely different\n   architecture (e.g. a DSP chip) by just adding a new back-end.\n\n8. **License**. 
Enoki is available under a non-viral open source license\n   (3-clause BSD).\n\n## Cloning\n\nEnoki depends on two other repositories\n([pybind11](https://github.com/pybind/pybind11) and\n[cub](https://nvlabs.github.io/cub)) that are required when using certain\noptional features, specifically differentiable GPU arrays with Python bindings.\n\nTo fetch the entire project including these dependencies, clone the project\nusing the ``--recursive`` flag as follows:\n\n```bash\n$ git clone --recursive https://github.com/mitsuba-renderer/enoki\n```\n\n## Documentation\n\nAn extensive set of tutorials and reference documentation are available at\n[readthedocs.org](http://enoki.readthedocs.org/en/master).\n\n## About\n\nThis project was created by [Wenzel Jakob](http://rgl.epfl.ch/people/wjakob).\nIt is named after [Enokitake](https://en.wikipedia.org/wiki/Enokitake), a type\nof mushroom with many long and parallel stalks reminiscent of data flow in\nvectorized arithmetic.\n\nEnoki is the numerical foundation of version 2 of the [Mitsuba\nrenderer](https://github.com/mitsuba-renderer/mitsuba2), though it is\nsignificantly more general and should be a trusty tool for a variety of\nsimulation and optimization problems.\n\nWhen using Enoki in academic projects, please cite\n\n```bibtex\n@misc{Enoki,\n   author = {Wenzel Jakob},\n   year = {2019},\n   note = {https://github.com/mitsuba-renderer/enoki},\n   title = {Enoki: structured vectorization and differentiation on modern processor architectures}\n}\n```\n","funding_links":[],"categories":["Linear Algebra / Statistics Toolkit","C++","Maths"],"sub_categories":["General Purpose Tensor Library"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmitsuba-renderer%2Fenoki","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmitsuba-renderer%2Fenoki","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmitsuba-renderer%2Fenoki/lists"}