{"id":19984480,"url":"https://github.com/intellabs/equitriton","last_synced_at":"2025-10-28T11:33:40.231Z","repository":{"id":251121081,"uuid":"821565348","full_name":"IntelLabs/EquiTriton","owner":"IntelLabs","description":"EquiTriton is a project that seeks to implement high-performance kernels for commonly used building blocks in equivariant neural networks, enabling compute efficient training and inference.","archived":false,"fork":false,"pushed_at":"2024-11-11T22:15:56.000Z","size":561,"stargazers_count":32,"open_issues_count":6,"forks_count":4,"subscribers_count":3,"default_branch":"main","last_synced_at":"2024-11-11T23:23:24.323Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/IntelLabs.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-06-28T20:51:12.000Z","updated_at":"2024-11-11T22:16:01.000Z","dependencies_parsed_at":"2024-08-01T02:16:04.636Z","dependency_job_id":"0755a942-2b05-4ada-8312-d8604fba800b","html_url":"https://github.com/IntelLabs/EquiTriton","commit_stats":null,"previous_names":["intellabs/equitriton"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/IntelLabs%2FEquiTriton","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/IntelLabs%2FEquiTriton/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/IntelLabs%2FEquiTriton/releases","manifes
ts_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/IntelLabs%2FEquiTriton/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/IntelLabs","download_url":"https://codeload.github.com/IntelLabs/EquiTriton/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":224386177,"owners_count":17302612,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-13T04:19:08.702Z","updated_at":"2025-10-28T11:33:40.129Z","avatar_url":"https://github.com/IntelLabs.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# EquiTriton\n[![CodeQL](https://github.com/ossf/scorecard-action/actions/workflows/codeql-analysis.yml/badge.svg)](https://github.com/IntelLabs/EquiTriton/actions/workflows/codeql-analysis.yml)\n[![OpenSSF Scorecard](https://api.scorecard.dev/projects/github.com/IntelLabs/EquiTriton/badge)](https://scorecard.dev/viewer/?uri=github.com/IntelLabs/EquiTriton)\n\n\u003cdiv align=\"center\"\u003e\n\n[![pytorch](https://img.shields.io/badge/PyTorch-v2.1.0-red?logo=pytorch)](https://pytorch.org/get-started/locally/)\n[![License: 
Apache2.0](https://img.shields.io/badge/License-Apache-yellow.svg)](https://opensource.org/licenses/apache-2-0)\n![python-support](https://img.shields.io/badge/Python-3.10%7C3.11%7C3.12-3?logo=python)\n![triton](https://img.shields.io/badge/Triton-2.10-2?link=https%3A%2F%2Fgithub.com%2Fintel%2Fintel-xpu-backend-for-triton%2Freleases%2Ftag%2Fv2.1.0)\n[![paper](https://img.shields.io/badge/Paper-OpenReview-blue.svg)](https://openreview.net/forum?id=ftK00FO5wq)\n\n\n\u003c/div\u003e\n\n_Performant kernels for equivariant neural networks in Triton-lang_\n\n## Introduction\n\n_EquiTriton_ is a project that seeks to implement high-performance kernels\nfor commonly used building blocks in equivariant neural networks, enabling\ncompute efficient training and inference. The advantage of Triton-lang is\nportability across GPU architectures: kernels here have been tested against\nGPUs from multiple vendors, including A100/H100 from Nvidia, and the Intel®️\nData Center GPU Max Series 1550.\n\nOur current scope includes components such as spherical harmonics (including\nderivatives, up to $l=4$), and we intend to expand this set quickly. If you\nfeel that a particular set of kernels would be valuable, please feel free\nto submit an issue or pull request!\n\n\n## Getting Started\n\nFor users, run `pip install git+https://github.com/IntelLabs/EquiTriton`. 
For those who\nare using Intel XPUs, we recommend reading the section on Intel XPU usage first\nand setting up an environment with PyTorch, IPEX, and Triton for XPU before installing\n_EquiTriton_.\n\nFor developers/contributors, please clone this repository and install it in editable mode:\n\n```console\ngit clone https://github.com/IntelLabs/EquiTriton\ncd EquiTriton\npip install -e './[dev]'\n```\n\n...which will include development dependencies such as `pre-commit` (used for linting\nand formatting) and `jupyter` (used for symbolic differentiation during kernel development).\n\nFinally, we provide `Dockerfile`s for users who prefer containers.\n\n## Usage\n\nAs a drop-in replacement for `e3nn` spherical harmonics, simply include the\nfollowing in your code:\n\n```python\nfrom equitriton import patch\n```\n\nThis will dynamically replace the `e3nn` spherical harmonics implementation\nwith the _EquiTriton_ kernels.\n\nThere are two important things to consider before replacing:\n\n1. Numerically, there are small differences between implementations, primarily\nin the backward pass. Because terms in the gradients are implemented as literals,\nthey can be more susceptible to rounding errors at lower precision. In most\n(not all!) instances, they are numerically equivalent for `torch.float32`, and\nbasically _always_ different for `torch.float16`. At double precision (`torch.float64`)\nthis does not seem to be an issue, which makes it ideal for use in simulation loops. However,\nplease be aware that if it is used for training, the optimization trajectory may not\nbe exactly the same; we have not tested for divergence and encourage experimentation.\n2. Triton kernels are compiled just-in-time and cached each time a new input tensor\nshape is encountered. In `equitriton.sph_harm.SphericalHarmonics`, the `pad_tensor`\nargument (default is `True`) is used to try and maximize cache re-use by padding\nnodes and masking in the forward pass. 
The script `scripts/dynamic_shapes.py` will\nlet you test the performance over a range of shapes; we encourage you to test it\nbefore performing full-scale training/inference.\n\n## Decoupled spherical harmonics kernels\n\nWe recently published a paper at the AI4Mat workshop at NeurIPS 2024. As part\nof that work, we went back into ``sympy`` to refactor the spherical harmonics up to $l=10$,\nsuch that computations of a particular order are _independent_ from others. This allows\narbitrary orders to be freely composed without incurring a performance penalty, in\nthe case that one wishes to calculate $l=8$, but not $l=7$, for example.\n\nFunctionally, these kernels are intended to behave in the same way as their original\nimplementation, i.e. they still provide equivariant properties when used to map\nCartesian point clouds. However, because of the aggressive refactoring and heavy use\nof hard-coded literals, they may (or will) differ numerically from even the initial _EquiTriton_\nkernels, particularly at higher orders.\n\n\u003e [!IMPORTANT]\n\u003e For the above reason, while the kernels can be drop-in replacements, we do not recommend\n\u003e using them with already trained models, at least without some testing on the user's part,\n\u003e as the results may differ. We have also not yet attempted to use these kernels as part of\n\u003e simulation-based workflows (i.e. 
molecular dynamics), however our training experiments do\n\u003e show that training indeed does converge.\n\nTo use the new set of decoupled kernels, the main `torch.autograd` binding is through\nthe `equitriton.sph_harm.direct.TritonSphericalHarmonic`:\n\n```python\nimport torch\nfrom equitriton.sph_harm.direct import TritonSphericalHarmonic\n\ncoords = torch.rand(100, 3)\nsph_harm = TritonSphericalHarmonic.apply(\n  l_values=[0, 1, 2, 6, 10],\n  coords=coords\n)\n```\n\nThe improvements to performance are expected to come from (1) decoupling of each spherical\nharmonic order, and (2) pre-allocation of an output tensor as to avoid using `torch.cat`,\nwhich calculates each order followed by copying. See the \"Direct spherical harmonics evaluation\"\nnotebook in the notebooks folder for derivation.\n\n### Development and usage on Intel XPU\n\nDevelopment on Intel XPUs such as the Data Center GPU Max Series 1550 requires\na number of manual components for bare metal. The core dependency to consider\nis the [Intel XPU backend for Triton][triton-git], which will dictate the version\nof oneAPI, PyTorch, and Intel Extension for PyTorch to install. At the time\nof release, _EquiTriton_ has been developed on the following:\n\n- oneAPI 2024.0\n- PyTorch 2.1.0\n- IPEX 2.1.10+xpu\n- Intel XPU backend for Triton [2.1.0](https://github.com/intel/intel-xpu-backend-for-triton/releases/tag/v2.1.0)\n\nDue to the way that wheels are distributed, please install PyTorch\nand IPEX per `intel-requirements.txt`. 
Alternatively, use the provided\nDocker image for development.\n\n```python\n\u003e\u003e\u003e import intel_extension_for_pytorch\n\u003e\u003e\u003e import torch\n\u003e\u003e\u003e torch.xpu.device_count()\n# should be greater than zero\n```\n[triton-git]: https://github.com/intel/intel-xpu-backend-for-triton/releases/tag/v2.1.0\n\n## Useful commands for Intel GPUs\n\n- `xpu-smi` (might not be installed), as the name suggests, is the equivalent of `nvidia-smi`,\nbut with a bit more functionality based on our architecture\n- `sycl-ls` is provided by the `dpcpp` runtime, and lists out all devices that are OpenCL\nand SYCL capable. Notably this can be used to quickly check how many GPUs are available.\n- [pti-gpu](https://github.com/intel/pti-gpu) provides a set of tools that you can compile for profiling. Notably,\n`unitrace` and `oneprof` allow you to do low-level profiling of the device.\n\n\nContributing\n------------\n\nWe welcome contributions from the open-source community! If you have any\nquestions or suggestions, feel free to create an issue in our\nrepository. We will be happy to work with you to make this project even\nbetter.\n\nLicense\n-------\n\nThe code and documentation in this repository are licensed under the Apache 2.0\nlicense. 
By contributing to this project, you agree that your\ncontributions will be licensed under this license.\n\nCitation\n--------\nIf you find this repo useful, please consider citing the respective papers.\n\nFor the original EquiTriton implementation, please use/read the following citation:\n\n```bibtex\n@inproceedings{lee2024scaling,\n    title={Scaling Computational Performance of Spherical Harmonics Kernels with Triton},\n    author={Kin Long Kelvin Lee and Mikhail Galkin and Santiago Miret},\n    booktitle={AI for Accelerated Materials Design - Vienna 2024},\n    year={2024},\n    url={https://openreview.net/forum?id=ftK00FO5wq}\n}\n```\n\nFor the refactored spherical harmonics up to $l=10$, and subsequent PHATE embedding analysis, see:\n\n```bibtex\n@inproceedings{lee2024deconstructing,\n    title={Deconstructing equivariant representations in molecular systems},\n    author={Kin Long Kelvin Lee and Mikhail Galkin and Santiago Miret},\n    booktitle={AI for Accelerated Materials Design - NeurIPS 2024},\n    year={2024},\n    url={https://openreview.net/forum?id=pshyLoyzRn}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fintellabs%2Fequitriton","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fintellabs%2Fequitriton","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fintellabs%2Fequitriton/lists"}