{"id":14958175,"url":"https://github.com/helmholtz-analytics/heat","last_synced_at":"2025-05-15T12:06:22.153Z","repository":{"id":33059394,"uuid":"133808899","full_name":"helmholtz-analytics/heat","owner":"helmholtz-analytics","description":"Distributed tensors and Machine Learning framework with GPU and MPI acceleration in Python","archived":false,"fork":false,"pushed_at":"2025-05-09T12:08:42.000Z","size":22068,"stargazers_count":221,"open_issues_count":105,"forks_count":54,"subscribers_count":6,"default_branch":"main","last_synced_at":"2025-05-09T13:26:21.916Z","etag":null,"topics":["array-api","data-analytics","data-processing","data-science","distributed","gpu","hpc","machine-learning","massive-datasets","mpi","mpi4py","multi-gpu","multi-node-cluster","numpy","parallelism","python","pytorch","tensors"],"latest_commit_sha":null,"homepage":"https://heat.readthedocs.io/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/helmholtz-analytics.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"contributing.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":"CITATION.cff","codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2018-05-17T12:16:27.000Z","updated_at":"2025-05-09T08:18:21.000Z","dependencies_parsed_at":"2023-01-16T22:45:19.578Z","dependency_job_id":"493f7176-3b74-4ba0-94b5-084fc5860710","html_url":"https://github.com/helmholtz-analytics/heat","commit_stats":{"total_commits":5244,"total_committers":56,"mean_commits":93.64285714285714,"dds":0.7725019069412662,"last_synced_commit":"87f2812d427fbd86a75ee5512e227b05c86baaf9"},"previous_names":[],"tags_count":29,"template":f
alse,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/helmholtz-analytics%2Fheat","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/helmholtz-analytics%2Fheat/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/helmholtz-analytics%2Fheat/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/helmholtz-analytics%2Fheat/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/helmholtz-analytics","download_url":"https://codeload.github.com/helmholtz-analytics/heat/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253258833,"owners_count":21879721,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["array-api","data-analytics","data-processing","data-science","distributed","gpu","hpc","machine-learning","massive-datasets","mpi","mpi4py","multi-gpu","multi-node-cluster","numpy","parallelism","python","pytorch","tensors"],"created_at":"2024-09-24T13:16:24.977Z","updated_at":"2025-05-15T12:06:17.120Z","avatar_url":"https://github.com/helmholtz-analytics.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cdiv align=\"center\"\u003e\n  \u003cimg src=\"https://raw.githubusercontent.com/helmholtz-analytics/heat/main/doc/images/logo.png\"\u003e\n\u003c/div\u003e\n\n---\n\nHeat is a distributed tensor framework for high performance data analytics.\n\n# Project Status\n\n[![CPU/CUDA/ROCm 
tests](https://codebase.helmholtz.cloud/helmholtz-analytics/ci/badges/heat/base/pipeline.svg)](https://codebase.helmholtz.cloud/helmholtz-analytics/ci/-/commits/heat/base)\n[![Documentation Status](https://readthedocs.org/projects/heat/badge/?version=latest)](https://heat.readthedocs.io/en/latest/?badge=latest)\n[![coverage](https://codecov.io/gh/helmholtz-analytics/heat/branch/main/graph/badge.svg)](https://codecov.io/gh/helmholtz-analytics/heat)\n[![license: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](https://opensource.org/licenses/MIT)\n[![PyPI Version](https://img.shields.io/pypi/v/heat)](https://pypi.org/project/heat/)\n[![Downloads](https://pepy.tech/badge/heat)](https://pepy.tech/project/heat)\n[![Anaconda-Server Badge](https://anaconda.org/conda-forge/heat/badges/version.svg)](https://anaconda.org/conda-forge/heat)\n[![fair-software.eu](https://img.shields.io/badge/fair--software.eu-%E2%97%8F%20%20%E2%97%8F%20%20%E2%97%8F%20%20%E2%97%8F%20%20%E2%97%8F-green)](https://fair-software.eu)\n[![OpenSSF Scorecard](https://api.securityscorecards.dev/projects/github.com/helmholtz-analytics/heat/badge)](https://securityscorecards.dev/viewer/?uri=github.com/helmholtz-analytics/heat)\n[![OpenSSF Best Practices](https://bestpractices.coreinfrastructure.org/projects/7688/badge)](https://bestpractices.coreinfrastructure.org/projects/7688)\n[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.2531472.svg)](https://doi.org/10.5281/zenodo.2531472)\n[![Benchmarks](https://img.shields.io/badge/Grafana-Benchmarks-2ea44f)](https://57bc8d92-72f2-4869-accd-435ec06365cb.ka.bw-cloud-instance.org:3000/d/adjpqduq9r7k0a/heat-cb?orgId=1)\n[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)\n[![JuRSE Code Pick of the Month](https://img.shields.io/badge/JuRSE_Code_Pick-August_2024-blue)](https://www.fz-juelich.de/en/rse/jurse-community/jurse-code-of-the-month/august-2024)\n\n# Table of Contents\n  - [What is Heat 
for?](#what-is-heat-for)\n  - [Features](#features)\n  - [Getting Started](#getting-started)\n  - [Installation](#installation)\n    - [Requirements](#requirements)\n    - [pip](#pip)\n    - [conda](#conda)\n  - [Support Channels](#support-channels)\n  - [Contribution guidelines](#contribution-guidelines)\n    - [Resources](#resources)\n  - [License](#license)\n  - [Citing Heat](#citing-heat)\n  - [FAQ](#faq)\n  - [Acknowledgements](#acknowledgements)\n\n\n# What is Heat for?\n\nHeat builds on [PyTorch](https://pytorch.org/) and [mpi4py](https://mpi4py.readthedocs.io) to provide high-performance computing infrastructure for memory-intensive applications within the NumPy/SciPy ecosystem.\n\n\nWith Heat you can:\n- port existing NumPy/SciPy code from single-CPU to multi-node clusters with minimal coding effort;\n- exploit the entire, cumulative RAM of your many nodes for memory-intensive operations and algorithms;\n- run your NumPy/SciPy code on GPUs (CUDA, ROCm, coming up: Apple MPS).\n\nFor an example that highlights the benefits of multi-node parallelism and hardware acceleration, and how easily both can be achieved with Heat, see our [blog post on truncated SVD of a 200GB data set](https://helmholtz-analytics.github.io/heat/2023/06/16/new-feature-hsvd.html).\n\nCheck out our [coverage tables](coverage_tables.md) to see which NumPy, SciPy, and scikit-learn functions are already supported.\n\nIf you need functionality that is not yet supported:\n  - [search existing issues](https://github.com/helmholtz-analytics/heat/issues) and make sure to leave a comment if someone else already requested it;\n  - [open a new issue](https://github.com/helmholtz-analytics/heat/issues/new/choose).\n\n\nCheck out our [features](#features) and the [Heat API Reference](https://heat.readthedocs.io/en/latest/autoapi/index.html) for a complete list of functionalities.\n\n# Features\n\n* High-performance n-dimensional arrays\n* CPU, GPU, and distributed computation using MPI\n* 
Powerful data analytics and machine learning methods\n* Seamless integration with the NumPy/SciPy ecosystem\n* Python array API (work in progress)\n\n\n# Getting Started\n\nGo to [Quick Start](quick_start.md) for a quick overview. For more details, see [Installation](#installation).\n\n**You can test your setup** by running the [`heat_test.py`](https://github.com/helmholtz-analytics/heat/blob/main/scripts/heat_test.py) script:\n\n```shell\nmpirun -n 2 python heat_test.py\n```\n\nIt should print something like this:\n\n```shell\nx is distributed:  True\nGlobal DNDarray x:  DNDarray([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=ht.int32, device=cpu:0, split=0)\nGlobal DNDarray x:\nLocal torch tensor on rank  0 :  tensor([0, 1, 2, 3, 4], dtype=torch.int32)\nLocal torch tensor on rank  1 :  tensor([5, 6, 7, 8, 9], dtype=torch.int32)\n```\n\nCheck out our Jupyter Notebook [**Tutorials**](https://github.com/helmholtz-analytics/heat/blob/main/tutorials/), choose `local` to try things out on your machine, or `hpc` if you have access to an HPC system.\n\nThe complete documentation of the latest version is always deployed on\n[Read the Docs](https://heat.readthedocs.io/).\n\n\n\u003c!-- # Goals\n\nHeat is a flexible and seamless open-source software for high performance data\nanalytics and machine learning. It provides highly optimized algorithms and data structures for tensor computations using CPUs, GPUs, and distributed cluster systems on top of MPI. The goal of Heat is to fill the gap between single-node data analytics and machine learning libraries, and  high-performance computing (HPC). 
Heat's interface integrates seamlessly with the existing data science ecosystem and makes  writing scalable\nscientific and data science applications as effortless as using NumPy.\n\nHeat allows you to tackle your actual Big Data challenges that go beyond the\ncomputational and memory needs of your laptop and desktop.\n --\u003e\n# Installation\n\n## Requirements\n\n### Basics\n- python \u003e= 3.9\n- MPI (OpenMPI, MPICH, Intel MPI, etc.)\n- mpi4py \u003e= 3.0.0\n- pytorch \u003e= 2.0.0\n\n### Parallel I/O\n- h5py\n- netCDF4\n\n### GPU support\nIn order to do computations on your GPU(s):\n- your CUDA or ROCm installation must match your hardware and its drivers;\n- your [PyTorch installation](https://pytorch.org/get-started/locally/) must be compiled with CUDA/ROCm support.\n\n### HPC systems\nOn most HPC systems you will not be able to install/compile MPI or CUDA/ROCm yourself. Instead, you will most likely need to load a pre-installed MPI and/or CUDA/ROCm module from the module system. You may even find PyTorch, h5py, or mpi4py as (part of) such a module. Note that for optimal performance on GPU, you need to use an MPI library that has been compiled with CUDA/ROCm support (e.g., so-called \"CUDA-aware MPI\").\n\n\n## pip\nInstall the latest version with\n\n```bash\npip install heat[hdf5,netcdf]\n```\nwhere the part in brackets is a list of optional dependencies. You can omit\nit if you do not need HDF5 or NetCDF support.\n\n## conda\n\nThe conda build includes all dependencies **including OpenMPI**.\n```bash\n conda install -c conda-forge heat\n ```\n\n# Support Channels\n\nGo ahead and ask questions on [GitHub Discussions](https://github.com/helmholtz-analytics/heat/discussions). If you find a bug or are missing a feature, please file a new [issue](https://github.com/helmholtz-analytics/heat/issues/new/choose). 
You can also get in touch with us on [Mattermost](https://mattermost.hzdr.de/signup_user_complete/?id=3sixwk9okpbzpjyfrhen5jpqfo) (sign up with your GitHub credentials). Once you log in, you can introduce yourself on the `Town Square` channel.\n\n\n# Contribution guidelines\n\n**We welcome contributions from the community! If you want to contribute to Heat, be sure to review the [Contribution Guidelines](contributing.md) and [Resources](#resources) before getting started.**\n\nWe use [GitHub issues](https://github.com/helmholtz-analytics/heat/issues) for tracking requests and bugs; please see [Discussions](https://github.com/helmholtz-analytics/heat/discussions) for general questions and discussion. You can also get in touch with us on [Mattermost](https://mattermost.hzdr.de/signup_user_complete/?id=3sixwk9okpbzpjyfrhen5jpqfo) (sign up with your GitHub credentials). Once you log in, you can introduce yourself on the `Town Square` channel.\n\nIf you’re unsure where to start or how your skills fit in, reach out! 
You can ask us here on GitHub, by leaving a comment on a relevant issue that is already open.\n\n**If you are new to contributing to open source, [this guide](https://opensource.guide/how-to-contribute/) helps explain why, what, and how to get involved.**\n\n\n## Resources\n\n* [Heat Tutorials](https://github.com/helmholtz-analytics/heat/tree/main/tutorials)\n* [Heat API Reference](https://heat.readthedocs.io/en/latest/autoapi/index.html)\n\n### Parallel Computing and MPI:\n\n* David Henty's [course](https://www.archer2.ac.uk/training/courses/200514-mpi/)\n* Wes Kendall's [Tutorials](https://mpitutorial.com/tutorials/)\n* Rolf Rabenseifner's [MPI course material](https://www.hlrs.de/training/self-study-materials/mpi-course-material) (including C, Fortran **and** Python via `mpi4py`)\n\n### mpi4py\n\n* [mpi4py docs](https://mpi4py.readthedocs.io/en/stable/tutorial.html)\n* [Tutorial](https://www.kth.se/blogs/pdc/2019/08/parallel-programming-in-python-mpi4py-part-1/)\n# License\n\nHeat is distributed under the MIT license, see our\n[LICENSE](LICENSE) file.\n\n# Citing Heat\n\n\u003c!-- If you find Heat helpful for your research, please mention it in your publications. You can cite: --\u003e\n\nPlease do mention Heat in your publications if it helped your research. You can cite:\n\n* Götz, M., Debus, C., Coquelin, D., Krajsek, K., Comito, C., Knechtges, P., Hagemeier, B., Tarnawa, M., Hanselmann, S., Siggel, S., Basermann, A. \u0026 Streit, A. (2020). HeAT - a Distributed and GPU-accelerated Tensor Framework for Data Analytics. In 2020 IEEE International Conference on Big Data (Big Data) (pp. 276-287). 
IEEE, DOI: 10.1109/BigData50022.2020.9378050.\n\n```\n@inproceedings{heat2020,\n    title={{HeAT -- a Distributed and GPU-accelerated Tensor Framework for Data Analytics}},\n    author={\n      Markus Götz and\n      Charlotte Debus and\n      Daniel Coquelin and\n      Kai Krajsek and\n      Claudia Comito and\n      Philipp Knechtges and\n      Björn Hagemeier and\n      Michael Tarnawa and\n      Simon Hanselmann and\n      Martin Siggel and\n      Achim Basermann and\n      Achim Streit\n    },\n    booktitle={2020 IEEE International Conference on Big Data (Big Data)},\n    year={2020},\n    pages={276-287},\n    month={December},\n    publisher={IEEE},\n    doi={10.1109/BigData50022.2020.9378050}\n}\n```\n# FAQ\nWork in progress...\n\n  \u003c!-- - Users\n  - Developers\n  - Students\n  - system administrators --\u003e\n\n## Acknowledgements\n\n*This work is supported by the [Helmholtz Association Initiative and\nNetworking Fund](https://www.helmholtz.de/en/about_us/the_association/initiating_and_networking/)\nunder project number ZT-I-0003 and the Helmholtz AI platform grant.*\n\n*This project has received funding from Google Summer of Code (GSoC) in 2022.*\n\n*This work is partially carried out under a [programme](https://activities.esa.int/index.php/4000144045) of, and funded by, the European Space Agency.\nAny view expressed in this repository or related publications can in no way be taken to reflect the official opinion of the European Space Agency.*\n\n---\n\n\u003cdiv align=\"center\"\u003e\n  \u003ca href=\"https://www.dlr.de/EN/Home/home_node.html\"\u003e\u003cimg src=\"https://raw.githubusercontent.com/helmholtz-analytics/heat/main/doc/images/dlr_logo.svg\" height=\"50px\" hspace=\"3%\" vspace=\"20px\"\u003e\u003c/a\u003e\u003ca href=\"https://www.fz-juelich.de/portal/EN/Home/home_node.html\"\u003e\u003cimg src=\"https://raw.githubusercontent.com/helmholtz-analytics/heat/main/doc/images/fzj_logo.svg\" height=\"40px\" hspace=\"3%\" 
vspace=\"20px\"\u003e\u003c/a\u003e\u003ca href=\"http://www.kit.edu/english/index.php\"\u003e\u003cimg src=\"https://raw.githubusercontent.com/helmholtz-analytics/heat/main/doc/images/kit_logo.svg\" height=\"40px\" hspace=\"3%\" vspace=\"5px\"\u003e\u003c/a\u003e\u003ca href=\"https://www.helmholtz.de/en/\"\u003e\u003cimg src=\"https://raw.githubusercontent.com/helmholtz-analytics/heat/main/doc/images/helmholtz_logo.svg\" height=\"50px\" hspace=\"3%\" vspace=\"5px\"\u003e\u003c/a\u003e\u003ca href=\"https://www.esa.int/\"\u003e\u003cimg src=\"https://github.com/user-attachments/assets/2ee251b4-733e-44ea-8d1c-8b75928eef55\" height=\"45px\" hspace=\"3%\" vspace=\"20px\"\u003e\u003c/a\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhelmholtz-analytics%2Fheat","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fhelmholtz-analytics%2Fheat","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhelmholtz-analytics%2Fheat/lists"}