{"id":27154738,"url":"https://github.com/helmholtz-ai-energy/propulate","last_synced_at":"2025-04-08T18:33:29.336Z","repository":{"id":44093086,"uuid":"495731357","full_name":"Helmholtz-AI-Energy/propulate","owner":"Helmholtz-AI-Energy","description":"Propulate is an asynchronous population-based optimization algorithm and software package for global optimization and hyperparameter search on high-performance computers.","archived":false,"fork":false,"pushed_at":"2024-10-29T00:37:28.000Z","size":8462,"stargazers_count":32,"open_issues_count":55,"forks_count":6,"subscribers_count":3,"default_branch":"main","last_synced_at":"2024-10-29T20:47:53.323Z","etag":null,"topics":["distributed","evolutionary-algorithms","genetic-algorithm","hyperparameter-optimization","hyperparameter-tuning","optimization","parallel","parallel-computing","python"],"latest_commit_sha":null,"homepage":"https://doi.org/10.1007/978-3-031-32041-5_6","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Helmholtz-AI-Energy.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE.md","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":"CITATION.cff","codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":"AUTHORS.md","dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-05-24T08:19:12.000Z","updated_at":"2024-10-17T11:11:36.000Z","dependencies_parsed_at":"2023-02-04T07:01:00.653Z","dependency_job_id":"ded80114-ab57-4106-b888-f7d386018691","html_url":"https://github.com/Helmholtz-AI-Energy/propulate","commit_stats":null,"previous_names":[],"tags_count":5,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Helmhol
tz-AI-Energy%2Fpropulate","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Helmholtz-AI-Energy%2Fpropulate/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Helmholtz-AI-Energy%2Fpropulate/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Helmholtz-AI-Energy%2Fpropulate/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Helmholtz-AI-Energy","download_url":"https://codeload.github.com/Helmholtz-AI-Energy/propulate/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247902926,"owners_count":21015553,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["distributed","evolutionary-algorithms","genetic-algorithm","hyperparameter-optimization","hyperparameter-tuning","optimization","parallel","parallel-computing","python"],"created_at":"2025-04-08T18:32:30.823Z","updated_at":"2025-04-08T18:33:29.323Z","avatar_url":"https://github.com/Helmholtz-AI-Energy.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"![Propulate Logo](https://raw.githubusercontent.com/Helmholtz-AI-Energy/propulate/refs/heads/main/LOGO_light.svg#gh-light-mode-only)\n![Propulate Logo](https://raw.githubusercontent.com/Helmholtz-AI-Energy/propulate/refs/heads/main/LOGO_dark.svg#gh-dark-mode-only)\n\n\n# Parallel Propagator of 
Populations\n\n[![DOI](https://zenodo.org/badge/495731357.svg)](https://zenodo.org/badge/latestdoi/495731357)\n[![fair-software.eu](https://img.shields.io/badge/fair--software.eu-%E2%97%8F%20%20%E2%97%8F%20%20%E2%97%8F%20%20%E2%97%8F%20%20%E2%97%8F-green)](https://fair-software.eu)\n[![License: BSD-3](https://img.shields.io/badge/License-BSD--3-blue)](https://opensource.org/licenses/BSD-3-Clause)\n![PyPI](https://img.shields.io/pypi/v/propulate)\n![PyPI - Downloads](https://img.shields.io/pypi/dm/propulate)\n[![Ruff](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/ruff/main/assets/badge/v2.json)](https://github.com/astral-sh/ruff)[![](https://img.shields.io/badge/Python-3.9+-blue.svg)](https://www.python.org/downloads/)\n[![OpenSSF Best Practices](https://www.bestpractices.dev/projects/7785/badge)](https://www.bestpractices.dev/projects/7785)\n[![](https://img.shields.io/badge/Contact-propulate%40lists.kit.edu-orange)](mailto:propulate@lists.kit.edu)\n[![Documentation Status](https://readthedocs.org/projects/propulate/badge/?version=latest)](https://propulate.readthedocs.io/en/latest/?badge=latest)\n[![codecov](https://codecov.io/gh/Helmholtz-AI-Energy/propulate/graph/badge.svg?token=ZG6PEXJOIO)](https://codecov.io/gh/Helmholtz-AI-Energy/propulate)[![pre-commit.ci status](https://results.pre-commit.ci/badge/github/Helmholtz-AI-Energy/propulate/main.svg)](https://results.pre-commit.ci/latest/github/Helmholtz-AI-Energy/propulate/main)\n\n# **Click [here](https://www.scc.kit.edu/en/aboutus/16956.php) to watch our 3-minute introduction video!**\n\n## What `Propulate` can do for you\n\n`Propulate` is HPC-tailored software for solving optimization problems in parallel. It is openly accessible and easy\nto use. 
Compared to a widely used competitor, `Propulate` is consistently faster - at least an order of magnitude for a\nset of typical benchmarks - and in some cases even more accurate.\n\nInspired by biology, `Propulate` borrows mechanisms from biological evolution, such as selection, recombination, and\nmutation. Evolution begins with a population of solution candidates, each with randomly initialized genes. It is an\niterative \"survival of the fittest\" process where the population at each iteration can be viewed as a generation. For\neach generation, the fitness of each candidate in the population is evaluated. The genes of the fittest candidates are\nincorporated in the next generation.\n\nAs in nature, `Propulate` does not wait for all compute units to finish the evaluation of the current generation.\nInstead, the compute units communicate the currently available information and use that to breed the next candidate\nimmediately. This avoids waiting idly for other units and thus a load imbalance.\nEach unit is responsible for evaluating a single candidate. The result is a fitness level corresponding to that\ncandidate’s genes, allowing us to compare and rank all candidates. This information is sent to other compute units as\nsoon as it becomes available.\nWhen a unit has finished evaluating a candidate and communicating the resulting fitness, it breeds the candidate for the\nnext generation using the fitness values of all candidates it has evaluated itself or received from other units so far.\n\n`Propulate` can be used for hyperparameter optimization and neural architecture search at scale.\nIt has already been applied successfully in several accepted scientific publications. Applications include grid load\nforecasting, remote sensing, and structural molecular biology:\n\n\u003e J. Debus, C. Debus, G. Dissertori, et al. 
**PETNet–Coincident Particle Event Detection using Spiking Neural Networks**.\n\u003e 2024 Neuro Inspired Computational Elements Conference (NICE), La Jolla, CA, USA, pp. 1-9 (2024).\n\u003e https://doi.org/10.1109/NICE61972.2024.10549584\n\n\u003e D. Coquelin, K. Flügel, M. Weiel, et al. **AB-Training: A Communication-Efficient Approach for Distributed Low-Rank\n\u003e Learning**. arXiv preprint (2024). https://doi.org/10.48550/arXiv.2405.01067\n\n\u003e D. Coquelin, K. Flügel, M. Weiel, et al. **Harnessing Orthogonality to Train Low-Rank Neural Networks**. arXiv\n\u003e preprint (2024). https://doi.org/10.48550/arXiv.2401.08505\n\n\u003e A. Weyrauch, T. Steens, O. Taubert, et al. **ReCycle: Fast and Efficient Long Time Series Forecasting with Residual\n\u003e Cyclic Transformers**. 2024 IEEE Conference on Artificial Intelligence (CAI), Singapore, pp. 1187-1194 (2024).\n\u003e https://doi.org/10.1109/CAI59869.2024.00212\n\n\u003e O. Taubert, F. von der Lehr, A. Bazarova, et al. **RNA contact prediction by data efficient deep learning**.\n\u003e Communications Biology 6(1), 913 (2023). https://doi.org/10.1038/s42003-023-05244-9\n\n\u003e Y. Funk, M. Götz, and H. Anzt. **Prediction of optimal solvers for sparse linear systems using deep learning**.\n\u003e Proceedings of the 2022 SIAM Conference on Parallel Processing for Scientific Computing (pp. 14-24). Society for\n\u003e Industrial and Applied Mathematics (2022). https://doi.org/10.1137/1.9781611977141.2\n\n\u003e D. Coquelin, R. Sedona, M. Riedel, and M. Götz. **Evolutionary Optimization of Neural Architectures in Remote Sensing\n\u003e Classification Problems**. IEEE International Geoscience and Remote Sensing Symposium IGARSS, Brussels, Belgium,\n\u003e pp. 1587-1590 (2021). 
https://doi.org/10.1109/IGARSS47720.2021.9554309\n\n## In more technical terms\n\n``Propulate`` is a massively parallel evolutionary hyperparameter optimizer based on the island model with asynchronous\npropagation of populations and asynchronous migration.\nIn contrast to classical genetic algorithms (GAs), ``Propulate`` maintains a continuous population of already evaluated individuals with a\nsoftened notion of the typically strictly separated, discrete generations.\nOur contributions include:\n- A novel parallel genetic algorithm based on a fully asynchronous island model with independently processing workers.\n- Massive parallelism by asynchronous propagation of continuous populations and migration via efficient communication using the Message Passing Interface (MPI).\n- Efficient use of parallel hardware by minimizing idle times in distributed computing environments.\n\nFor the sake of efficiency, generations are less strictly separated than is usual in evolutionary algorithms.\nNew individuals are generated from a pool of currently active, already evaluated individuals that may be from any\ngeneration.\nIndividuals may be removed from the breeding population based on different criteria.\n\nYou can find the corresponding publication [here](https://doi.org/10.1007/978-3-031-32041-5_6):\n\u003e Taubert, O. *et al.* (2023). Massively Parallel Genetic Optimization Through Asynchronous Propagation of Populations.\n\u003e In: Bhatele, A., Hammond, J., Baboulin, M., Kruse, C. (eds) High Performance Computing. ISC High Performance 2023.\n\u003e Lecture Notes in Computer Science, vol 13948. Springer, Cham.\n\u003e [doi.org/10.1007/978-3-031-32041-5_6](https://doi.org/10.1007/978-3-031-32041-5_6)\n\n## Documentation\n\nCheck out the full documentation at [https://propulate.readthedocs.io/](https://propulate.readthedocs.io/) :rocket:! 
There you can find installation\ninstructions, tutorials, theoretical background, and API references.\n\n**:point_right: If you have any questions or run into any challenges while using `Propulate`, don't hesitate to post an\n[issue](https://github.com/Helmholtz-AI-Energy/propulate/issues) :bookmark:, reach out via [GitHub\ndiscussions](https://github.com/Helmholtz-AI-Energy/propulate/discussions) :octocat:, or contact us directly via e-mail\n:email: to [propulate@lists.kit.edu](mailto:propulate@lists.kit.edu).**\n\n## Installation\n\n- You can install the **latest stable release** from PyPI: ``pip install propulate``\n- If you need the **latest updates**, you can also install ``Propulate`` directly from the main branch.\nClone the repository and run ``pip install .``.\n- If you want to run the **tutorials**, you can install the required dependencies via: ``pip install .\"[tutorials]\"``\n- If you want to **contribute** to ``Propulate`` as a developer, you need to install the required dependencies with the package:\n``pip install -e .\"[dev]\"``.\n\n``Propulate`` depends on [``mpi4py``](https://mpi4py.readthedocs.io/en/stable/) and requires an MPI implementation under\nthe hood. Currently, it is only tested with [OpenMPI](https://www.open-mpi.org/).\n\n## Quickstart\n*Below, you can find a quick recipe for how to use `Propulate` in general. Check out the official\n[ReadTheDocs](https://propulate.readthedocs.io/en/latest/tut_propulator.html) documentation for more detailed tutorials\nand explanations.*\n\nLet's minimize the sphere function $f_\\text{sphere}\\left(x,y\\right)=x^2+y^2$ with `Propulate` as a quick example. The\nminimum is at $\\left(x, y\\right)=\\left(0,0\\right)$, marked by the orange star.\n![](./docs/images/sphere.png)\nFirst, we need to define the key ingredients of our optimization problem:\n- The **search space** of the parameters to be optimized as a `Python` dictionary. 
`Propulate` can handle three different\n  parameter types:\n    - A tuple of `float` for a continuous parameter, e.g., `{\"learning_rate\": (0.0001, 0.01)}`\n    - A tuple of `int` for an ordinal parameter, e.g., `{\"conv_layers\": (2, 10)}`\n    - A tuple of `str` for a categorical parameter, e.g., `{\"activation\": (\"relu\", \"sigmoid\", \"tanh\")}`\n\n  Thus, an exemplary search space might look like this:\n  ```python\n  search_space = {\n      \"learning_rate\": (0.0001, 0.01),  # Search a continuous space between 0.0001 and 0.01.\n      \"num_layers\": (2, 10),  # Search the integer space between 2 and 10 (inclusive).\n      \"activation\": (\"relu\", \"sigmoid\", \"tanh\"),  # Search the categorical space with the specified possibilities.\n  }\n  ```\n\n  The sphere function has two continuous parameters, $x$ and $y$, and we consider $x,y\\in\\left[-5.12,5.12\\right]$. The\n  search space in our example thus looks like this:\n  ```python\n  limits = {\n      \"x\": (-5.12, 5.12),\n      \"y\": (-5.12, 5.12)\n  }\n  ```\n- The **loss function**. This is the function we want to minimize in order to find the best parameters. 
It can be any\n  `Python` function that\n  - takes a set of parameters as a `Python` dictionary as input.\n  - returns a scalar loss value that determines how good the tested parameter set is.\n\n  In this example, the loss function whose minimum we want to find is the sphere function:\n  ```python\n  from typing import Dict\n\n  import numpy\n\n\n  def sphere(params: Dict[str, float]) -\u003e float:\n    \"\"\"\n    Sphere function: continuous, convex, separable, differentiable, unimodal\n\n    Input domain: -5.12 \u003c= x, y \u003c= 5.12\n    Global minimum 0 at (x, y) = (0, 0)\n\n    Parameters\n    ----------\n    params: Dict[str, float]\n        The function parameters.\n\n    Returns\n    -------\n    float\n        The function value.\n    \"\"\"\n    return numpy.sum(numpy.array(list(params.values())) ** 2).item()\n  ```\nNext, we need to define the **evolutionary operator** or propagator that we want to use to breed new individuals during the\noptimization process. `Propulate` provides a reasonable default propagator via a utility function:\n```python\n# Set up logger for Propulate optimization.\npropulate.set_logger_config()\n# Set up separate random number generator for Propulate optimization. DO NOT USE IT ANYWHERE ELSE!\nrng = random.Random(\n    \u003cyour-random-seed\u003e + mpi4py.MPI.COMM_WORLD.rank\n)\n# Set up evolutionary operator.\npropagator = propulate.get_default_propagator(\n    pop_size=config.pop_size,  # The breeding population size\n    limits=limits,  # The search-space limits\n    rng=rng,  # Random number generator\n)\n```\nWe also need to set up the asynchronous parallel evolutionary **optimizer**, a so-called ``Propulator`` instance:\n```python\n# Set up Propulator performing actual optimization.\npropulator = propulate.Propulator(\n    loss_fn=sphere,\n    propagator=propagator,\n    rng=rng,\n    generations=config.generations,\n    checkpoint_path=config.checkpoint,\n)\n```\nNow we can run the actual optimization. 
Overall, ``generations * mpi4py.MPI.COMM_WORLD.size`` evaluations will be\nperformed:\n```python\n# Run optimization and print summary of results.\npropulator.propulate()\npropulator.summarize()\n```\nThe output should look something like this:\n```text\n#################################################\n# PROPULATE: Parallel Propagator of Populations #\n#################################################\n\n[2024-03-12 14:37:01,374][propulate.propulator][INFO] - No valid checkpoint file given. Initializing population randomly...\n[2024-03-12 14:37:01,374][propulate.propulator][INFO] - Island 0 has 4 workers.\n[2024-03-12 14:37:01,374][propulate.propulator][INFO] - Island 0 Worker 0: In generation 0...\n[2024-03-12 14:37:01,374][propulate.propulator][INFO] - Island 0 Worker 3: In generation 0...\n[2024-03-12 14:37:01,374][propulate.propulator][INFO] - Island 0 Worker 2: In generation 0...\n[2024-03-12 14:37:01,374][propulate.propulator][INFO] - Island 0 Worker 1: In generation 0...\n[2024-03-12 14:37:01,377][propulate.propulator][INFO] - Island 0 Worker 3: In generation 10...\n[2024-03-12 14:37:01,377][propulate.propulator][INFO] - Island 0 Worker 1: In generation 10...\n[2024-03-12 14:37:01,378][propulate.propulator][INFO] - Island 0 Worker 0: In generation 10...\n[2024-03-12 14:37:01,378][propulate.propulator][INFO] - Island 0 Worker 2: In generation 10...\n\n...\n[2024-03-12 14:37:02,197][propulate.propulator][INFO] - Island 0 Worker 1: In generation 960...\n[2024-03-12 14:37:02,206][propulate.propulator][INFO] - Island 0 Worker 2: In generation 990...\n[2024-03-12 14:37:02,206][propulate.propulator][INFO] - Island 0 Worker 1: In generation 970...\n[2024-03-12 14:37:02,215][propulate.propulator][INFO] - Island 0 Worker 1: In generation 980...\n[2024-03-12 14:37:02,224][propulate.propulator][INFO] - Island 0 Worker 1: In generation 990...\n[2024-03-12 14:37:02,232][propulate.propulator][INFO] - OPTIMIZATION DONE.\nNEXT: Final checks for incoming 
messages...\n[2024-03-12 14:37:02,244][propulate.propulator][INFO] - ###########\n# SUMMARY #\n###########\nNumber of currently active individuals is 4000.\nExpected overall number of evaluations is 4000.\n[2024-03-12 14:37:03,703][propulate.propulator][INFO] - Top 1 result(s) on island 0:\n(1): [{'a': '2.91E-3', 'b': '-3.05E-3'}, loss 1.78E-5, island 0, worker 0, generation 956]\n```\n### Let's get your hands dirty\nDo the following to run the [example script](https://github.com/Helmholtz-AI-Energy/propulate/blob/master/tutorials/propulator_example.py):\n\n- Make sure you have a working MPI installation on your machine.\n- If you have not already done this, create a fresh virtual environment with ``Python``: ``$ python3 -m venv best-venv-ever``\n- Activate it: ``$ source best-venv-ever/bin/activate``\n- Upgrade ``pip``: ``$ pip install --upgrade pip``\n- Install ``Propulate``: ``$ pip install propulate``\n- Run the example script ``propulator_example.py``: ``$ mpirun --use-hwthread-cpus python propulator_example.py``\n\n## Acknowledgments\n*This work is supported by the Helmholtz AI platform grant.*\n![](./.figs/hai_kit_logos.svg)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhelmholtz-ai-energy%2Fpropulate","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fhelmholtz-ai-energy%2Fpropulate","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhelmholtz-ai-energy%2Fpropulate/lists"}