https://github.com/reproducible-reporting/parman
ParMan extends Python concurrent.futures to facilitate parallel workflows
- Host: GitHub
- URL: https://github.com/reproducible-reporting/parman
- Owner: reproducible-reporting
- License: lgpl-3.0
- Created: 2023-03-26T18:13:55.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2025-04-07T20:45:33.000Z (15 days ago)
- Last Synced: 2025-04-07T21:40:51.803Z (15 days ago)
- Language: Python
- Homepage:
- Size: 256 KB
- Stars: 2
- Watchers: 1
- Forks: 1
- Open Issues: 3
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE.txt
README
# Parman

At this stage, Parman is an experimental project, so expect a rocky road ahead.
The goal of Parman is to extend `concurrent.futures` (and compatible implementations)
with features that facilitate a transparent implementation of workflows.

- `WaitFuture`: a `Future` subclass that is "finished" after its dependencies have finished.
  (To be created with `WaitGraph.submit`, which never blocks.)
- `ScheduledFuture`: a `Future` subclass that submits a `Future` after its dependencies have finished.
  (To be created with `Scheduler.submit`, which never blocks.)
- Various `Runner` classes, similar to executors, which dispatch function calls elsewhere.
  The main differences from conventional executors are:
  - Closures are submitted for (remote) execution. These contain more metadata,
    e.g. about (keyword) arguments and return values, than ordinary functions.
    The extra metadata offer several advantages...
  - A dry run can be carried out to quickly validate the connectivity of steps in the workflow
    before launching a full-scale calculation.
  - Closure arguments may contain futures.
    If `schedule=True` is set, closures are scheduled for later execution
    when not all dependency futures have finished yet.
    (Dependencies are inferred from the arguments and keyword arguments.)
    Otherwise, the runner will block until all required futures have completed.
  - Closure return values are instantiated as much as possible,
    instead of just returning a single future.
    They may contain futures more deeply nested for parts of the return value.
    This makes it easier to submit more closures further down the workflow.

As a result, workflows can be implemented efficiently with relatively simple Python scripts,
mostly hiding the interaction with `Future` objects.

Other useful features:

- Compatible with Python's built-in `concurrent.futures` package and [Parsl](https://parsl.readthedocs.io).
  (Parsl is an optional dependency.)
- Simplicity:
  - Template jobs, for a straightforward migration of existing job scripts.
  - Minimal Python package dependencies.
  - Minimal API.

## Getting started

### Install
```bash
python -m pip install parman
```

### Examples

At this stage, there is no documentation as such.
If you want to learn how to use Parman, check out the [demos](demos/).
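
Until then, the idea of dependency-driven submission can be illustrated with the standard library alone. The sketch below is not Parman's implementation and does not use its API: `submit_when_ready` and `add` are hypothetical names, and only top-level arguments are inspected for futures. It returns a `Future` immediately and submits the call once every `Future` among the arguments has finished.

```python
import threading
from concurrent.futures import Future, ThreadPoolExecutor


def submit_when_ready(executor, func, *args, **kwargs):
    """Hypothetical helper (not part of Parman); it never blocks.

    Returns a Future right away and submits ``func`` to ``executor``
    once all Future objects among the arguments have finished.
    """
    outer = Future()
    # Infer dependencies from positional and keyword arguments (top level only).
    deps = [a for a in (*args, *kwargs.values()) if isinstance(a, Future)]
    lock = threading.Lock()
    remaining = len(deps)

    def resolve(value):
        # Dependencies are done by now, so result() returns without waiting.
        return value.result() if isinstance(value, Future) else value

    def copy_outcome(inner):
        # Forward the result or exception of the real call to the outer Future.
        exc = inner.exception()
        if exc is None:
            outer.set_result(inner.result())
        else:
            outer.set_exception(exc)

    def launch():
        try:
            real_args = [resolve(a) for a in args]
            real_kwargs = {k: resolve(v) for k, v in kwargs.items()}
            inner = executor.submit(func, *real_args, **real_kwargs)
        except Exception as exc:  # e.g. a dependency raised an exception
            outer.set_exception(exc)
        else:
            inner.add_done_callback(copy_outcome)

    def on_dependency_done(_dep):
        nonlocal remaining
        with lock:
            remaining -= 1
            ready = remaining == 0
        if ready:
            launch()

    if deps:
        for dep in deps:
            dep.add_done_callback(on_dependency_done)
    else:
        launch()
    return outer


def add(x, y):
    return x + y


with ThreadPoolExecutor(max_workers=2) as pool:
    a = pool.submit(pow, 2, 3)
    b = submit_when_ready(pool, add, a, 1)    # runs after a has finished
    c = submit_when_ready(pool, add, a, y=b)  # runs after a and b have finished
    total = c.result()  # collect the result before the pool shuts down

print(total)  # 17
```

Note that the final result is collected inside the `with` block: once the pool has shut down, the callbacks can no longer submit the deferred calls.
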
If you want to understand the internals, read the source and the docstrings.

## Non-goals

- Support for Dask, because:
  1. The Dask `Future` does not subclass `concurrent.futures.Future`.
     Supporting Dask would imply a lot of extra boilerplate code in Parman.
  2. The Dask `Future` implements only a subset of `concurrent.futures.Future`.
  3. Dask Distributed has a large memory and time overhead.

## Plans

- Simplify usage.
- Add more examples.
- Tutorial.