https://github.com/cscfi/scalable-python
Python interpreter for massively parallel HPC systems
https://github.com/cscfi/scalable-python
Last synced: over 1 year ago
JSON representation
Python interpreter for massively parallel HPC systems
- Host: GitHub
- URL: https://github.com/cscfi/scalable-python
- Owner: CSCfi
- License: other
- Created: 2015-09-25T12:20:02.000Z (over 10 years ago)
- Default Branch: master
- Last Pushed: 2017-10-04T10:00:13.000Z (over 8 years ago)
- Last Synced: 2025-01-07T12:25:52.778Z (over 1 year ago)
- Language: Python
- Homepage:
- Size: 21.2 MB
- Stars: 3
- Watchers: 37
- Forks: 1
- Open Issues: 1
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# Scalable Python 1.2
> based on Python 2.7.13 (see original [README](README.orig))
>
> © 2011-2017
> [CSC - IT Center for Science Ltd.](http://github.com/CSCfi)
> authors:
> [Jussi Enkovaara](http://github.com/jussienko) &
> [Martti Louhivuori](http://github.com/mlouhivu)
***Special Python interpreter intended for massively parallel HPC systems.***
When using Python for parallel applications, standard Python puts a lot of
pressure on the filesystem with multiple processes accessing several small
files during import statements. With thousands of processes the startup time
of a Python application can be more than 30 minutes.
**Scalable Python** performs the I/O operations used e.g. by import statements
in a single process and uses MPI to transmit data to/from all other processes.
Please see the [inherent limitation](#limitations) of this approach.
## Short build instructions
Use `--enable-mpi` option for configure, e.g.
./configure --enable-mpi
Build and install normal Python interpreter (`python`) as well as standard
modules etc.
make
make install
Build and install the special Python interpreter (`python_mpi`)
make mpi
make install-mpi
By default, the MPI related parts are build using `mpicc`, but an alternative
MPI compiler can be specified using the environment variable MPICC (e.g.
`export MPICC=cc`).
## Short usage instructions
Installation creates two executables:
- **python** is the standard Python interpreter, which can be used for normal
serial applications and e.g. for software installations.
- **python_mpi** is the special Python interpreter that wraps I/O calls and
uses MPI for communication, and thus has to be started with `mpirun` (or
an equivalent MPI launch command), e.g.
```
mpirun -np 16384 python_mpi my_application.py
```
## Disabling I/O wrappers
It is possible to disable the special I/O wrappers on-the-fly and revert to
the standard POSIX calls by using the built-in function `wrapoff()` in a
Python script. For example:
wrapoff()
from numpy import array
Similarly, they can be turned back on with the built-in function `wrapon()`.
## Limitations
All MPI processes have to perform the same I/O operations (e.g. import
statements). For example, a code like this
if rank == 0:
import mymodule
is NOT allowed. If you need to do it, please use the `wrapoff()` and
`wrapon()` functions to disable the special I/O wrappers temporarily.
If you experience an unexpected deadlock while using Scalable Python, it is
most likely due to triggering this limitation. Unfortunately, even some
standard modules use in-function import statements, so the root cause may turn
out to be an innocent looking function call within an if statement. If needed,
you can try to turn off the special I/O wrappers e.g. after the initial module
imports.