An open API service indexing awesome lists of open source software.

https://github.com/kabicm/grid2grid

A library transforming between two arbitrary grid-like matrix data layouts over MPI ranks.
https://github.com/kabicm/grid2grid

distributed-algorithm linear-algebra matrix mpi openmp

Last synced: 9 months ago
JSON representation

A library transforming between two arbitrary grid-like matrix data layouts over MPI ranks.

Awesome Lists containing this project

README

          

# grid2grid

This is a library that transforms a matrix between two arbitrary grid-like data layouts. By layout, we mean the way in which a matrix is distributed over MPI ranks.

For example, imagine a matrix split into four blocks, such that each block resides on a different rank as shown below:

```
+---------------------+
| | |
| | |
| 0 | 1 |
| | |
| | |
+---------------------+
| | |
| 2 | 3 |
| | |
+---------------------+
```

Assume that we want to redistribute the matrix, to achieve the following data layout:

```
+---------------------+
| 0 | 2 | 1 |
| | | |
+---------------------+
| 1 | 0 | 1 |
| | | |
+---------------------+
| | | |
| 2 | 2 | 0 |
| | | |
+---------------------+
```

This is exaclty what this library is doing!

## Features:

- initial and final layouts do not have to be block-cyclic, but any grid-like data layout.
- blocks do not have to be all the same dimensions, but can have different dimensions.
- a rank might own arbitrary number of blocks
- the type of entries in a matrix can be arbitrary
- blocks that belong to the same rank do not have to be consecutive in memory
- each block can have an arbitrary stride
- OpenMP support

## Performance:

When tested against ScaLAPACK (as provided by Intel MKL), this library outperforms the MKL's `pdgemr2d` routine that transforms between two block-cyclic data layouts by up to 50\%, as shown below:

## Algorithm

The pipeline of the algorithm is roughly the following:
- Find the intersections of the initial and final grids (called grid cover) in one pass, which decomposes initial blocks into smaller blocks.
- Sort decomposed initial blocks based on the rank to which they should be sent. Thus, if some ranks should exchange more than one block, they will do it within a single message.
- Copy decomposed intial blocks from a local storage (with arbitrary local data layout) to a temporary MPI buffer.
- Perform MPI_Alltoallv.
- Use the computed grid cover to decompose blocks from the final grid into smaller blocks.
- Sort decomposed final blocks based on the receiver rank.
- All smaller blocks that are received are copied from a temporary MPI buffer to the local storage (with arbitrary local data layout).

## Building and Installing

Assuming that you want to use the `gcc 8` compiler and `OpenMP`, you can build the project as follows:
```bash
# clone the repo
git clone --recursive https://github.com/kabicm/grid2grid
cd grid2grid
mkdir build
cd build

# build
CC=gcc-8 CXX=g++-8 cmake -DCMAKE_BUILD_TYPE=Release -DWITH_OPENMP=TRUE ..

# compile
make -j 4
```

## Example

To start with, there is a small example that transforms the matrix between two block-cyclic layouts. This example can be run from `build` directory as follows:
```bash
mpirun --oversubscribe -np 4 ./examples/scalapack2scalapack -m 10 -n 10 -ibm 2 -ibn 3 -fbm 3 -fbn 5 -pm 2 -pn 2
```
Where flags have the following meaning:
- `(m, n)`: Dimensions of matrix that we want to change the layout for.
- `(ibm, ibn)`: Dimensions of initial blocks, determining the initial block-cyclic distribution.
- `(fbm, fbn)`: Dimensions of final blocks, determining the final block-cyclic distribution that we want to reach.
- `(pm, pn)`: Processor grid, determining the processor decomposition for the block-cyclic distribution.

## Arbitrary Grid-Like Data Layouts

To transform between two arbitrary grid-like data layouts, we need to construct two `grid_layout` objects, one describing the initial layout and one describing the final layout. After that, we just invoke:
```cpp
grid2grid::grid_layout initial_layout(...);
grid2grid::grid_layout final_layout(...);
grid2grid::transform(initial_layout, final_layout, MPI_COMM_WORLD);
```

In order to create a grid_layout object, we need to provide the following information:
- `grid2D`: small struct describing the grid, basically 2 vectors, one describing where rows are split and one describing where columns are split in a grid.
- `owners`: a matrix that specifies the owner (i.e. the rank) of each block in `grid2D`.
- `local_blocks`: a vector of blocks that current rank owns. Each block is defined with a pointer to the beginning of that block in the local memory of current rank, a stride (by default equal to the number of rows of a block) and dimensions.

This is done for `ScaLAPACK` block-cyclic data layout, which can serve as an example of how `grid_layout` object can be created.

## Author
Marko Kabic (marko.kabic@cscs.ch)