## Cyclops Tensor Framework (CTF)

Cyclops is a parallel (distributed-memory) numerical library for multidimensional arrays (tensors) in C++ and Python.

Quick documentation links: [C++](http://solomon2.web.engr.illinois.edu/ctf/index.html) and [Python](http://solomon2.web.engr.illinois.edu/ctf_python/ctf.html#module-ctf.core).

Broadly, Cyclops provides tensor objects that are stored and operated on by all processes executing the program, coordinating via MPI communication.

Cyclops supports a multitude of tensor attributes, including sparsity, various symmetries, and user-defined element types.

The library is interoperable with ScaLAPACK at the C++ level and with numpy at the Python level. In Python, the library provides a parallel/sparse implementation of `numpy.ndarray` functionality.

## Building and Testing

See the [Github Wiki](https://github.com/cyclops-community/ctf/wiki/Building-and-testing) for more details. From this repository, it is possible to build static and dynamic C++ libraries, the Python CTF library, and examples and tests for both. Cyclops follows the basic installation convention,
```sh
./configure
make
make install
```
(where the last command should usually be executed as superuser, i.e. requires `sudo`). Below, we give more details on how the build can be customized.

First, it is necessary to run the configure script, which selects the appropriate type of build and obtains and checks for any necessary dependencies. For options and documentation on how to execute configure, run
```sh
./configure --help
```
then execute `./configure` with the appropriate options. Successful execution of this script generates a `config.mk` file and a `setup.py` file, needed for the C++ and Python builds, respectively, as well as a how-did-i-configure file recording how the build was configured. You may modify the `config.mk` and `setup.py` files thereafter; subsequent executions of configure will prompt before overwriting them.

Note: there is a (now-fixed) [bug](https://github.com/pmodels/mpich/pull/6543) in recent versions of MPICH that causes a segmentation fault in CTF when executing with 2 or more processors.
The bug can be remedied without rebuilding CTF by setting an environment variable as follows,
```sh
export MPIR_CVAR_DEVICE_COLLECTIVES=none
```

### Dependencies and Supplemental Packages

The strict library dependencies of Cyclops are MPI and BLAS libraries.

Some functionality in Cyclops requires LAPACK and ScaLAPACK. A standard build of the latter can be constructed automatically by running configure with `--build-scalapack` (this requires cmake to build ScaLAPACK; alternatively, a manually built ScaLAPACK can be supplied by providing its library path).

Faster tensor transposition in Cyclops is made possible by the HPTT library. To obtain a build of HPTT automatically, run configure with `--build-hptt`.

Efficient sparse matrix multiplication primitives and efficient batched BLAS primitives are available via the Intel MKL library, which is automatically detected for standard Intel compiler configurations or when appropriately supplied as a library.

### Building and Installing the Libraries

Once configured, you may build both the static and dynamic libraries by running `make`. Parallel make is supported.

To build only the static library, run `make libctf`; to build only the shared library, run `make shared`.

To install the C++ libraries to the prespecified build destination directory (`--build-dir` for `./configure`, `/usr/local/` by default), run `make install` (as superuser if necessary). If the CTF configure script built the ScaLAPACK and/or HPTT libraries automatically, the libraries for these will need to be installed system-wide manually.

To build the Python CTF library, execute `make python`.

To install the Python CTF library via pip, execute `make python_install` (as superuser if not in a virtual environment).

To uninstall, use `make uninstall` and `make python_uninstall`.

### Testing the Libraries

To test the C++ library with a sequential suite of tests, run `make test`. To test the library using 2 processors, run `make test2`; more generally, to test with some number N of processors, run `make testN`.

To test the Python library, run `make python_test` to do so sequentially and `make python_testN` to do so with N processors.

To debug issues with custom code execution correctness, build CTF libraries with `-DDEBUG=1 -DVERBOSE=1` (more info in `config.mk`).

To debug issues with custom code performance, build CTF libraries with `-DPROFILE -DPMPI` (more info in `config.mk`), which should lead to a performance log dump at the end of an execution of a code using CTF.

## Sample C++ Code and Minimal Tutorial

A simple Jacobi iteration code using CTF is given below, also found in [this example](examples/jacobi.cxx).

```cpp
Vector<> Jacobi(Matrix<> A, Vector<> b, int n){
  Matrix<> R(A);
  R["ii"] = 0.0;
  Vector<> x(n), d(n), r(n);
  Function<> inv([](double & d){ return 1./d; });
  d["i"] = inv(A["ii"]); // set d to inverse of diagonal of A
  do {
    x["i"] = d["i"]*(b["i"]-R["ij"]*x["j"]);
    r["i"] = b["i"]-A["ij"]*x["j"]; // compute residual
  } while (r.norm2() > 1.E-6); // check for convergence
  return x;
}
```

The above Jacobi function accepts an n-by-n matrix A and an n-dimensional vector b of double-precision floating point elements and solves Ax=b via Jacobi iteration. The matrix R is defined as a copy of the data in A, and its diagonal is subsequently set to 0, while the diagonal of A is extracted into d and inverted elementwise. A loop then computes the Jacobi iteration using matrix-vector multiplication, vector addition, and vector Hadamard products.
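For comparison, the same iteration can be sketched in plain sequential C++ (no CTF, no parallelism). The `jacobi` helper below is an illustrative stand-in, assuming A is diagonally dominant so the iteration converges.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Sequential sketch of Jacobi iteration: x_i <- (b_i - sum_{j!=i} A_ij x_j) / A_ii,
// repeated until the residual norm ||b - A x|| drops below 1e-6.
std::vector<double> jacobi(const std::vector<std::vector<double>>& A,
                           const std::vector<double>& b, int n){
  std::vector<double> x(n, 0.0);
  double rnorm;
  do {
    std::vector<double> xnew(n);
    for (int i = 0; i < n; i++){
      double s = 0.0;
      for (int j = 0; j < n; j++)
        if (j != i) s += A[i][j] * x[j];
      xnew[i] = (b[i] - s) / A[i][i]; // divide by the diagonal, as d does above
    }
    x = xnew;
    // compute residual r = b - A*x and its 2-norm
    rnorm = 0.0;
    for (int i = 0; i < n; i++){
      double ri = b[i];
      for (int j = 0; j < n; j++) ri -= A[i][j] * x[j];
      rnorm += ri * ri;
    }
    rnorm = std::sqrt(rnorm);
  } while (rnorm > 1.E-6); // check for convergence
  return x;
}
```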

This Jacobi code uses Vector and Matrix objects which are specializations of the Tensor object. Each of these is a distributed data structure, which is partitioned across an MPI communicator.
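To give a rough sense of how such a partitioning can work, the sketch below implements a simple 1D cyclic layout, in which global element i lives on process i mod p. This is an illustration only; CTF chooses its (more general) mappings internally.

```cpp
#include <cassert>
#include <vector>

// Sketch of a 1D cyclic distribution over p processes:
// global element i is owned by process i % p, which keeps the
// number of elements per process balanced.
int owner(int i, int p) { return i % p; }

// global indices of the elements stored locally on process `rank`
std::vector<int> local_indices(int n, int p, int rank) {
  std::vector<int> idx;
  for (int i = rank; i < n; i += p) idx.push_back(i);
  return idx;
}
```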

The key illustrative part of the above example is
```cpp
x["i"] = d["i"]*(b["i"]-R["ij"]*x["j"]);
```
to evaluate this expression, CTF would execute the following set of loops, each in parallel,
```cpp
double y[n];
for (int i=0; i inv([]( double & d ){ return 1./d; });
d["i"] = inv(A["ii"]); // set d to inverse of diagonal of A
```
the same code could have been written even more concisely,
```cpp
d["i"] = Function<> inv([]( double & d ){ return 1./d; })(A["ii"]);
```
This syntax defines and employs an elementwise function that inverts each element of A to which it is applied. The operation is executing the following loop,
```cpp
for (int i=0; i