https://github.com/ct-clmsn/zmq-collectives-rs
SPMD (HPC) collective communication algorithms for Rust using zeromq
0mq hpc rust spmd supercomputing zeromq
- Host: GitHub
- URL: https://github.com/ct-clmsn/zmq-collectives-rs
- Owner: ct-clmsn
- License: bsl-1.0
- Created: 2021-05-14T02:08:39.000Z (over 4 years ago)
- Default Branch: main
- Last Pushed: 2021-05-17T02:13:16.000Z (over 4 years ago)
- Last Synced: 2025-07-13T17:02:24.252Z (3 months ago)
- Topics: 0mq, hpc, rust, spmd, supercomputing, zeromq
- Language: Rust
- Homepage:
- Size: 74.2 KB
- Stars: 4
- Watchers: 2
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE_1_0.txt
# [zmq-collectives-rs](https://github.com/ct-clmsn/zmq-collectives-rs)
This library implements an [SPMD](https://en.m.wikipedia.org/wiki/SPMD) (single program,
multiple data) model and collective communication algorithms (Robert van de Geijn's
Binomial Tree) in Rust using [0MQ](https://zeromq.org). The library provides log2(N)
algorithmic performance for each collective operation over N compute hosts.

Collective communication algorithms are used in HPC (high performance computing) / Supercomputing
libraries and runtime systems such as [MPI](https://www.open-mpi.org) and [OpenSHMEM](http://openshmem.org).

Documentation for this library can be found on its [wiki](https://github.com/ct-clmsn/zmq-collectives-rs/wiki).
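
To illustrate where the log2(N) figure comes from, the sketch below computes the send schedule of a binomial-tree broadcast rooted at rank 0. It is one common formulation, in which the set of ranks holding the data doubles every round, and it is not taken from this library's source; the function name `broadcast_schedule` is hypothetical and the library's actual message ordering may differ.

```rust
/// Returns, for each round, the (sender, receiver) pairs of a
/// binomial-tree broadcast rooted at rank 0 over `nranks` ranks.
/// Hypothetical helper for illustration; not part of the library API.
fn broadcast_schedule(nranks: usize) -> Vec<Vec<(usize, usize)>> {
    let mut rounds = Vec::new();
    // Ranks 0..informed already hold the data at the start of a round.
    let mut informed = 1;
    while informed < nranks {
        let mut pairs = Vec::new();
        for sender in 0..informed {
            // Each informed rank forwards to the rank `informed` above it.
            let receiver = sender + informed;
            if receiver < nranks {
                pairs.push((sender, receiver));
            }
        }
        rounds.push(pairs);
        // The number of informed ranks doubles after every round.
        informed *= 2;
    }
    rounds
}

fn main() {
    for (round, pairs) in broadcast_schedule(8).iter().enumerate() {
        println!("round {}: {:?}", round, pairs);
    }
}
```

Because the informed set doubles each round, the number of rounds is the ceiling of log2(N): 8 ranks need 3 rounds and 5 ranks also need 3.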
### Algorithms Implemented
* Broadcast
* Reduction
* Scatter
* Gather
* Barrier

### Configuring Distributed Program Execution
This library requires the use of environment variables
to configure distributed runs of SPMD applications.
Each of the following environment variables needs to be
supplied to correctly run programs:

* ZMQ_COLLECTIVES_NRANKS
* ZMQ_COLLECTIVES_RANK
* ZMQ_COLLECTIVES_ADDRESSES

ZMQ_COLLECTIVES_NRANKS - unsigned integer value indicating
how many processes (instances or copies of the program)
are running.

ZMQ_COLLECTIVES_RANK - unsigned integer value indicating
the process instance this program represents. This is
analogous to a user provided thread id. The value must
be at least 0 and less than ZMQ_COLLECTIVES_NRANKS.

ZMQ_COLLECTIVES_ADDRESSES - should contain a ',' delimited
list of ip addresses and ports. The list length should be
equal to the integer value of ZMQ_COLLECTIVES_NRANKS. An
example for a 2 rank application named `app` is below:

```
ZMQ_COLLECTIVES_NRANKS=2 ZMQ_COLLECTIVES_RANK=0 ZMQ_COLLECTIVES_ADDRESSES=127.0.0.1:5555,127.0.0.1:5556 ./app

ZMQ_COLLECTIVES_NRANKS=2 ZMQ_COLLECTIVES_RANK=1 ZMQ_COLLECTIVES_ADDRESSES=127.0.0.1:5555,127.0.0.1:5556 ./app
```

In this example, Rank 0 maps to 127.0.0.1:5555 and Rank 1
maps to 127.0.0.1:5556.

HPC batch scheduling systems like [Slurm](https://en.m.wikipedia.org/wiki/Slurm_Workload_Manager),
[TORQUE](https://en.m.wikipedia.org/wiki/TORQUE), [PBS](https://en.wikipedia.org/wiki/Portable_Batch_System),
etc. provide mechanisms to automatically define these
environment variables when jobs are submitted.

### Notes
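
As a note on the three `ZMQ_COLLECTIVES_*` environment variables described in the configuration section, the sketch below shows one way a program might read and validate them. It is an illustration under stated assumptions, not the library's own parsing code: `RunConfig` and `parse_config` are hypothetical names, and the checks simply mirror the rules given above (the rank must be less than the rank count, and the address list must have one entry per rank).

```rust
use std::env;

/// Hypothetical holder for the three settings; not a type from the library.
#[derive(Debug, PartialEq)]
struct RunConfig {
    nranks: usize,
    rank: usize,
    addresses: Vec<String>,
}

/// Parses and validates the raw values of the three environment variables.
fn parse_config(nranks: &str, rank: &str, addresses: &str) -> Result<RunConfig, String> {
    let nranks = nranks
        .trim()
        .parse::<usize>()
        .map_err(|e| format!("bad ZMQ_COLLECTIVES_NRANKS: {}", e))?;
    let rank = rank
        .trim()
        .parse::<usize>()
        .map_err(|e| format!("bad ZMQ_COLLECTIVES_RANK: {}", e))?;
    // The rank is unsigned, so only the upper bound needs checking.
    if rank >= nranks {
        return Err(format!("rank {} must be less than nranks {}", rank, nranks));
    }
    // One "ip:port" entry per rank, separated by commas.
    let addresses: Vec<String> = addresses
        .split(',')
        .map(|a| a.trim().to_string())
        .collect();
    if addresses.len() != nranks {
        return Err(format!(
            "expected {} addresses, found {}",
            nranks,
            addresses.len()
        ));
    }
    Ok(RunConfig { nranks, rank, addresses })
}

fn main() {
    // A missing variable is read as an empty string and rejected by the parser.
    let read = |name: &str| env::var(name).unwrap_or_default();
    match parse_config(
        &read("ZMQ_COLLECTIVES_NRANKS"),
        &read("ZMQ_COLLECTIVES_RANK"),
        &read("ZMQ_COLLECTIVES_ADDRESSES"),
    ) {
        Ok(cfg) => println!(
            "rank {} of {} maps to {}",
            cfg.rank, cfg.nranks, cfg.addresses[cfg.rank]
        ),
        Err(msg) => eprintln!("configuration error: {}", msg),
    }
}
```

Keeping the parsing in a function that takes plain strings, separate from the `env::var` calls, makes the validation rules easy to test without touching the process environment.
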
0MQ uses sockets/file descriptors (same thing) to
handle communication and asynchrony control. There
is a GNU/Linux kernel configurable default limit (~2063)
on the number of file descriptors/sockets a user
process is authorized to open during execution. The
TcpBackend uses 2 file descriptors/sockets. In 0MQ
terms these sockets are ZMQ_ROUTER.

tcp is a "chatty" protocol; tcp requires round trips
between clients and servers during the data transmission
exchange to ensure data is communicated correctly. The
use of this protocol makes it less than ideal for jobs
requiring high performance. However, tcp is provided in
0MQ and is universally accessible (tcp is a commodity
protocol) and makes for a reasonable place to plant a
flag for providing an implementation.

This library requires libzmq. LD_LIBRARY_FLAGS and
PKG_CONFIG_PATH need to point to the directories where
the libzmq library has been installed. As an example,
let's say a user has installed libzmq into a directory
named by the environment variable:

$LIBZMQ_INSTALL_PREFIX_PATH

libzmq.a or libzmq.so would be installed in the directory:

$LIBZMQ_INSTALL_PREFIX_PATH/lib

libzmq.pc can be found in the directory:

$LIBZMQ_INSTALL_PREFIX_PATH/lib/pkgconfig

### License
Boost Version 1.0
### Date
03MAY2021
### Author
Christopher Taylor
### Dependencies
* pkg-config
* [rust](https://www.rust-lang.org/)
* [libzmq](https://github.com/zeromq/libzmq)
* [uuid](https://github.com/uuid-rs/uuid)
* [serde](https://github.com/serde-rs/serde)
* [bincode](https://github.com/bincode-org/bincode)
* [zmq](https://github.com/erickt/rust-zmq)