An open API service indexing awesome lists of open source software.

https://github.com/proxystore/tpds24-evaluation

Data, analysis, and figures for the TPDS24 ProxyStore paper
https://github.com/proxystore/tpds24-evaluation

Last synced: 4 months ago
JSON representation

Data, analysis, and figures for the TPDS24 ProxyStore paper

Awesome Lists containing this project

README

          

# TPDS 2024 ProxyStore Analysis

Data, analysis, and figures for the TPDS 2024 ProxyStore paper ([ArXiv Preprint](https://www.arxiv.org/abs/2407.01764)).

## Get Started

This repository requires Python 3.9, 3.10, or 3.11. To get started running
the benchmarks and analysis notebooks, create a new virtual environment,
install the dependencies, and launch Jupyter Lab.

```bash
python -m venv venv
. venv/bin/activate
pip install -r requirements.txt
jupyter-lab
```

## Artifacts Appendix

We have provided a Docker image and set of scripts for reproducing the benchmarks in our submission.

### Install

Build the Docker image which will install the necessary dependencies pinned in the `requirements.txt` file.
```bash
docker build -t tpds-proxystore .
```

### Run Task Pipelining

The `scripts/docker/0-task-pipelining.sh` will execute the experiment with the exact parameters used in the TPDS submission.
Use the following command to execute the experiment in the Docker image.
Completing all of the runs takes 20 minutes and requires 4 cores and 16 GB of memory.

```bash
docker run --rm -it -v $PWD:/analysis tpds-proxystore /bin/bash /analysis/scripts/docker/0-task-pipelining.sh
```

This will produce an output directory in `data/docker/0-task-pipelining/` containing the runtime log (`log.txt`) and results (`result.csv`).

### Run Stream Scaling

The `scripts/docker/1-stream-scaling.sh` will execute the experiment with the exact parameters used in the TPDS submission.
Use the following command to execute the experiment in the Docker image.
Completing all of the runs takes 30 minutes and requires 32 cores and 256 GB of memory.

**Note:** The numbers passed as arguments to the script are the number of workers to perform runs with.
The default below is to run the experiments with 2, 4, 8, 16, and 32 workers.
Modify this list based on how many cores your machine has.

```bash
docker run --rm -it -v $PWD:/analysis tpds-proxystore /bin/bash /analysis/scripts/docker/1-stream-scaling.sh 2 4 8 16 32
```

This will produce an output directory for each run configuration (worker counts) in `data/docker/1-stream-scaling/` containing the runtime log (`log.txt`) and results (`result.csv`).

### Run Memory Management

The `scripts/docker/2-memory-management.sh` will execute the experiment with the exact parameters used in the TPDS submission.
Use the following command to execute the experiment in the Docker image.
Completing all of the runs takes 1 hour and requires 32 cores and 64 GB of memory.

```bash
docker run --rm -it -v $PWD:/analysis tpds-proxystore /bin/bash /analysis/scripts/docker/2-memory-management.sh
```

This will produce four output directories for each of the memory management methods in `data/docker/2-memory-management/`.
Each directory will contain the runtime log (`log.txt`) and results (`result.csv`).

### Analyze Results

A Jupyter Notebook is provided for each experiment to analyze process results and create figures.
The Jupyter Lab server can be started in the Docker container.

```bash
docker run --rm -it -p 8888:8888 -v $PWD:/analysis tpds-proxystore /bin/bash jupyter-lab --ip 0.0.0.0 --no-browser --allow-root
```

The Jupyter Lab can be opened on the local host using the printed link that looks like:
```
http://127.0.0.1:8888/lab?token=
```

The `notebooks/` directory contains one notebook for each experiment in the TPDS submission.
Open one of `notebooks/0-task-pipelining.ipynb`, `notebooks/1-stream-scaling.ipynb`, and `notebooks/2-memory-management.ipynb` to analyze the data produced by the above experiments.
At the bottom of each notebook is a section called "Analyze Results" containing a `path` variable to the results of the experiment you want to analyze.
Update the `path` variable and run the notebook to produce figures.