Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/JaneliaSciComp/ray-janelia
Run Python Ray on the Janelia cluster
https://github.com/JaneliaSciComp/ray-janelia
cluster compute-clusters distributed-computing janelia ray
Last synced: 2 months ago
JSON representation
Run Python Ray on the Janelia cluster
- Host: GitHub
- URL: https://github.com/JaneliaSciComp/ray-janelia
- Owner: JaneliaSciComp
- License: bsd-3-clause
- Created: 2022-05-31T14:35:08.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2022-09-08T17:07:37.000Z (over 2 years ago)
- Last Synced: 2024-05-03T05:09:23.809Z (8 months ago)
- Topics: cluster, compute-clusters, distributed-computing, janelia, ray
- Language: Shell
- Homepage:
- Size: 18.6 KB
- Stars: 2
- Watchers: 5
- Forks: 2
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-janelia-software - ray-janelia - Python library for running Ray on the Janelia cluster (Janelia Site-specific Software)
- awesome-janelia-software - ray-janelia - Python library for running Ray on the Janelia cluster (Janelia Site-specific Software)
README
# ray-janelia
These scripts let you run [Ray](https://github.com/ray-project/ray) on the Janelia cluster (and maybe other LSF clusters).
You must have a Conda environment with [Ray installed](https://docs.ray.io/en/latest/ray-overview/installation.html).
## Create a cluster
This command will start a 20 slot cluster, using a conda environment called `ray-python`:
```bash
ray-janelia/ray-launch.sh -n 20 -e ray-python
```By default, the cluster will be divided into nodes of 4 slots each. To use a different tiling, specify the number of nodes you want with `-d `.
This command will start a cluster with 20 CPU and 2 GPU slots on a GPU enabled queue `gpu_queue`:
```bash
ray-janelia/ray-launch.sh -n 20 -e ray-python -b "-q gpu_queue -gpu num=2"
```## Run a job on a cluster
The output of launching the cluster above will print a remote address like `ray://head_node:10001`. You can simply pass this address into your job when creating your Ray client, like this:
```python
ray.init(address="ray://head_node:10001")
```The output will also print the address of the [Ray dashboard](https://docs.ray.io/en/latest/ray-core/ray-dashboard.html) for the launched Ray cluster.
## Create a cluster, run a job, then shut it down
Another option is to create a cluster and run a python job with a single command:
```bash
./ray-launch.sh -n 20 -e ray-python -p "/path/to/job.py --options"
```In this case, to connect to the Ray cluster created with the `ray-launch.sh` script, the python script `job.py` should contain:
```python
ray.init(address="auto")
```When the python script completes, the Ray cluster will be automatically shut down and the Janelia cluster job will be terminated.