https://github.com/vwxyzjn/launcha

Launcha is a simple Docker-based cloud job launcher.

Host: GitHub
URL: https://github.com/vwxyzjn/launcha
Owner: vwxyzjn
Created: 2021-11-17T00:47:27.000Z (over 3 years ago)
Default Branch: main
Last Pushed: 2022-04-19T02:58:14.000Z (almost 3 years ago)
Last Synced: 2024-11-09T18:49:19.512Z (3 months ago)
Language: Python
Homepage:
Size: 3.68 MB
Stars: 1
Watchers: 2
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

# Launcha

Launcha is a Docker-based cloud job launcher.

## Quick Start

Set up Docker's `buildx` and log in to your preferred registry.

```
docker buildx create --use
docker login
```

Then you can build a container image with the `--build` flag, based on the `Dockerfile` in the current directory. Adding `--push` also pushes the image to the Docker registry.

```
poetry run python -m cleanrl_utils.submit_exp \
--docker-tag vwxyzjn/cleanrl:latest \
--command "poetry run python cleanrl/ppo.py --gym-id CartPole-v1 --total-timesteps 100000 --track --capture-video" \
--build --push
```

### Inspection

Do a dry run to inspect the generated Docker command:
```
poetry run python -m cleanrl_utils.submit_exp \
--docker-tag vwxyzjn/cleanrl:latest \
--command "poetry run python cleanrl/ppo.py --gym-id CartPole-v1 --total-timesteps 100000 --track --capture-video" \
--num-seed 1
```

The generated Docker command should look like this:
```
docker run -d --cpuset-cpus="0" -e WANDB_API_KEY=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx vwxyzjn/cleanrl:latest /bin/bash -c "poetry run python cleanrl/ppo.py --gym-id CartPole-v1 --total-timesteps 100000 --track --capture-video --seed 1"
```
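
To make the shape of that command easier to follow, here is a minimal Python sketch of how such a `docker run` line could be assembled. It is not Launcha's actual implementation: the function name `build_docker_commands`, the seed numbering starting at 1, and the rule that each seed is pinned to CPU core `seed - 1` are all assumptions inferred from the single example output above.

```python
def build_docker_commands(docker_tag, command, num_seed=1, wandb_key="xxxx"):
    """Return one `docker run` command string per seed (hypothetical helper)."""
    commands = []
    for seed in range(1, num_seed + 1):
        # Assumption: each seed gets its own CPU core, starting at core 0.
        cpu = seed - 1
        # The seed is appended to the user-supplied command.
        seeded = f"{command} --seed {seed}"
        commands.append(
            f'docker run -d --cpuset-cpus="{cpu}" '
            f"-e WANDB_API_KEY={wandb_key} {docker_tag} "
            f'/bin/bash -c "{seeded}"'
        )
    return commands


if __name__ == "__main__":
    for line in build_docker_commands(
        "vwxyzjn/cleanrl:latest",
        "poetry run python cleanrl/ppo.py --gym-id CartPole-v1",
        num_seed=2,
    ):
        print(line)
```

Printing the commands instead of executing them mirrors the dry-run idea: you can check the tag, environment variables, and per-seed arguments before anything is launched.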

### Run on AWS

Submit a job using AWS's compute-optimized spot instances:
```
poetry run python -m cleanrl_utils.submit_exp \
--docker-tag vwxyzjn/cleanrl:latest \
--command "poetry run python cleanrl/ppo.py --gym-id CartPole-v1 --total-timesteps 100000 --track --capture-video" \
--job-queue c5a-large-spot \
--num-seed 1 \
--num-vcpu 1 \
--num-memory 2000 \
--num-hours 48.0 \
--provider aws
```

Submit a job using AWS's accelerated-computing spot instances:
```
poetry run python -m cleanrl_utils.submit_exp \
--docker-tag vwxyzjn/cleanrl:latest \
--command "poetry run python cleanrl/ppo_atari.py --gym-id BreakoutNoFrameskip-v4 --track --capture-video" \
--job-queue g4dn-xlarge-spot \
--num-seed 1 \
--num-vcpu 1 \
--num-gpu 1 \
--num-memory 4000 \
--num-hours 48.0 \
--provider aws
```

Submit a job using AWS's compute-optimized on-demand instances:
```
poetry run python -m cleanrl_utils.submit_exp \
--docker-tag vwxyzjn/cleanrl:latest \
--command "poetry run python cleanrl/ppo.py --gym-id CartPole-v1 --total-timesteps 100000 --track --capture-video" \
--job-queue c5a-large \
--num-seed 1 \
--num-vcpu 1 \
--num-memory 2000 \
--num-hours 48.0 \
--provider aws
```

Submit a job using AWS's accelerated-computing on-demand instances:
```
poetry run python -m cleanrl_utils.submit_exp \
--docker-tag vwxyzjn/cleanrl:latest \
--command "poetry run python cleanrl/ppo_atari.py --gym-id BreakoutNoFrameskip-v4 --track --capture-video" \
--job-queue g4dn-xlarge \
--num-seed 1 \
--num-vcpu 1 \
--num-gpu 1 \
--num-memory 4000 \
--num-hours 48.0 \
--provider aws
```

Then you should see:

![aws_batch1.png](imgs/aws_batch1.png)
![aws_batch2.png](imgs/aws_batch2.png)
![wandb.png](imgs/wandb.png)
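
The screenshots and the `--job-queue` flag suggest that jobs are submitted through AWS Batch. As a rough sketch under that assumption, the flags above could map onto the keyword arguments of boto3's Batch `submit_job` call as shown below. The helper name `build_batch_job_request`, the `job_definition` parameter, the job name pattern, and the reading of `--num-memory` as MiB are all hypothetical; Launcha's real code may differ.

```python
def build_batch_job_request(command, job_queue, job_definition, seed,
                            num_vcpu=1, num_memory=2000, num_gpu=0,
                            num_hours=48.0):
    """Build kwargs for boto3's Batch `submit_job` (hypothetical helper)."""
    resources = [
        {"type": "VCPU", "value": str(num_vcpu)},
        # Assumption: --num-memory is given in MiB, as AWS Batch expects.
        {"type": "MEMORY", "value": str(num_memory)},
    ]
    if num_gpu > 0:
        resources.append({"type": "GPU", "value": str(num_gpu)})
    return {
        "jobName": f"launcha-seed-{seed}",  # hypothetical naming scheme
        "jobQueue": job_queue,
        # The container image is configured in the job definition.
        "jobDefinition": job_definition,
        "containerOverrides": {
            "command": ["/bin/bash", "-c", f"{command} --seed {seed}"],
            "resourceRequirements": resources,
        },
        # AWS Batch expresses the time limit in seconds.
        "timeout": {"attemptDurationSeconds": int(num_hours * 3600)},
    }


if __name__ == "__main__":
    request = build_batch_job_request(
        "poetry run python cleanrl/ppo_atari.py --gym-id BreakoutNoFrameskip-v4",
        job_queue="g4dn-xlarge-spot",
        job_definition="cleanrl",  # hypothetical job definition name
        seed=1,
        num_gpu=1,
        num_memory=4000,
    )
    print(request)
    # With credentials configured, the request could then be sent with:
    #   boto3.client("batch").submit_job(**request)
```

Building the request as a plain dictionary keeps the sketch runnable without AWS credentials and makes it easy to inspect before submitting.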

## Multi-arch images

To build a multi-arch image, pass `--archs linux/arm64,linux/amd64`.

Note that building a multi-arch image is quite slow, but it lets you use ARM instances such as `m6gd.medium`, which are 20-70% cheaper than x86 instances.
However, to my knowledge no cloud provider offers ARM instances with Nvidia GPUs, so this effort might not be worth it.

If you still want to pursue multi-arch, you can speed things up by connecting a native ARM server to your `buildx` instance:

```
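# Check that the remote server's Docker daemon is reachable over SSH
docker -H ssh://costa@gpu info
# Create a builder named "remote" and make it the active builder
docker buildx create --name remote --use
# Append the remote server as an extra node of that builder
docker buildx create --name remote --append ssh://costa@gpu
# Start the builder and list its nodes and supported platforms
docker buildx inspect --bootstrap
# Run the launcher for both target architectures
python -m cleanrl_utils.submit_exp -b --archs linux/arm64,linux/amd64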
```
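
For orientation, the `--build`, `--push`, and `--archs` flags plausibly translate into a `docker buildx build` invocation. The sketch below assembles such an argument list in Python. The helper name `build_buildx_args` and the exact mapping are assumptions, not Launcha's confirmed behavior; `--tag`, `--platform`, and `--push` are standard `docker buildx build` options.

```python
def build_buildx_args(docker_tag, archs=None, push=False, context="."):
    """Assemble a `docker buildx build` argument list (hypothetical helper)."""
    args = ["docker", "buildx", "build", "--tag", docker_tag]
    if archs:
        # Comma-separated platforms, e.g. "linux/arm64,linux/amd64".
        args += ["--platform", archs]
    if push:
        # Push the resulting image to the registry after building.
        args.append("--push")
    # The build context, i.e. the directory that contains the Dockerfile.
    args.append(context)
    return args


if __name__ == "__main__":
    print(" ".join(build_buildx_args(
        "vwxyzjn/cleanrl:latest",
        archs="linux/arm64,linux/amd64",
        push=True,
    )))
```

The list form can be passed directly to `subprocess.run(args, check=True)`, which avoids shell quoting issues.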