https://github.com/jinzcdev/occupied-gpus
A program for occupying GPUs.
deep-learning gpu
- Host: GitHub
- URL: https://github.com/jinzcdev/occupied-gpus
- Owner: jinzcdev
- License: mit
- Created: 2021-11-20T09:59:37.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2023-03-24T10:43:45.000Z (about 2 years ago)
- Last Synced: 2025-04-13T18:18:42.502Z (about 2 months ago)
- Topics: deep-learning, gpu
- Language: Python
- Homepage:
- Size: 14.6 KB
- Stars: 7
- Watchers: 0
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Occupied GPUs
A program for occupying GPUs.
## Preparation
```shell
pip install -r requirements.txt
```
> Note: Please make sure that `cuda` is available.
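The requirements file is not shown here; assuming it pulls in PyTorch (the usual CUDA framework for this kind of tool), a quick sanity check looks like:
```python
# Quick CUDA check (assumes PyTorch is the package installed by requirements.txt).
import torch

print(torch.cuda.is_available())   # should print True
print(torch.cuda.device_count())   # number of GPUs visible to this process
```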
## Usage
### Python Module
This repository is packaged and published on [PyPI](https://pypi.org). Run the following command to install it:
```shell
pip install occupiedgpus
```
Then occupy GPUs:
```shell
python -m occupiedgpus.core --gpu-ids 0,1,2,3 --epochs 120 --options 0
```
where `--options` can be `0` or `1`: `0` occupies a GPU only when it is otherwise unused, and `1` occupies whatever GPU memory is currently free, at any time.
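For intuition, here is a minimal sketch of what the two options roughly mean, assuming a PyTorch-based implementation; it is not the actual code of `occupiedgpus.core`, and the chunk size and idle threshold are illustrative values only.
```python
# Rough sketch of the occupation idea, NOT the package's actual implementation.
# Assumes PyTorch with CUDA; chunk size and idle threshold are illustrative.
import time
import torch

def looks_idle(device: torch.device, free_ratio: float = 0.9) -> bool:
    # Treat the GPU as unused when most of its memory is still free.
    free_b, total_b = torch.cuda.mem_get_info(device)
    return free_b / total_b > free_ratio

def occupy(gpu_id: int, option: int, chunk_mb: int = 256) -> None:
    device = torch.device(f"cuda:{gpu_id}")
    held = []                          # tensors kept alive so the memory stays allocated
    started = option == 1              # option 1: start grabbing memory immediately
    while True:
        started = started or looks_idle(device)   # option 0: wait until the GPU looks unused
        free_b, _ = torch.cuda.mem_get_info(device)
        if started and free_b > 2 * chunk_mb * 2**20:   # leave room for the matmul output
            n = int((chunk_mb * 2**20 / 4) ** 0.5)      # float32 square matrix of ~chunk_mb
            held.append(torch.randn(n, n, device=device))
            torch.matmul(held[-1], held[-1])            # some compute so utilization is non-zero
        time.sleep(1)
```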
### Source Code
Clone this repository to your local machine and activate your Python environment.
```shell
git clone https://github.com/jinzcdev/occupied-gpus.git
```
**Single process**
To occupy the corresponding GPUs in a single process, run:
```shell
sh train.sh 0,1,2,3 [option]
```
or
```shell
chmod u+x ./train.sh
./train.sh 0,1,2,3 [option]
```
where `0,1,2,3` specifies GPUs 0-3 to be occupied. `[option]` can be `0` or `1` as described above; the default is `0`.
**Multi-process**
To occupy GPUs faster with multiple processes, run in **bash (NOT sh)**:
```shell
bash ./multi_train.sh 0,1,2,3 [option] [port]
```
where `[option]` can be `0` or `1` as described above (default `0`). The default `[port]` is 54886; change it if a port conflict occurs.
> Note: If your PyTorch version is older than 1.9.0, replace `torchrun` with `python -m torch.distributed.launch --use_env` in [multi_train.sh](multi_train.sh).
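For intuition, the multi-process version boils down to one worker process per GPU ID; the script itself launches workers via `torchrun`/`torch.distributed`, so the following is only a simplified sketch of that idea using Python's standard `multiprocessing` module, with a trivial occupation loop standing in for the real one.
```python
# Simplified sketch of the multi-process idea, NOT what multi_train.sh actually runs
# (the script launches its workers via torchrun / torch.distributed).
import multiprocessing as mp
import time
import torch

def occupy(gpu_id: int) -> None:
    # Trivial stand-in for the real occupation loop: hold one tensor and keep it busy.
    device = torch.device(f"cuda:{gpu_id}")
    x = torch.randn(4096, 4096, device=device)
    while True:
        torch.matmul(x, x)
        time.sleep(0.1)

def main(gpu_ids=(0, 1, 2, 3)) -> None:
    # One worker process per GPU so all of them are filled in parallel.
    procs = [mp.Process(target=occupy, args=(gpu,)) for gpu in gpu_ids]
    for p in procs:
        p.start()
    for p in procs:
        p.join()

if __name__ == "__main__":
    mp.set_start_method("spawn")   # safer than fork when child processes use CUDA
    main()
```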
## Run in the background
If you want to run the code in the background, run one of:
```shell
nohup sh train.sh ${GPU_IDS} &>> /dev/null &
# or
nohup sh multi_train.sh ${GPU_IDS} &>> /dev/null &
# or
nohup python -m occupiedgpus.core --gpu-ids ${GPU_IDS} --epochs 120 --options 0 &>> /dev/null &
```
> Replace `${GPU_IDS}` with `0,1,...`