https://github.com/cea-list/sprofile
A program to print consumed resources at the end of a slurm job.
- Host: GitHub
- URL: https://github.com/cea-list/sprofile
- Owner: CEA-LIST
- License: other
- Created: 2022-07-07T10:04:34.000Z (over 3 years ago)
- Default Branch: master
- Last Pushed: 2024-02-09T09:50:38.000Z (about 2 years ago)
- Last Synced: 2024-02-13T09:42:11.057Z (about 2 years ago)
- Topics: accounting, cgroups, cpu, deep-learning, gpu, hpc, optimization, profiling, resources, slurm
- Language: Python
- Homepage:
- Size: 17.6 KB
- Stars: 6
- Watchers: 4
- Forks: 3
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE.txt
README
# SProfile
Sprofile prints the CPU, RAM and GPU resources consumed by a Slurm job when the job ends.
It parses resource usage information that is already available on the node and therefore incurs no overhead.
Sprofile can be installed from PyPI or from source:
```sh
pip install sprofile
```
For CPU and RAM statistics, Slurm must be configured to use the cgroup plugin.
For GPU resource information, accounting mode must be enabled in the NVIDIA driver (`nvidia-smi --accounting-mode=1`).
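Both prerequisites can be checked before relying on the report. The following is a minimal sketch, not part of sprofile: the helper names (`gpu_accounting_enabled`, `cgroup_settings`, `check_node`) are hypothetical, and it assumes that `nvidia-smi` and `scontrol` are on the `PATH`. Which cgroup-related Slurm setting sprofile actually depends on is not stated here, so the sketch simply lists every setting whose value mentions cgroup.

```python
import subprocess


def gpu_accounting_enabled(smi_output):
    """Return True if every GPU line in the nvidia-smi output reads 'Enabled'."""
    modes = [line.strip() for line in smi_output.splitlines() if line.strip()]
    return bool(modes) and all(mode == "Enabled" for mode in modes)


def cgroup_settings(scontrol_output):
    """Return the 'key = value' settings whose value mentions cgroup."""
    settings = {}
    for line in scontrol_output.splitlines():
        key, sep, value = line.partition("=")
        if sep and "cgroup" in value.lower():
            settings[key.strip()] = value.strip()
    return settings


def run(cmd):
    """Run a command and return its standard output as text."""
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout


def check_node():
    """Query the local node; must run where nvidia-smi and scontrol exist."""
    smi = run(["nvidia-smi", "--query-gpu=accounting.mode", "--format=csv,noheader"])
    conf = run(["scontrol", "show", "config"])
    return gpu_accounting_enabled(smi), cgroup_settings(conf)
```

Calling `check_node()` on a compute node returns a boolean for the GPU accounting mode and a dictionary of the cgroup-related Slurm settings; an empty dictionary suggests that no cgroup plugin is configured.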
To use sprofile, add the following lines at the beginning and the end of the Slurm batch script:
```sh
#!/usr/bin/env sh
...
srun sprofile start
...
srun sprofile stop
```
The last command will print actual resource utilization:
```
-- sprofile report (node27) --
Time: 0:00:25 / 1:00:00
CPU load: 0.9 / 2.0
RAM peak: 3G / 8G
GPU load: 0.9 / 1.0
GPU peak mem: 3G / 32G
GPU energy: 0.0kWh
```
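To spot over-allocated jobs, the report can be turned into utilization ratios. The sketch below is an illustration only, not part of sprofile: it assumes each `a / b` line reads as "used / allocated", that sizes use binary K/M/G/T suffixes, and that times are in `H:MM:SS` form (Slurm's `D-HH:MM:SS` form is not handled). The names `parse_report` and `to_number` are hypothetical.

```python
import re

# Matches lines such as "CPU load: 0.9 / 2.0": label, first value, second value.
LINE_RE = re.compile(
    r"^\s*(?P<label>[^:]+):\s*(?P<used>\S+)\s*/\s*(?P<total>\S+)\s*$"
)

# Assumption: size suffixes are binary multiples.
UNITS = {"K": 2**10, "M": 2**20, "G": 2**30, "T": 2**40}


def to_number(text):
    """Convert '0:00:25' to seconds, '3G' to bytes, and '0.9' to a float."""
    if ":" in text:
        seconds = 0.0
        for part in text.split(":"):
            seconds = seconds * 60 + float(part)
        return seconds
    if text[-1] in UNITS:
        return float(text[:-1]) * UNITS[text[-1]]
    return float(text)


def parse_report(report):
    """Return {label: first value / second value} for every 'a / b' line."""
    ratios = {}
    for line in report.splitlines():
        match = LINE_RE.match(line)
        if match:
            used = to_number(match["used"])
            total = to_number(match["total"])
            ratios[match["label"].strip()] = used / total
    return ratios


SAMPLE = """\
-- sprofile report (node27) --
Time: 0:00:25 / 1:00:00
CPU load: 0.9 / 2.0
RAM peak: 3G / 8G
GPU load: 0.9 / 1.0
GPU peak mem: 3G / 32G
GPU energy: 0.0kWh
"""

for label, ratio in parse_report(SAMPLE).items():
    print(f"{label}: {ratio:.1%}")
```

Lines without a `/` separator, such as the header and the GPU energy line, are skipped. A low ratio for RAM or GPU memory hints that a smaller request would be enough for the next submission.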