https://github.com/somajitdey/runman
Linux job scheduler / manager based on task-spooler. Notifies via Telegram on job completion.
job-scheduler linux run-manager task-spooler telegram-bot workload-management
- Host: GitHub
- URL: https://github.com/somajitdey/runman
- Owner: SomajitDey
- License: gpl-3.0
- Created: 2023-04-06T20:40:47.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2023-05-11T14:21:04.000Z (over 1 year ago)
- Last Synced: 2024-10-12T12:18:44.334Z (3 months ago)
- Topics: job-scheduler, linux, run-manager, task-spooler, telegram-bot, workload-management
- Language: Shell
- Homepage:
- Size: 48.8 KB
- Stars: 1
- Watchers: 1
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Run Man(ager)
Run Man (`runman`) is a humble wrapper around [task-spooler](http://viric.name/soft/ts/) (`tsp`) for easily scheduling and managing jobs on any multicore Linux node. Basically, it's nothing but a poor and lazy man's workload manager. If [SLURM](https://slurm.schedmd.com/documentation.html), [PBS](https://www.openpbs.org/) and the like seem like overkill for your multiprocessing needs, and vanilla `tsp` seems too low-level, then `runman` might be your thing.

### Features
- System-wide task-spooler queue with as many slots as there are CPUs in the system.
- `USER` submits a [`.job`](/example.job) file that lists her requirements, such as
  - command/script/executable to be run
  - number of processors needed
  - paths to store STDOUT and/or STDERR
  - max. execution time (wall clock)
  - working directory
  - dependency on a previous job
- `USER` is notified via [Telegram](https://telegram.org/) on job completion by a [bot](https://github.com/fabianonline/telegram.sh).
- Usual job management tools such as status query, abort/removal, and prioritization.
- Job arrays similar to [PBS job arrays](https://centers.hpc.mil/users/advancedTopics/Job_Arrays.html)

### Installation
```bash
# Download
curl -LO https://raw.githubusercontent.com/SomajitDey/runman/main/runman
# Set permission
chmod a+rx ./runman
# Install at PATH
sudo install ./runman /usr/local/bin/
# Local installation (for non-sudoers).
export PATH="${HOME}/.bin:${PATH}" # Put this in ~/.bashrc.
install -D -t "${HOME}"/.bin/ ./runman
```
Also install [`tsp`](https://command-not-found.com/tsp), [`telegram`](https://github.com/fabianonline/telegram.sh#installation--configuration), [`jq`](https://command-not-found.com/jq) and [`curl`](https://command-not-found.com/curl).

### Usage
```bash
runman help
```

### Pro tip
In a *trusted* environment, the sysadmin may [create](https://github.com/fabianonline/telegram.sh#installation--configuration) a single Telegram bot to serve all the `USER`s. The token may be saved in the global config `/etc/telegram.sh.conf` as the key-value pair:
```bash
TELEGRAM_TOKEN='123456:AbcDefGhi-JlkMno'
```
In such a case, all the `USER`s must be sent the Telegram address of the bot: `@` or `https://t.me/` so they can subscribe to the feed using
```bash
runman tgfeed
```

### Known Issue(s)
- The key-value pair `stderr=` provided in the `.job` file doesn't work, at least in Ubuntu 20.04. The underlying reason seems to be a bug in task-spooler itself: `tsp -E` doesn't work as expected.
- `runman` can't kill orphaned processes spawned by a job that has finished. The orphaned processes may consume CPU time in the background, unbeknownst to `runman` or its core `tsp`. Hence, until this is fixed, it's the job of the user to make sure her job doesn't create (resource-consuming) orphaned processes. For example, if her shell script spawns processes in the background with `&`, she must use the `wait` statement at the end of her script, so that her job finishes only after all her spawned processes have exited.
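
The `wait` advice above can be sketched as a small job script. This is only an illustration: `worker` and `run_job` are hypothetical names, and `sleep` stands in for a real workload. The point is that the script does not return until every background child has exited, so no orphans outlive the job.

```bash
#!/usr/bin/env bash
# Sketch of a job script that leaves no orphaned processes behind.
# `worker` is a hypothetical stand-in for a real workload.
worker() {
  sleep 0.2
  echo "worker $1 done"
}

run_job() {
  local i
  for i in 1 2 3; do
    worker "$i" &   # start each worker in the background
  done
  wait              # block until every background child has exited
  echo "all workers finished"
}

run_job
```

Without the `wait` line, `run_job` would return while the workers were still running, and the spooler would consider the job finished even though its children kept consuming CPU.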