{"id":20962223,"url":"https://github.com/lartpang/runit","last_synced_at":"2025-09-19T02:32:36.954Z","repository":{"id":39118552,"uuid":"376543811","full_name":"lartpang/RunIt","owner":"lartpang","description":"A simple program scheduler for your code on different devices.","archived":false,"fork":false,"pushed_at":"2024-08-15T04:29:17.000Z","size":35,"stargazers_count":11,"open_issues_count":3,"forks_count":1,"subscribers_count":3,"default_branch":"main","last_synced_at":"2024-12-28T23:32:37.193Z","etag":null,"topics":["deeplearning-tool","multi-gpu-scheduler","multi-process","multi-process-scheduler","python","python3","scheduler","scheduler-tool","single-file-scheduler","tool","utility"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/lartpang.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-06-13T12:54:59.000Z","updated_at":"2024-08-15T04:29:20.000Z","dependencies_parsed_at":"2024-04-19T08:51:25.830Z","dependency_job_id":"ad4c0490-e4ab-4954-9567-1b7f4b5a86d2","html_url":"https://github.com/lartpang/RunIt","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lartpang%2FRunIt","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lartpang%2FRunIt/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lartpang%2FRunIt/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lart
pang%2FRunIt/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/lartpang","download_url":"https://codeload.github.com/lartpang/RunIt/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":233546487,"owners_count":18692228,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["deeplearning-tool","multi-gpu-scheduler","multi-process","multi-process-scheduler","python","python3","scheduler","scheduler-tool","single-file-scheduler","tool","utility"],"created_at":"2024-11-19T02:24:57.477Z","updated_at":"2025-09-19T02:32:31.686Z","avatar_url":"https://github.com/lartpang.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# RunIt\n\n\u003e [!NOTE]\n\u003e This tool still has some limitations.\n\u003e If you run into any problems while using it, please feel free to ask.\n\nA simple program scheduler for your code on different devices.\n\nLet the machine move!\n\nPutting the machine to sleep is a disrespect for time.\n\n## Usage\n\n\u003e [!note]\n\u003e\n\u003e 2024-8-14: The config file now contains the information about your GPUs and jobs. More details can be found in [config.py](./examples/config.py).\n\n### Dependency\n\n- PyYAML==6.0\n- nvidia-ml-py (`pynvml`, only needed for `runit_based_on_detected_memory.py`)\n\n### Scripts\n\nWe provide 3 scripts for different ways to run jobs.\n\n- `runit_with_exclusive_gpu.py`: One GPU can only be used by one job at a time.\n- `runit_based_on_memory`: One GPU can be used by many jobs at a time, based on their memory usage.\n- 
`runit_based_on_detected_memory.py`: Use `pynvml` to detect the total memory usage of each GPU. *This may not be suitable for scenarios where the memory used by a running GPU application is unstable.*\n\n## Demo\n\n```shell\n$ python run_it.py --config ./examples/config.yaml\n$ python run_it.py --max-workers 3 --config ./examples/config.yaml\n```\n\n```mermaid\ngraph TD\n    A[Start] --\u003e B[Read Configuration and Command Pool]\n    B --\u003e C[Initialize Shared Resources]\n    C --\u003e |Maximum number of requirements met| D[Loop Until All Jobs Done]\n    D --\u003e E[Check Available GPUs]\n    E --\u003e|Enough GPUs| F[Run Job in Separate Process]\n    E --\u003e|Not Enough GPUs| G[Wait and Retry]\n    F --\u003e H[Job Completes]\n    F --\u003e I[Job Fails]\n    H --\u003e J[Update Job Status and Return GPUs]\n    I --\u003e J\n    G --\u003e D\n    J --\u003e|All Jobs Done| K[End]\n    C --\u003e|Maximum number of requirements not met| L[Terminate Workers]\n    L --\u003e M[Shutdown Manager and Join Pool]\n    M --\u003e K\n```\n\n## Thanks\n\n- [@BitCalSaul](https://github.com/BitCalSaul): Thanks for the positive feedback!\n  - \u003chttps://github.com/lartpang/RunIt/issues/3\u003e\n  - \u003chttps://github.com/lartpang/RunIt/issues/2\u003e\n  - \u003chttps://github.com/lartpang/RunIt/issues/1\u003e\n- https://www.jb51.net/article/142787.htm\n- https://docs.python.org/zh-cn/3/library/subprocess.html\n- https://stackoverflow.com/a/23616229\n- https://stackoverflow.com/a/14533902\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flartpang%2Frunit","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flartpang%2Frunit","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flartpang%2Frunit/lists"}