{"id":19071719,"url":"https://github.com/blackhc/toma","last_synced_at":"2025-04-12T23:30:14.837Z","repository":{"id":41425456,"uuid":"254126044","full_name":"BlackHC/toma","owner":"BlackHC","description":"Helps you write algorithms in PyTorch that adapt to the available (CUDA) memory","archived":false,"fork":false,"pushed_at":"2024-08-29T14:22:44.000Z","size":71,"stargazers_count":435,"open_issues_count":3,"forks_count":10,"subscribers_count":9,"default_branch":"master","last_synced_at":"2025-04-04T02:51:12.684Z","etag":null,"topics":["data-science","gpu","machine-learning","python","pytorch"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/BlackHC.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-04-08T15:25:03.000Z","updated_at":"2025-03-18T21:45:15.000Z","dependencies_parsed_at":"2024-11-09T01:30:28.803Z","dependency_job_id":"c59592a9-a3bb-4d54-b4bd-cc813dedcf2e","html_url":"https://github.com/BlackHC/toma","commit_stats":{"total_commits":26,"total_committers":1,"mean_commits":26.0,"dds":0.0,"last_synced_commit":"10cfe70efaba59ea669c50c0060cfddef65d0b16"},"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BlackHC%2Ftoma","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BlackHC%2Ftoma/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BlackHC%2Ftoma/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/
GitHub/repositories/BlackHC%2Ftoma/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/BlackHC","download_url":"https://codeload.github.com/BlackHC/toma/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248647249,"owners_count":21139081,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data-science","gpu","machine-learning","python","pytorch"],"created_at":"2024-11-09T01:30:13.190Z","updated_at":"2025-04-12T23:30:14.791Z","avatar_url":"https://github.com/BlackHC.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Torch Memory-adaptive Algorithms (TOMA)\n\n[![Build Status](https://www.travis-ci.com/BlackHC/toma.svg?branch=master)](https://www.travis-ci.com/BlackHC/toma) [![codecov](https://codecov.io/gh/BlackHC/toma/branch/master/graph/badge.svg)](https://codecov.io/gh/BlackHC/toma) [![PyPI](https://img.shields.io/badge/PyPI-toma-blue.svg)](https://pypi.python.org/pypi/toma/)\n\nA collection of helpers to make it easier to write code that adapts to the available (CUDA) memory.\nSpecifically, it retries code that fails due to OOM (out-of-memory) conditions and lowers batchsizes automatically. 
\n\nTo avoid failing over repeatedly, a simple cache is implemented that memorizes the last successful batchsize given the call and available free memory.\n\n## Installation\n\nTo install using pip, use:\n\n```\npip install toma\n```\n\nTo run the tests, use:\n\n```\npython setup.py test\n```\n\n## Example\n\n```python\nfrom toma import toma\n\n@toma.batch(initial_batchsize=512)\ndef run_inference(batchsize, model, dataset):\n    # ...\n\nrun_inference(model, dataset)\n```\n\nThis will try to execute `run_inference` with batchsize=512. If a memory error is thrown, it will decrease the batchsize until it succeeds.\n\n**Note:** \nThis batch size can be different from the batch size used to accumulate gradients by only calling `optimizer.step()` every so often.\n\nTo make it easier to loop over ranges, there are also `toma.range` and `toma.chunked`:\n\n```python\n@toma.chunked(initial_step=512)\ndef compute_result(out: torch.Tensor, start: int, end: int):\n    # ...\n\nresult = torch.empty((8192, ...))\ncompute_result(result)\n```\n\nThis will chunk `result` and pass the chunks to `compute_result` one by one. \nAgain, if it fails due to OOM, the step will be halved, and so on.\nCompared to `toma.batch`, this allows for reduction of the step size while looping over the chunks.\nThis can save computation.\n\n```python\n@toma.range(initial_step=32)\ndef reduce_data(start: int, end: int, out: torch.Tensor, dataA: torch.Tensor, dataB: torch.Tensor):\n    # ...\n\nreduce_data(0, 1024, result, dataA, dataB)\n``` \n\n`toma.range` iterates over `range(start, end, step)` with `step=initial_step`. If it fails due to OOM, it will lower the step size and continue.\n\n### `toma.execute`\n\nTo make it easier to just execute a block without having to extract it into a function and then call it, we also provide `toma.execute.batch`, `toma.execute.range` and `toma.execute.chunked`, which are somewhat unorthodox and call the function that is passed to them right away. 
(Mainly because there is no support for anonymous functions in Python beyond lambda expressions.)\n\n```python\ndef function():\n    # ... other code\n\n    @toma.execute.chunked(batched_data, initial_step=128)\n    def compute(chunk, start, end):\n        # ...\n```\n\n## Cache\n\nThree cache types are currently available. \nThe cache type can be changed either by setting `toma.DEFAULT_CACHE_TYPE` or by passing `cache_type` to the calls.\n\nFor example:\n```python\n@toma.batch(initial_batchsize=512, cache_type=toma.GlobalBatchsizeCache)\n```\nor\n```python\ntoma.explicit.batch(..., toma_cache_type=toma.GlobalBatchsizeCache)\n```\n\n### `StacktraceMemoryBatchsizeCache`: Stacktrace \u0026 Available Memory (*the default*)\n\nThis memorizes the successful batchsizes for a given call trace and available memory at that point.\nFor most machine learning code, this is sufficient to remember the right batchsize without having to look at the actual arguments or understand more of the semantics.\n\nThe implicit assumption is that after a few iterations a stable state will be reached with regard to GPU and CPU memory usage.\n\nTo limit the CPU memory of the process, toma provides:\n```python\nimport toma.cpu_memory\n\ntoma.cpu_memory.set_cpu_memory_limit(8)\n```\nThis can also be useful to avoid accidental swap thrashing.\n\n### `GlobalBatchsizeCache`: Global per Function\n\nThis reuses the last successful batchsize independently of where the call happened.\n\n### `NoBatchsizeCache`: No Caching\n\nAlways starts with the suggested batchsize and fails over if necessary.\n\n## Benchmark/Overhead\n\nThere is overhead involved. 
Toma should only be used with otherwise time/memory-consuming operations.\n\n```text\n---------------------------------------------------------------------------------- benchmark: 5 tests ----------------------------------------------------------------------------------\nName (time in ms)          Min                Max               Mean            StdDev             Median                IQR            Outliers       OPS            Rounds  Iterations\n----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------\ntest_native             2.1455 (1.0)       3.7733 (1.0)       2.3037 (1.0)      0.1103 (1.0)       2.2935 (1.0)       0.1302 (1.0)          81;5  434.0822 (1.0)         448           1\ntest_simple            17.4657 (8.14)     27.0049 (7.16)     21.0453 (9.14)     2.6233 (23.79)    20.4881 (8.93)      3.4384 (26.42)        13;0   47.5165 (0.11)         39           1\ntest_toma_no_cache     31.4380 (14.65)    40.8567 (10.83)    33.2749 (14.44)    2.2530 (20.43)    32.2698 (14.07)     2.8210 (21.67)         4;1   30.0527 (0.07)         25           1\ntest_explicit          33.0759 (15.42)    52.1866 (13.83)    39.6956 (17.23)    6.9620 (63.14)    38.4929 (16.78)    11.2344 (86.31)         4;0   25.1917 (0.06)         20           1\ntest_toma              36.9633 (17.23)    57.0220 (15.11)    43.5201 (18.89)    6.7318 (61.05)    41.6034 (18.14)     7.2173 (55.45)         2;2   22.9779 (0.05)         13           1\n----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------\n```\n\n## Thanks\n\nThanks to [@y0ast](https://github.com/y0ast) for feedback and 
discussion.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fblackhc%2Ftoma","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fblackhc%2Ftoma","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fblackhc%2Ftoma/lists"}