{"id":27326812,"url":"https://github.com/parasj/checkmate","last_synced_at":"2025-04-12T11:59:38.267Z","repository":{"id":41053518,"uuid":"209406827","full_name":"parasj/checkmate","owner":"parasj","description":"Training neural networks in TensorFlow 2.0 with 5x less memory","archived":false,"fork":false,"pushed_at":"2022-02-21T18:35:56.000Z","size":573,"stargazers_count":124,"open_issues_count":2,"forks_count":15,"subscribers_count":9,"default_branch":"master","last_synced_at":"2024-04-15T14:21:41.394Z","etag":null,"topics":["compiler","deep-learning","distributed-systems","gpu-memory","memory","python","tensorflow","tensorflow2"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/parasj.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-09-18T21:26:09.000Z","updated_at":"2024-03-25T12:44:44.000Z","dependencies_parsed_at":"2022-07-14T06:40:35.592Z","dependency_job_id":null,"html_url":"https://github.com/parasj/checkmate","commit_stats":null,"previous_names":[],"tags_count":5,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/parasj%2Fcheckmate","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/parasj%2Fcheckmate/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/parasj%2Fcheckmate/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/parasj%2Fcheckmate/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/parasj","download_url":"https://codeload.github.com/parasj/checkmate/tar.gz/refs/
heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248565089,"owners_count":21125415,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["compiler","deep-learning","distributed-systems","gpu-memory","memory","python","tensorflow","tensorflow2"],"created_at":"2025-04-12T11:59:37.797Z","updated_at":"2025-04-12T11:59:38.261Z","avatar_url":"https://github.com/parasj.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"![](https://checkmateai.github.io/img/dark_logo.png)\n\n_See the paper!_ [https://arxiv.org/abs/1910.02653](https://arxiv.org/abs/1910.02653)\n\n`checkmate` breaks the GPU memory wall by enabling researchers to train large state-of-the-art models that do not fit in GPU memory. Checkmate applies optimal tensor rematerialization \\(as detailed in our paper at MLSys 2020\\) to trade off space and time.\n\nAt the moment, Checkmate only supports TensorFlow 2.0. PyTorch support is coming soon!\u003c!-- To follow updates on PyTorch support, please subscribe to our [Google Group](https://groups.google.com/forum/#!forum/checkmate-dev). --\u003e\n\n**IF YOU ARE TRYING TO REPLICATE OUR MLSYS 2020 PAPER, USE THE `mlsys20_artifact` BRANCH!**\n\n## Installation\n\nCheckmate depends on:\n* [TensorFlow 2.0](https://www.tensorflow.org/install), i.e. 
`pip install tensorflow` or `pip install tensorflow-gpu`.\n* [CyLP solver](https://github.com/coin-or/CyLP)\n    \u003cdetails\u003e\u003csummary\u003eInstalling CyLP on Debian Linux / Ubuntu\u003c/summary\u003e\n    \u003cp\u003e\n\n    ```bash\n    $ sudo apt install coinor-cbc coinor-libcbc-dev\n    $ pip install cylp\n    ```\n    \u003c/p\u003e\n    \u003c/details\u003e\n    \u003cdetails\u003e\u003csummary\u003eInstalling CyLP on macOS\u003c/summary\u003e\n    \u003cp\u003e\n    \n    The easiest way to set up CyLP is using [Homebrew](https://brew.sh/).\n    ```bash\n    $ brew tap coin-or-tools/coinor\n    $ brew install coin-or-tools/coinor/cbc pkg-config\n    $ pip install cylp\n    ```\n    \u003c/p\u003e\n    \u003c/details\u003e\n\n\nOnce TensorFlow 2.0 and CyLP are installed, Checkmate can be installed using pip via `pip install \"https://github.com/parasj/checkmate/archive/master.zip#egg=checkmate\"`.\n\n## Quick start\n\n**Get started in 5 minutes with our** [**TF2.0 quickstart tutorial**](https://colab.research.google.com/github/parasj/checkmate/blob/master/tutorials/tutorial_basic_tf2_example.ipynb)\n\nAdapt your Keras model to fit within the memory constraints of a single GPU:\n\n```python\nimport tensorflow as tf\nimport checkmate\nmodel = tf.keras.applications.vgg19.VGG19(...)\n...\n\ntrain_iteration_fn = checkmate.tf2.compile(model, loss, optimizer,\n    input_spec=sample_input[0], label_spec=sample_input[1])\n\nfor image, label in train_ds:\n    prediction, loss = train_iteration_fn(image, label)\n```\n\n## Key ideas\n\nFrom our [paper at MLSys 2020](https://arxiv.org/abs/1910.02653):\n```text\nModern neural networks are increasingly bottlenecked by the limited capacity of on-device\nGPU memory. Prior work explores dropping activations as a strategy to scale to larger\nneural networks under memory constraints. However, these heuristics assume uniform\nper-layer costs and are limited to simple architectures with linear graphs, limiting their\nusability. 
In this paper, we formalize the problem of trading-off DNN training time and\nmemory requirements as the tensor rematerialization optimization problem, a generalization\nof prior checkpointing strategies. We introduce Checkmate, a system that solves for\noptimal schedules in reasonable times (under an hour) using off-the-shelf MILP solvers,\nthen uses these schedules to accelerate millions of training iterations. Our method scales\nto complex, realistic architectures and is hardware-aware through the use of\naccelerator-specific, profile-based cost models. In addition to reducing training cost,\nCheckmate enables real-world networks to be trained with up to 5.1× larger input sizes.\n```\n\n## Citation\n\nIf you use Checkmate in your work, please cite us with:\n\n```text\n@incollection{mlsys2020_196,\n author = {Jain, Paras and Jain, Ajay and Nrusimha, Aniruddha and Gholami, Amir and Abbeel, Pieter and Gonzalez, Joseph and Keutzer, Kurt and Stoica, Ion},\n booktitle = {Proceedings of Machine Learning and Systems 2020},\n pages = {497--511},\n title = {Checkmate: Breaking the Memory Wall with Optimal Tensor Rematerialization},\n year = {2020}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fparasj%2Fcheckmate","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fparasj%2Fcheckmate","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fparasj%2Fcheckmate/lists"}