https://github.com/prigoyal/pytorch_memonger

Experimental ground for optimizing memory of pytorch models
memory-management pytorch

Last synced: 3 months ago

Host: GitHub
URL: https://github.com/prigoyal/pytorch_memonger
Owner: prigoyal
License: gpl-3.0
Created: 2017-12-05T18:37:15.000Z (over 7 years ago)
Default Branch: master
Last Pushed: 2018-04-23T20:24:58.000Z (about 7 years ago)
Last Synced: 2025-04-02T11:05:25.332Z (3 months ago)
Topics: memory-management, pytorch
Language: Python
Homepage:
Size: 203 KB
Stars: 366
Watchers: 10
Forks: 35
Open Issues: 7
Metadata Files:
- Readme: README.md
- License: LICENSE

README

# PyTorch Memory optimizations via gradient checkpointing

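This repository contains implementations of various PyTorch models using
**gradient checkpointing**[1], which trades compute for memory: activations inside a
checkpointed segment are not stored during the forward pass and are recomputed during
the backward pass. This allows training bigger/wider models and using larger minibatch sizes.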
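
The trade-off can be illustrated with a minimal, framework-free sketch. It is not the code used in this repository: the scalar `tanh` "layers" and the names `backward_full` and `backward_checkpointed` are hypothetical and chosen only for illustration. The baseline keeps every layer input for the backward pass, while the checkpointed variant keeps one input per segment and recomputes the rest.

```python
import math


def layer(x):
    # Toy scalar "layer" (an assumption made for this illustration).
    return math.tanh(x)


def layer_grad(x):
    # Derivative of the toy layer with respect to its input x.
    return 1.0 - math.tanh(x) ** 2


def backward_full(x, n_layers):
    """Baseline: keep the input of every layer for the backward pass."""
    acts = []
    h = x
    for _ in range(n_layers):
        acts.append(h)
        h = layer(h)
    grad = 1.0
    for a in reversed(acts):
        grad *= layer_grad(a)
    return grad, len(acts)


def backward_checkpointed(x, n_layers, segment):
    """Keep only the input of each segment; recompute the rest on the way back."""
    checkpoints = {}
    h = x
    for i in range(n_layers):
        if i % segment == 0:
            checkpoints[i] = h
        h = layer(h)
    grad = 1.0
    peak = len(checkpoints)
    for start in sorted(checkpoints, reverse=True):
        end = min(start + segment, n_layers)
        # Recompute the inputs of layers start..end-1 from the checkpoint.
        acts = [checkpoints[start]]
        for _ in range(start, end - 1):
            acts.append(layer(acts[-1]))
        # acts[0] is the checkpoint itself, so do not count it twice.
        peak = max(peak, len(checkpoints) + len(acts) - 1)
        for a in reversed(acts):
            grad *= layer_grad(a)
    return grad, peak


g_full, stored_full = backward_full(0.5, 16)
g_ckpt, stored_ckpt = backward_checkpointed(0.5, 16, segment=4)
print(g_full, g_ckpt, stored_full, stored_ckpt)
```

Both functions return the same gradient; the checkpointed one holds fewer values at its peak and pays for that with one extra forward computation per segment.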

The application of checkpointing is showcased on various models:

- ResNet
- DenseNet
- LSTM model from the PyTorch examples, available [here](https://github.com/pytorch/examples/blob/master/word_language_model/model.py)
- VNet model, which is used in medical imaging applications, available [here](https://github.com/mattmacy/vnet.pytorch)

Results of checkpointing on these models are showcased below:

To use the models, install PyTorch master by following the instructions [here](https://github.com/pytorch/pytorch/#from-source).

To run checkpointed models and their baseline tests, follow the commands below:
```
# for checkpointed
python test_memory_optimized.py

# for baseline
python test_memory_baseline.py
```

## Tutorial

We provide a [tutorial](https://github.com/prigoyal/pytorch_memonger/blob/master/tutorial/Checkpointing_for_PyTorch_models.ipynb) that describes how to use checkpointing for various kinds of
models.
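
As a rough idea of what applying checkpointing looks like, here is a minimal sketch that is not taken from the tutorial. It assumes a recent PyTorch release in which `torch.utils.checkpoint.checkpoint` accepts the `use_reentrant` argument; the toy blocks and the helper names `forward` and `grads` are hypothetical. It wraps two blocks in `checkpoint` and checks that the gradients match those of the plain forward pass.

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint

torch.manual_seed(0)

# Hypothetical toy model: two Linear + ReLU blocks and a linear head.
block1 = nn.Sequential(nn.Linear(16, 32), nn.ReLU())
block2 = nn.Sequential(nn.Linear(32, 32), nn.ReLU())
head = nn.Linear(32, 1)
params = list(block1.parameters()) + list(block2.parameters()) + list(head.parameters())

x = torch.randn(8, 16)


def forward(inp, use_checkpoint):
    if use_checkpoint:
        # Activations inside each block are recomputed during backward
        # instead of being stored during forward.
        h = checkpoint(block1, inp, use_reentrant=False)
        h = checkpoint(block2, h, use_reentrant=False)
    else:
        h = block2(block1(inp))
    return head(h).sum()


def grads(use_checkpoint):
    # Clear old gradients, run forward and backward, return copies of the grads.
    for p in params:
        p.grad = None
    forward(x, use_checkpoint).backward()
    return [p.grad.clone() for p in params]


plain = grads(use_checkpoint=False)
ckpt = grads(use_checkpoint=True)
print(all(torch.allclose(a, b) for a, b in zip(plain, ckpt)))
```

The memory saving only becomes visible on larger models; this toy example is meant to show that checkpointing leaves the computed gradients unchanged.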

A few special kinds of layers, such as batch normalization and dropout, need to
be handled carefully. The details for handling them are also available in the
tutorial.
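
One reason such layers need care is that a checkpointed segment runs its forward computation a second time during the backward pass, so any layer with randomness or internal state sees two forward calls per step. The sketch below covers only the dropout case and is not taken from the tutorial. It assumes a recent PyTorch release in which `checkpoint` accepts `use_reentrant` and `preserve_rng_state`; the block and the helper `grad_of_first_weight` are hypothetical. With the random number generator state preserved, the recomputation draws the same dropout mask and the gradients match the non-checkpointed run.

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint

# Hypothetical block containing dropout, kept in training mode.
block = nn.Sequential(nn.Linear(8, 8), nn.Dropout(p=0.5), nn.Linear(8, 1))
block.train()
x = torch.randn(4, 8)


def grad_of_first_weight(use_checkpoint, seed=0):
    block.zero_grad()
    torch.manual_seed(seed)  # both runs start from the same RNG state
    if use_checkpoint:
        # preserve_rng_state=True restores the RNG state before recomputation,
        # so the recomputed forward pass draws the same dropout mask.
        out = checkpoint(block, x, use_reentrant=False, preserve_rng_state=True)
    else:
        out = block(x)
    out.sum().backward()
    return block[0].weight.grad.clone()


g_plain = grad_of_first_weight(use_checkpoint=False)
g_ckpt = grad_of_first_weight(use_checkpoint=True)
print(torch.allclose(g_plain, g_ckpt))
```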

## References

[1]. Siskind, Jeffrey Mark, and Barak A. Pearlmutter. "Divide-and-Conquer Checkpointing for Arbitrary Programs with No User Annotation." arXiv preprint arXiv:1708.06799 (2017).
