Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/gpleiss/efficient_densenet_pytorch
A memory-efficient implementation of DenseNets
deep-learning densenet pytorch
- Host: GitHub
- URL: https://github.com/gpleiss/efficient_densenet_pytorch
- Owner: gpleiss
- License: MIT
- Created: 2017-05-31T17:19:25.000Z (over 7 years ago)
- Default Branch: master
- Last Pushed: 2023-06-01T13:11:51.000Z (over 1 year ago)
- Last Synced: 2024-10-01T13:01:18.177Z (about 1 month ago)
- Topics: deep-learning, densenet, pytorch
- Language: Python
- Homepage:
- Size: 1.09 MB
- Stars: 1,516
- Watchers: 44
- Forks: 327
- Open Issues: 12
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# efficient_densenet_pytorch
A PyTorch >=1.0 implementation of DenseNets, optimized to save GPU memory.

## Recent updates
1. **Now works on PyTorch 1.0!** It uses the checkpointing feature, which makes this code WAY more efficient!

## Motivation
While DenseNets are fairly easy to implement in deep learning frameworks, most
implementations (such as the [original](https://github.com/liuzhuang13/DenseNet)) tend to be memory-hungry.
In particular, the number of intermediate feature maps generated by batch normalization and concatenation operations
grows quadratically with network depth.
*It is worth emphasizing that this is not a property inherent to DenseNets, but rather to the implementation.*

This implementation uses a new strategy to reduce the memory consumption of DenseNets.
We use [checkpointing](https://pytorch.org/docs/stable/checkpoint.html?highlight=checkpointing) to compute the Batch Norm and concatenation feature maps.
These intermediate feature maps are discarded during the forward pass and recomputed for the backward pass.
This adds a 15-20% time overhead to training, but **reduces feature map consumption from quadratic to linear.**

This implementation is inspired by this [technical report](https://arxiv.org/pdf/1707.06990.pdf), which outlines a strategy for efficient DenseNets via memory sharing.
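
To make the recompute-on-backward idea concrete, here is a minimal, self-contained sketch of a checkpointed bottleneck layer built on `torch.utils.checkpoint`. It is not the repository's actual `models/densenet.py` code; the class and helper names are made up for illustration.

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint


class BottleneckLayer(nn.Module):
    """Hypothetical DenseNet bottleneck: concat -> BN -> ReLU -> 1x1 conv.

    The concatenated tensor and the BN output are the intermediates that grow
    quadratically with depth when every layer stores its own copies.
    """

    def __init__(self, num_input_features, growth_rate, bn_size=4, efficient=True):
        super().__init__()
        self.norm = nn.BatchNorm2d(num_input_features)
        self.relu = nn.ReLU(inplace=True)
        self.conv = nn.Conv2d(num_input_features, bn_size * growth_rate,
                              kernel_size=1, bias=False)
        self.efficient = efficient

    def _bn_function(self, *prev_features):
        # Everything created inside this function is cheap to recompute.
        concatenated = torch.cat(prev_features, dim=1)
        return self.conv(self.relu(self.norm(concatenated)))

    def forward(self, *prev_features):
        if self.efficient and any(f.requires_grad for f in prev_features):
            # Run _bn_function without storing its intermediates; they are
            # recomputed during the backward pass (the 15-20% time overhead).
            return checkpoint(self._bn_function, *prev_features)
        return self._bn_function(*prev_features)


# Example: the layer takes the feature maps of all preceding layers as inputs.
features = [torch.randn(2, 24, 32, 32, requires_grad=True),
            torch.randn(2, 12, 32, 32, requires_grad=True)]
layer = BottleneckLayer(num_input_features=36, growth_rate=12)
out = layer(*features)  # shape: (2, 48, 32, 32)
```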
## Requirements
- PyTorch >=1.0.0
- CUDA

## Usage
**In your existing project:**
There is one file in the `models` folder.
- `models/densenet.py` is an implementation based off the [torchvision](https://github.com/pytorch/vision/blob/master/torchvision/models/densenet.py) and
[project killer](https://github.com/felixgwu/img_classification_pk_pytorch/blob/master/models/densenet.py) implementations.

If you care about speed, and memory is not a concern, pass the `efficient=False` argument to the `DenseNet` constructor.
Otherwise, pass in `efficient=True`. A minimal construction example follows the options list below.

**Options:**
- All options are described in [the docstrings of the model files](https://github.com/gpleiss/efficient_densenet_pytorch/blob/master/models/densenet_efficient.py#L189)
- The depth is controlled by `block_config` option
- `efficient=True` uses the memory-efficient version
- If you want to use the model for ImageNet, set `small_inputs=False`. For CIFAR or SVHN, set `small_inputs=True`.
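
As a rough sketch of how these options fit together (only the options documented above are shown; any other constructor arguments are left at their defaults, so consult the docstrings for the full signature):

```python
import torch
from models.densenet import DenseNet  # module path as described above

# Roughly a DenseNet-BC-100 for 32x32 inputs (CIFAR/SVHN).
model = DenseNet(
    growth_rate=12,
    block_config=(16, 16, 16),  # depth is controlled by block_config
    small_inputs=True,          # True for CIFAR/SVHN, False for ImageNet
    efficient=True,             # memory-efficient (checkpointed) version
)

x = torch.randn(4, 3, 32, 32)
logits = model(x)
```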
**Running the demo:**

The only extra package you need to install is [python-fire](https://github.com/google/python-fire):
```sh
pip install fire
```

- Single GPU:
```sh
CUDA_VISIBLE_DEVICES=0 python demo.py --efficient True --data <path_to_data_dir> --save <path_to_save_dir>
```

- Multiple GPUs:
```sh
CUDA_VISIBLE_DEVICES=0,1,2 python demo.py --efficient True --data <path_to_data_dir> --save <path_to_save_dir>
```

Options (an example combining them follows this list):
- `--depth` (int) - depth of the network (number of convolution layers) (default 40)
- `--growth_rate` (int) - number of features added per DenseNet layer (default 12)
- `--n_epochs` (int) - number of epochs for training (default 300)
- `--batch_size` (int) - size of minibatch (default 256)
- `--seed` (int) - manually set the random seed (default None)
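
For instance, a run roughly matching the configuration in the performance comparison below (a 100-layer DenseNet-BC with batch size 64) could be launched along these lines; the paths are placeholders:

```sh
CUDA_VISIBLE_DEVICES=0 python demo.py --efficient True --depth 100 --growth_rate 12 \
    --batch_size 64 --data <path_to_data_dir> --save <path_to_save_dir>
```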
## Performance

A comparison of the two implementations (each is a DenseNet-BC with 100 layers, batch size 64, tested on an NVIDIA Pascal Titan X):
| Implementation | Memory consumption (GB/GPU) | Speed (sec/mini-batch) |
|----------------|------------------------|------------------------|
| Naive | 2.863 | 0.165 |
| Efficient | 1.605 | 0.207 |
| Efficient (multi-GPU) | 0.985 | - |

## Other efficient implementations
- [LuaTorch](https://github.com/liuzhuang13/DenseNet/tree/master/models) (by Gao Huang)
- [Tensorflow](https://github.com/joeyearsley/efficient_densenet_tensorflow) (by Joe Yearsley)
- [Caffe](https://github.com/Tongcheng/DN_CaffeScript) (by Tongcheng Li)

## Reference
```
@article{pleiss2017memory,
  title={Memory-Efficient Implementation of DenseNets},
  author={Pleiss, Geoff and Chen, Danlu and Huang, Gao and Li, Tongcheng and van der Maaten, Laurens and Weinberger, Kilian Q},
  journal={arXiv preprint arXiv:1707.06990},
  year={2017}
}
```