Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/mitscha/strassennets
Deep learning with a multiplication budget
- Host: GitHub
- URL: https://github.com/mitscha/strassennets
- Owner: mitscha
- Created: 2017-12-03T19:29:08.000Z (about 7 years ago)
- Default Branch: master
- Last Pushed: 2018-07-15T13:14:32.000Z (over 6 years ago)
- Last Synced: 2024-08-01T22:41:43.048Z (5 months ago)
- Language: Jupyter Notebook
- Homepage:
- Size: 728 KB
- Stars: 47
- Watchers: 4
- Forks: 10
- Open Issues: 0
- Metadata Files:
  - Readme: README.md
Awesome Lists containing this project
- Awesome-MXNet - StrassenNets
README
## StrassenNets
Code to reproduce the experiments in the paper [StrassenNets: Deep learning with a multiplication budget, M. Tschannen, A. Khanna, A. Anandkumar, 2018](https://arxiv.org/abs/1712.03942). In a nutshell, the proposed approach approximates the (generalized) matrix multiplications in deep neural network (DNN) layers by 2-layer sum-product networks (SPNs) and learns the SPNs end-to-end. The CIFAR-10 code (Jupyter notebook) gives an overview of the method.
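As a concrete illustration of the structure in question, classical Strassen multiplication of two 2x2 matrices can be written as a 2-layer SPN `vec(C) = Wc @ ((Wa @ vec(A)) * (Wb @ vec(B)))` with ternary matrices `Wa`, `Wb`, `Wc` and a budget of r = 7 multiplications (instead of 8). The NumPy sketch below only illustrates this form and is not code from the repository; in StrassenNets the ternary matrices are learned end-to-end rather than fixed by hand.
```
import numpy as np

# Two-layer SPN: vec(C) = Wc @ ((Wa @ vec(A)) * (Wb @ vec(B))).
# With the ternary matrices below this is exactly Strassen's algorithm
# for 2x2 matrices, i.e. r = 7 multiplications instead of 8.

# Rows = the 7 Strassen products; columns = vec(A) = [a11, a12, a21, a22].
Wa = np.array([[ 1, 0, 0, 1],
               [ 0, 0, 1, 1],
               [ 1, 0, 0, 0],
               [ 0, 0, 0, 1],
               [ 1, 1, 0, 0],
               [-1, 0, 1, 0],
               [ 0, 1, 0,-1]])

# Columns = vec(B) = [b11, b12, b21, b22].
Wb = np.array([[ 1, 0, 0, 1],
               [ 1, 0, 0, 0],
               [ 0, 1, 0,-1],
               [-1, 0, 1, 0],
               [ 0, 0, 0, 1],
               [ 1, 1, 0, 0],
               [ 0, 0, 1, 1]])

# Rows = vec(C) = [c11, c12, c21, c22]; columns = the 7 products.
Wc = np.array([[ 1, 0, 0, 1,-1, 0, 1],
               [ 0, 0, 1, 0, 1, 0, 0],
               [ 0, 1, 0, 1, 0, 0, 0],
               [ 1,-1, 1, 0, 0, 1, 0]])

def spn_matmul(A, B):
    """2x2 matrix product via the 2-layer SPN (7 elementwise multiplications)."""
    m = (Wa @ A.flatten()) * (Wb @ B.flatten())  # the only multiplications
    return (Wc @ m).reshape(2, 2)

A = np.array([[1., 2.], [3., 4.]])
B = np.array([[5., 6.], [7., 8.]])
assert np.allclose(spn_matmul(A, B), A @ B)
```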
### Requirements
The code was tested with Python 3.5 and [MXNet](https://github.com/apache/incubator-mxnet) 1.0. The language model code requires a GPU.
### CIFAR-10
To train Strassen-ResNet-20 on CIFAR-10, run the notebook StrassenNetsCIFAR10.ipynb in [Jupyter](http://jupyter.org/). For the VGG-inspired 7-layer architecture, run
```
python3 StrassenNetworksCIFAR10_VGG.py --nbr_mul 1 --out_patch 1 --gpu_index 0
```
The multiplication budget can be adjusted using `--nbr_mul` and `--out_patch`.
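For example (the parameter values here are chosen purely for illustration and are not settings from the paper), a run with a larger budget could look like:
```
python3 StrassenNetworksCIFAR10_VGG.py --nbr_mul 2 --out_patch 2 --gpu_index 0
```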
### Penn Treebank (PTB)
1. Download the preprocessed PTB training, validation, and test files, e.g., from [here](https://github.com/yoonkim/lstm-char-cnn/tree/master/data/ptb). The default location expected by the training script is `langmod/data/ptb/`; it can be changed with the `--data_dir` option in the commands below.
2. Use
```
python3 train.py --quant_mode 5 --max_epochs 40 --decay_when 0.5 --learning_rate 2
```
(in `langmod/`) to train the full precision model, and
```
python3 train.py --quant_mode 2 --max_epochs 20 --decay_when 0.5 --nbr_mul 1.0 --out_nbr_mul 2000 --learning_rate 0.2 --epochs_pre 20 --learning_rate_pre 2
```
to train a Strassen language model. To change the multiplication budget, adapt `--nbr_mul` (budget of all but the last layer, relative to the number of hidden units) and `--out_nbr_mul` (budget of the last layer). Log files and trained models are stored in the folder `langmod/logs/`. To train with a full-precision teacher model, use the option `--teacher_model`.
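To combine the Strassen training command above with a teacher, the `--teacher_model` option is simply appended; the path below is a placeholder, not a file shipped with the repository:
```
# <path_to_teacher> is a placeholder for a previously trained full-precision model (e.g. one stored under langmod/logs/)
python3 train.py --quant_mode 2 --max_epochs 20 --decay_when 0.5 --nbr_mul 1.0 --out_nbr_mul 2000 --learning_rate 0.2 --epochs_pre 20 --learning_rate_pre 2 --teacher_model <path_to_teacher>
```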