https://github.com/an-kumar/caffe-theano-conversion



# caffe_theano_conversion
This is part of a project for CS231N at Stanford University, written by Ankit Kumar, Mathematics major, Class of 2015.

This repository converts pretrained Caffe models into models in Lasagne, a thin wrapper around Theano. It can also convert just a Caffe model's architecture into an equivalent one in Lasagne. You do not need Caffe installed to use this module.

Currently, the following Caffe layers are supported:

* Convolution
* LRN
* Pooling
* Inner Product
* ReLU
* Softmax
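
Before converting, it can help to check whether a prototxt only uses layer types from the list above. The following is a minimal sketch and not part of this repo: the helper name `unsupported_layer_types` is hypothetical, and it assumes layer types are written as `type: "Convolution"` (newer prototxt format) or `type: CONVOLUTION` (older enum format). It scans the text with a regular expression rather than parsing the protobuf, so it may miss unusual formatting, and it will also report data or input layers, which are not in the list.

```python
import re

# Supported layer types from the list above, normalized to lowercase with
# underscores removed so "InnerProduct" and "INNER_PRODUCT" both match.
SUPPORTED = {"convolution", "lrn", "pooling", "innerproduct", "relu", "softmax"}

# Matches `type: "Convolution"` and `type: CONVOLUTION`. Requiring an uppercase
# first letter skips filler types such as `type: "xavier"` (an assumption that
# holds for the standard Caffe layer names).
TYPE_PATTERN = re.compile(r'\btype\s*:\s*"?([A-Z][A-Za-z_]*)"?')


def unsupported_layer_types(prototxt_text):
    """Return the sorted layer types in the text that are not in SUPPORTED."""
    found = TYPE_PATTERN.findall(prototxt_text)
    return sorted({t for t in found if t.replace("_", "").lower() not in SUPPORTED})


example = '''
layer { name: "conv1" type: "Convolution" }
layer { name: "relu1" type: "ReLU" }
layer { name: "drop1" type: "Dropout" }
layers { name: "fc8" type: INNER_PRODUCT }
'''
print(unsupported_layer_types(example))  # ['Dropout']
```

An empty result suggests that every layer type found is on the supported list; it does not guarantee that the conversion itself will succeed.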

You can also load a mean file using the `convert_mean_file` function in `conversion`. Future work is to fold this into the conversion of the net architecture itself, since the architecture might include a data layer with a transformation parameter (`transform_param`). That step can be automated.

Right now, you have to add the cost layer yourself and write the backprop code. This is a future step for me. However, Lasagne is very easy to use, and you can quickly learn how to add your own pieces. I want to keep this as configurable as possible, because that's the benefit of Theano.
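
To make concrete what a cost layer on top of a Softmax output computes, here is a small sketch in plain Python. It only illustrates the arithmetic of a cross-entropy cost and its gradient; it does not use this repo, Lasagne, or Theano, and the function names are hypothetical. In a real Theano setup you would write the cost symbolically and let Theano derive the gradients.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """Negative log-likelihood of the true class index `label`."""
    return -math.log(probs[label])


def grad_wrt_logits(probs, label):
    """Gradient of the cross-entropy cost w.r.t. the pre-softmax inputs.

    For softmax followed by cross-entropy this is simply p - onehot(label).
    """
    return [p - (1.0 if i == label else 0.0) for i, p in enumerate(probs)]


logits = [2.0, 1.0, 0.1]          # made-up pre-softmax activations
probs = softmax(logits)           # what a Softmax layer would output
cost = cross_entropy(probs, label=0)
grad = grad_wrt_logits(probs, label=0)
print(cost, grad)
```

The gradient entries sum to zero, and the entry for the true class is negative, so a gradient-descent step raises the score of the correct class.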

## DEPENDENCIES:
Theano (http://deeplearning.net/software/theano/) needs to be bleeding-edge:
```
pip install --upgrade --no-deps git+https://github.com/Theano/Theano.git
```

Lasagne (https://github.com/benanne/Lasagne), Google protobuf, and pylearn2 for the cuda-convnet wrappers (see below). If you have Caffe installed, you already have Google protobuf; otherwise see here: https://code.google.com/p/protobuf/

## USING CUDA CONVNET WRAPPERS:

Install pylearn2 in development mode:
```
git clone https://github.com/lisa-lab/pylearn2.git
cd pylearn2
python setup.py develop
```

The cuda-convnet wrappers in pylearn2 are much faster than the GPU implementations of convolutions in Theano. Lasagne has cuda-convnet layers, and I have created Caffe versions of these layers. However, they require you to edit some files in pylearn2. I don't know the best way to package that change in this repo, so until someone tells me a better way, I'll just describe what to do.

In this file: https://github.com/lisa-lab/pylearn2/blob/master/pylearn2/sandbox/cuda_convnet/base_acts.py, change:
```
class BaseActs(GpuOp):
    """
    Shared code for wrapping various convnet operations.
    """
    def __init__(self, pad=0, partial_sum=None, stride=1):
```
to:
```
class BaseActs(GpuOp):
    """
    Shared code for wrapping various convnet operations.
    """
    def __init__(self, pad=0, partial_sum=None, stride=1, numGroups=1):
```

Then, in that `__init__` function, change:

```
self.dense_connectivity = True
```
to:
```
if numGroups == 1:
    self.dense_connectivity = True
else:
    self.dense_connectivity = False
```

and add the line:
```
self.numGroups = numGroups
```

Then, in this file: https://github.com/lisa-lab/pylearn2/blob/master/pylearn2/sandbox/cuda_convnet/filter_acts.py, change the `c_code` function of `FilterActs` from:

```
if self.dense_connectivity:
    basic_setup += """
    #define numGroups 1
    """
```

to:
```
if self.dense_connectivity:
    basic_setup += """
    #define numGroups 1
    """
else:
    basic_setup += """
    #define numGroups %s
    """ % self.numGroups
```

The `if` statement could be dropped entirely, since always emitting `#define numGroups %s` with `self.numGroups` gives the same result when `numGroups` is 1, but I kept it in. You will have to make the same `#define numGroups` change in `ImageActs` and `WeightActs` as well if you want to backprop through the convolutional layers; if you just want to run the forward function, that isn't needed.
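
Since the same `#define numGroups` change is needed in `FilterActs`, `ImageActs`, and `WeightActs`, one option is to build the snippet in a single place. The sketch below is hypothetical: neither `num_groups_define` nor `group_channel_ranges` exists in pylearn2 or in this repo. The first helper mirrors the branch shown above and only builds the C preprocessor text. The second illustrates what grouping means for channel connectivity, assuming the contiguous, equal-sized split used by Caffe's `group` parameter.

```python
def num_groups_define(dense_connectivity, num_groups=1):
    """Return the C preprocessor text that the patched c_code would append.

    Mirrors the if/else above: dense connectivity always means one group.
    """
    if num_groups < 1:
        raise ValueError("num_groups must be a positive integer")
    groups = 1 if dense_connectivity else num_groups
    return "\n#define numGroups %d\n" % groups


def group_channel_ranges(num_channels, num_groups):
    """Split channel indices into equal contiguous (start, stop) ranges.

    In a grouped convolution, the filters in group g only see the input
    channels in the g-th range.
    """
    if num_channels % num_groups != 0:
        raise ValueError("num_channels must be divisible by num_groups")
    size = num_channels // num_groups
    return [(g * size, (g + 1) * size) for g in range(num_groups)]


print(num_groups_define(dense_connectivity=False, num_groups=2))
print(group_channel_ranges(96, 2))  # [(0, 48), (48, 96)]
```

Each of the three ops could then append the result of `num_groups_define(self.dense_connectivity, self.numGroups)` to `basic_setup`, which keeps the forward and backward passes consistent.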

## USAGE:
You can test the repo by running `python tests.py`.
All the tests should pass, but **terminal is out of GPUs, so I haven't been able to run the tests.py script on GPU**. However, I have used this repo with GPUs and it worked; the only question is whether it still works after I moved files around. If it doesn't, let me know.

The file `conversion.py` has a function `convert` that takes a prototxt file and a caffemodel file and returns a Lasagne base model. I plan to build other model classes on top of base models for training purposes, but I haven't yet figured out the best way to do this. This repo is still in active development while I use it for my project. If you have ideas for additions, let me know.

Usage:
```
>>> from conversion import convert
>>> from models import *
>>> lmodel = convert('/path/to/deploy.prototxt', '/path/to/pretrained_model.caffemodel')
>>> dump(lmodel, 'filename')
```