https://github.com/pseyfert/tmva-mlp

learning by doing: speeding up tmva's mlp
https://github.com/pseyfert/tmva-mlp

Last synced: about 1 year ago
JSON representation

learning by doing: speeding up tmva's mlp

Host: GitHub
URL: https://github.com/pseyfert/tmva-mlp
Owner: pseyfert
License: lgpl-2.1
Created: 2015-02-08T13:14:51.000Z (over 11 years ago)
Default Branch: master
Last Pushed: 2015-03-19T11:48:59.000Z (about 11 years ago)
Last Synced: 2023-08-02T20:58:42.690Z (almost 3 years ago)
Language: C
Size: 5.56 MB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

Awesome Lists containing this project

README

          tmva-mlp

========

learning by doing: speeding up tmva's mlp

Motivation

----------

I took an existing [TMVA] [MLP] class.C file and try to make it's evaluation

faster. The project is thought of being potentially interesting to read, the

network itself will probably not be useful to many people. Copying the changes

may or may not be useful.

The steps are:

 * Avoid exp evaluations

 * convert from double precision to float

 * profile the code with callgrind

 * try using math libraries (e.g. eigen)

 * rearrange computations (evaluation -> instantiation)

 * rearrange code to achieve auto vectorisation

 * reading https://research.google.com/pubs/archive/37631.pdf

why this is not just code you can use directly

----------------------------------------------

The work is at the moment very specific to my existing network. This does not

use a decorrelation transformation, which may change the picture. I used my

private simple sigmoid activation function in the hidden layer in the training

[ROOT-7062]. As it turns out, I failed to replace the output layer activation

function (cross entropy estimator in back propagation needs the standard

sigmoid in the training) but since i'm fine with monotoneous transformations of

the evaluated network output, I replaced the output layer activation function

after the training.

The actual profiles of the test case reveal that the test spends many cycles on

i/o and ROOT overhead.  Since in a real use scenario, the data will either come

from different sources or the i/o will anyhow be unavoidable, I still

concentrate on speeding up the MLP.

Warnings

--------

To avoid copying data around, I changed the interface such that the input data

is altered! Eventually, also the offset needs to be supplied in some versions

by the user.

License

-------

The code is released under the [GNU Lesser General Public License, version

2.1][LGPL], the license under which [ROOT] itself is provided.

Machine info

------------

The profiles were generated on an AMD Opteron(tm) Processor 6344.

The tools in use are:

gcc 4.8.1

ROOT 6.03/02

valgrind 3.8.1

[ROOT-7062]: https://sft.its.cern.ch/jira/browse/ROOT-7062

[TMVA]:      http://tmva.sourceforge.net/

[MLP]:       http://en.wikipedia.org/wiki/Multilayer_perceptron

[LGPL]:      https://gnu.org/licenses/old-licenses/lgpl-2.1

[ROOT]:      http://root.cern.ch

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/pseyfert/tmva-mlp

Awesome Lists containing this project

README