https://github.com/manuelblancovalentin/ethw
Edge Trainable Hardware repo.
- Host: GitHub
- URL: https://github.com/manuelblancovalentin/ethw
- Owner: manuelblancovalentin
- Created: 2023-01-31T19:41:59.000Z (almost 2 years ago)
- Default Branch: master
- Last Pushed: 2023-03-16T05:30:57.000Z (almost 2 years ago)
- Last Synced: 2024-11-08T10:34:32.317Z (3 months ago)
- Language: C++
- Size: 67.7 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
# Edge Trainable Hardware (ETHW)
Author: Manuel Blanco Valentin ([email protected])
Supervisor: Seda Memik ([email protected])

## Structure of this repo
* [README.md](README.md): This file. Contains information regarding this repo, the research and the files found in this directory.
* **datasets**: Folder containing dummy datasets created just to
* **docs**: Folder containing documentation and literature regarding the research.
* **tutorials**: Folder containing tutorials and external repos regarding hls4ml, ML, conversion of models, etc.
  * hls4ml-tutorial: Cloned from the [hls4ml-tutorial repo](https://github.com/fastmachinelearning/hls4ml-tutorial). Contains tutorials on how to use hls4ml to create an ML model and convert it into synthesizable code.
  * custom: Custom tutorials created during the research on turning hls4ml models into fully on-edge trainable models.
## 1. Tutorials

Before diving into the creation of an ML model for an actual application, let's start with the tutorials.
## 1. Getting used to hls4ml
### 1.1. hls4ml-tutorial
For this, the first thing to do is to create a conda environment just for hls4ml. Make sure you have conda with python3 installed and create a new environment using the *environment.yml* file in *tutorials/hls4ml-tutorial/environment.yml*.

```shell
conda env create -f environment.yml
conda activate hls4ml-tutorial
```

### 1.2. Compile the neural network using hls4ml and qkeras
Follow `tutorials/custom/part1_custom_dummy_forward_network.ipynb`.

### 1.3. Run vivado hls directly from the cpp generated by hls4ml
```bash
cd /home/manuelbv/ETHW/tutorials/custom/model_dummy_forward/hls4ml_prj
vivado_hls build_prj.tcl
```

This should generate a folder named "myproject_prj", which you can then open as a project:
```bash
vivado_hls -p "myproject_prj"
```

GIUSEPPE'S COMMENT:
> We might want to start using floating point instead of fixed or fixed with a lot of bits <128,64>, to make sure that we actually "force" the RTL synthesis to generate MACs, cause for very simple models with very simple arithmetic it might happen that vivado just creates other logic instead of MACs.
> Take a look at the documentation/paper of QKeras to see how they implemented backprop in there. Did they use floating point for backprop or fixed point?

Okay, so let's implement Giuseppe's suggestion for now. Open `myproject.cpp`, follow its include of `myproject.h`, and from there the include of `defines.h`. Then change the typedef of all variables to something ridiculous like `<128,64>`, like so:
```cpp
// [@manuelbv]: Changed this to a very large precision for the manual testing/computation of loss
typedef ap_fixed<128,64> model_default_t;
typedef ap_fixed<128,64> input_t;
typedef ap_fixed<128,64> layer2_t;
typedef ap_fixed<128,64> weight2_t;
typedef ap_uint<1> bias2_t;
typedef ap_fixed<128,64,AP_RND,AP_SAT> result_t;
```

Save the file and now run the simulation. To run the simulation, simply click the button highlighted in the following screenshot.
![a](docs/imgs/custom_tutorial0_running_csim_tb.png)
The simulation should run, but it will tell you that it wasn't able to find the tb_input_data. That's fine. Let's now create two files in
`/home/manuelbv/ETHW/tutorials/custom/model_dummy_forward/hls4ml_prj/tb_data/`.

One will be called `tb_input_features.dat` and will contain simply the number `1.0`:
```text
1.0
```

The second will be called `tb_output_predictions.dat` and will contain simply the number `0.5`:
```text
0.5
```

Now re-run the simulation using vivado's gui and a message like the following should appear in the log:
```text
INFO: [SIM 211-4] CSIM will launch GCC as the compiler.
Compiling ../../../../myproject_test.cpp in debug mode
Compiling ../../../../firmware/myproject.cpp in debug mode
Generating csim.exe
Processing input 0
Predictions
0.5
Quantized predictions
0.5
INFO: Saved inference results to file: tb_data/csim_results.log
INFO: [SIM 211-1] CSim done with 0 errors.
INFO: [SIM 211-3] *************** CSIM finish ***************
Finished C simulation.
```

Beautiful, this means our simulation is working and it's taking the predictions, as expected. Let's now move on and start implementing backprop.
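As a side note, the two testbench files created by hand above can also be written from a script. The sketch below is only an illustration: `write_tb_data` is a hypothetical helper, and the layout of one sample per line with space-separated values is an assumption about what the generated testbench expects. It writes into a temporary directory so it can be run anywhere; point it at your own `tb_data` folder instead.

```python
import tempfile
from pathlib import Path


def write_tb_data(tb_dir, features, predictions):
    """Write one input sample and one expected output into tb_dir.

    Hypothetical helper; the file layout is an assumption, not taken from hls4ml.
    """
    tb_dir = Path(tb_dir)
    tb_dir.mkdir(parents=True, exist_ok=True)
    # One sample per line, values separated by spaces
    (tb_dir / 'tb_input_features.dat').write_text(
        ' '.join(str(v) for v in features) + '\n')
    (tb_dir / 'tb_output_predictions.dat').write_text(
        ' '.join(str(v) for v in predictions) + '\n')
    return tb_dir


# Same values as in the manual steps: input 1.0, expected prediction 0.5
out_dir = write_tb_data(Path(tempfile.mkdtemp()) / 'tb_data', [1.0], [0.5])
print(out_dir)
```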
## 2. Implementing backprop in c++
We are going to modify the c++ scripts generated automatically by hls4ml, so it's a good thing if you get acquainted with whatever hls4ml generates (the translation from qkeras/keras to c++).
### 2.0. Backprop recap
This section is under construction. I'll add any further info about backpropagation and how it works as I need to implement each step of the process.
These are the steps I hope to divide this section (and the implementation) into:
* Computation of the loss at the final layer
* Computation of the gradient at the final layer
* Propagation of the gradient for previous layers
* Update of the weights & biases

### 2.1. Computation of losses at the final layer
To see how to integrate these losses to the cpp code go to `2.1.x. Integration`
#### 2.1.1. MSE/MAE
Let's create the cpp code that computes the MSE and MAE losses:
```cpp
```
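While the C++ version is still to be written, a plain-Python reference can help check the numbers a testbench should produce. This is only a sketch under assumptions: the function names are made up for illustration, the losses use a mean reduction over the output vector, and the MAE gradient uses the sign of the error (zero at zero error). The eventual C++ implementation may choose differently.

```python
def mse(pred, target):
    """Mean squared error and its gradient w.r.t. each prediction."""
    n = len(pred)
    diffs = [p - t for p, t in zip(pred, target)]
    loss = sum(d * d for d in diffs) / n
    grads = [2.0 * d / n for d in diffs]  # d(loss)/d(pred_i)
    return loss, grads


def mae(pred, target):
    """Mean absolute error and a subgradient w.r.t. each prediction."""
    n = len(pred)
    diffs = [p - t for p, t in zip(pred, target)]
    loss = sum(abs(d) for d in diffs) / n
    # sign(d) / n, with sign(0) taken as 0
    grads = [((d > 0) - (d < 0)) / n for d in diffs]
    return loss, grads


# Prediction 0.5 against a ground truth of 1.0
print(mse([0.5], [1.0]))
print(mae([0.5], [1.0]))
```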
### 2.1.x. Integration
Let's now integrate the computation of the losses with the cpp code we got from hls4ml
```cpp
void myproject(
    input_t fc1_input[N_INPUT_1_1],
    result_t layer3_out[N_LAYER_2],
    unsigned short &const_size_in_1,
    unsigned short &const_size_out_1
) {
    ...
}
```

### 3. Integration to hls4ml
Here I present the changes I needed to apply to HLS4ML to adapt it to generate models with training capabilities.
#### 3.1. Setting up the env
The first thing I tried was cloning the hls4ml main branch and modifying that. However, this caused an error down the line, because apparently we need to use `hls4ml[profiling]` instead of plain hls4ml. I couldn't find the git repo for `hls4ml[profiling]`, so what I did is the following:
- First, install all the dependencies by using the environment.yml in the hls4ml-tutorial dir/repo.
- Then, copy the hls4ml folder from `~/anaconda3/envs/hls4ml-tutorial/lib/python3.8/site-packages/hls4ml` to `~/ETHW/`.
- After that, simply uninstall `hls4ml` using the `--force` flag (this will uninstall only hls4ml, without the dependencies): `conda uninstall --force hls4ml[profiling]`
- Now, in your code, add the following lines to import the local hls4ml code:
```python
import sys
sys.path.append('/home/manuelbv/ETHW')
import hls4ml
```

Note: After doing this, you might get dependency errors (numpy complaints and the like). If so, reinstalling numpy with something like `conda install numpy` should fix it.
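Because `sys.path.append` puts the local folder at the end of the search path, an installed copy of hls4ml would still win over the local one, which is why the uninstall step matters. A quick way to check which copy Python would pick up is sketched below; `module_origin` is a hypothetical helper built on the standard `importlib.util.find_spec`.

```python
import importlib.util
import sys


def module_origin(name):
    """Return the file a top-level module would be loaded from, or None if not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


# Path used in this README; adjust to wherever the local copy lives
sys.path.append('/home/manuelbv/ETHW')

# Should point inside /home/manuelbv/ETHW if the local copy is the one in use,
# and prints None when no hls4ml can be found at all
print(module_origin('hls4ml'))
```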
Now let's go into the repo and create a new branch for this project
```shell
cd ~/ETHW/hls4ml && git checkout -b ethw
```

#### 3.2. hls4ml reverse engineering
As shown previously, the way we organized our project is by creating custom extra c++ headers implementing the backprop layer-by-layer, which are then used as templates "pulled" by hls4ml, and instantiated in the final neural network c++ code. This means we basically need to add those templates somewhere in the hls4ml source dir structure, and ask it to pull them when generating the neural network.

That is, of course, if the user actually wants the final neural network to have training capabilities. We want to retain the original behavior of hls4ml, so the first thing we need to do is to allow the user to decide whether the final network should be trainable or not. This yells "flag".
If we open the jupyter notebook in `tutorials/custom/part1_custom_dummy_forward_network.ipynb` and look at the instruction that starts the building process (translation from keras to c++), we can find something like the following instruction at some point:
```python
hls_model = hls4ml.converters.convert_from_keras_model(model,
                                                       hls_config=config,
                                                       output_dir=f'{model_name}/hls4ml_prj',
                                                       part='xcu250-figd2104-2L-e')
```

Alright, here comes the first decision we need to make. It's clear that we need to integrate the "trainable" flag in this instruction, but how should we pass it? One option would be to pass it as a global flag directly to `convert_from_keras_model`, something like:
```python
hls_model = hls4ml.converters.convert_from_keras_model(model,
                                                       trainable=True,
                                                       hls_config=config,
                                                       output_dir=f'{model_name}/hls4ml_prj',
                                                       part='xcu250-figd2104-2L-e')
```

However, this wouldn't give the user much control over the trainability of the network. It would be a global setting that makes ALL the layers in the structure trainable. We know that adding the backprop structure results in a big area overhead, so what if the user only wants to train a specific part of the network and freeze the rest of it? Wouldn't it be better if the user could specify the trainability of individual layers?
If we inspect the instruction above we see we are also passing a variable `config`. Pretty self-explanatory variable name, but let's take a look at it and see what it contains. We see this config is generated by a previous instruction:
```python
config = hls4ml.utils.config_from_keras_model(model, granularity='name')
```

Now if we print this we get something like:
```
-----------------------------------
Model
  Precision:         ap_fixed<16,6>
  ReuseFactor:       1
  Strategy:          Latency
LayerName
  fc1_input
    Precision
      result:        ap_fixed<16,6>
  fc1
    Precision
      weight:        ap_fixed<6,1>
      bias:          ap_fixed<6,1>
    ReuseFactor:     1
-----------------------------------
```

This is cool. The config contains info about specific properties of each layer. For instance, you can see the fc1 layer's precision for weights and biases, as well as a parameter called ReuseFactor. We could add our trainability there: an extra configuration entry for the trainability of our network, per layer.
What's more, if we investigate a bit further the tutorial code, we see a line similar to:
```python
#config['LayerName']['softmax']['exp_table_t'] = 'ap_fixed<18,8>'
```

This is telling us we have the ability to access and set specific configuration parameters of our model. In this case, the (commented-out) line sets the `exp_table_t` type of the layer called `softmax` to `ap_fixed<18,8>`. We could use the same mechanism to set the trainability of specific layers of our model.
Furthermore, we could add a global flag that we pass to `hls4ml.utils.config_from_keras_model(...)` which enables global trainability, something like `hls4ml.utils.config_from_keras_model(..., trainable = True, ...)`, which would make all the layers trainable. If the user wants to control the trainability of specific layers instead, they can set that flag to `False` and then set `config['LayerName']['...']['Trainable'] = True` layer by layer.
So let's do that.
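The scheme just described (a global default plus per-layer overrides) can be pictured on a plain dictionary shaped like the printed config. This is a sketch, not hls4ml code: `apply_trainability` is a hypothetical helper, and deriving the model-level flag from the layer flags is an assumption made for this example.

```python
def apply_trainability(config, default=False, overrides=None):
    """Set a 'Trainable' entry on every layer, with optional per-layer overrides."""
    overrides = overrides or {}
    for name, layer_cfg in config.get('LayerName', {}).items():
        layer_cfg['Trainable'] = overrides.get(name, default)
    # Assumption: the model counts as trainable if at least one layer is
    config.setdefault('Model', {})['Trainable'] = any(
        layer_cfg['Trainable'] for layer_cfg in config.get('LayerName', {}).values())
    return config


config = {
    'Model': {'Precision': 'ap_fixed<16,6>', 'ReuseFactor': 1},
    'LayerName': {'fc1_input': {}, 'fc1': {'ReuseFactor': 1}},
}

# Freeze everything except fc1
apply_trainability(config, default=False, overrides={'fc1': True})
print(config['LayerName']['fc1']['Trainable'])        # True
print(config['LayerName']['fc1_input']['Trainable'])  # False
```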
#### 3.2.1. Hacking the hls4ml config structure to add global/local trainability of layers
Let's find the definition of function `hls4ml.utils.config_from_keras_model`. Open `~/ETHW/hls4ml/hls4ml/utils/config.py` and take a look at it. Search for `config_from_keras_model`: that's the definition we were looking for.

If we go a bit down in that code we'll find the following definition for the `make_layer_config` function. This function sets specific configs per layer.
```python
...
    def make_layer_config(layer):
        layer_config = {}
        if layer['class_name'] in dense_layers + conv_layers:
            layer_config['Precision'] = {}
            layer_config['Precision']['weight'] = default_precision
            layer_config['Precision']['bias'] = default_precision
            layer_config['Precision']['result'] = default_precision
            layer_config['ReuseFactor'] = default_reuse_factor

        elif layer['class_name'] in activation_layers:
            layer_config['Precision'] = default_precision
            layer_config['ReuseFactor'] = default_reuse_factor
            layer_config['table_size'] = 1024
...
```

This is where we can integrate our global flag `trainable`. So the first thing we want to do is to go to the definition of the function `config_from_keras_model` and change it to:
⚠️⚠️⚠️⚠️⚠️⚠️⚠️ **CHANGE IN hls4ml!!!!!**: `CHANGELOG_a.1` ⚠️⚠️⚠️⚠️⚠️⚠️⚠️
```python
# [@manuelbv]: CHANGELOG_a.1 I added the flag "trainable = False" to allow the user to implement trainable layers
def config_from_keras_model(
    model, granularity='model', backend=None, default_precision='fixed<16,6>', default_reuse_factor=1,
    trainable=False
):
    ...
```

Now let's implement the point where we use this variable to specify whether layers are trainable or not. Go back to the `make_layer_config` definition and add the trainability setting for dense layers, conv layers and activation layers (why activation layers? Because even if there's no trainable weight, we need to create the structure that propagates the gradients thru them):
⚠️⚠️⚠️⚠️⚠️⚠️⚠️ **CHANGE IN hls4ml!!!!!**: `CHANGELOG_a.2` ⚠️⚠️⚠️⚠️⚠️⚠️⚠️
```python
if layer['class_name'] in dense_layers + conv_layers:
    layer_config['Precision'] = {}
    layer_config['Precision']['weight'] = default_precision
    layer_config['Precision']['bias'] = default_precision
    layer_config['Precision']['result'] = default_precision
    layer_config['ReuseFactor'] = default_reuse_factor
    # [@manuelbv]: CHANGELOG_a.2 Setting the trainability of specific layers
    layer_config['Trainable'] = trainable
```

Now let's do the same for activation layers:
⚠️⚠️⚠️⚠️⚠️⚠️⚠️ **CHANGE IN hls4ml!!!!!**: `CHANGELOG_a.3` ⚠️⚠️⚠️⚠️⚠️⚠️⚠️
```python
elif layer['class_name'] in activation_layers:
    layer_config['Precision'] = default_precision
    layer_config['ReuseFactor'] = default_reuse_factor
    layer_config['table_size'] = 1024
    is_softmax = layer['class_name'] == 'Softmax'
    if 'config' in layer.keys():
        if 'activation' in layer['config'].keys():
            is_softmax = is_softmax or (layer['config']['activation'] == 'softmax')
    if is_softmax:
        layer_config['exp_table_t'] = 'ap_fixed<18,8,AP_RND,AP_SAT>'
        layer_config['inv_table_t'] = 'ap_fixed<18,8,AP_RND,AP_SAT>'
    else:
        layer_config['table_t'] = 'ap_fixed<18,8>'
    # [@manuelbv]: CHANGELOG_a.3 Setting the trainability of specific layers
    layer_config['Trainable'] = trainable
```

And for the qkeras_layers:
⚠️⚠️⚠️⚠️⚠️⚠️⚠️ **CHANGE IN hls4ml!!!!!**: `CHANGELOG_a.4` ⚠️⚠️⚠️⚠️⚠️⚠️⚠️
```python
elif layer['class_name'] in qkeras_layers:
    if 'precision' in layer:
        layer_config['Precision'] = {}
        for name, precision in layer['precision'].items():
            layer_config['Precision'][name] = precision
    else:
        print('WARNING: Found no precision information in QKeras layer {} ({})'.format(layer['name'], layer['class_name']))
        layer_config['Precision'] = default_precision
    layer_config['ReuseFactor'] = default_reuse_factor
    # [@manuelbv]: CHANGELOG_a.4 Setting the trainability of specific layers
    layer_config['Trainable'] = trainable
```

Finally, let's also add a global flag of trainability. I'm anticipating that this will help us further down the line, to manage whether we need to initialize the global structure for trainable networks or not. So go to the part of the code where the config is initialized, something like `config = {}`, and change it to the following (note that we are only adding the last two lines; `model_config['Trainable'] = trainable` is the important part, the rest is shown for reference):
⚠️⚠️⚠️⚠️⚠️⚠️⚠️ **CHANGE IN hls4ml!!!!!**: `CHANGELOG_a.5` ⚠️⚠️⚠️⚠️⚠️⚠️⚠️
```python
model_config = {}
model_config['Precision'] = default_precision
model_config['ReuseFactor'] = default_reuse_factor
model_config['Strategy'] = 'Latency'
#model_config['Compression'] = False
#model_config['Trace'] = False
# [@manuelbv]: CHANGELOG_a.5 Setting the trainability of global model
model_config['Trainable'] = trainable
```

Let's test this out. Restart the jupyter notebook kernel where you were running `part2_custom_dummy_2weights_forward_network.ipynb` and re-run everything until you reach the config part.
Now if we print the config, we should see the trainable flags in there. Beautiful!
```
-----------------------------------
Model
  Precision:         ap_fixed<16,6>
  ReuseFactor:       1
  Strategy:          Latency
  Trainable:         True
LayerName
  fc1_input
    Precision
      result:        ap_fixed<16,6>
  fc1
    Precision
      weight:        ap_fixed<6,1>
      bias:          ap_fixed<6,1>
    ReuseFactor:     1
    Trainable:       True
-----------------------------------
```

Note that we could now specify the trainability of a specific layer by running something like:
```python
config['LayerName']['fc1']['Trainable'] = False
```

#### 3.2.2. Hacking the converter method
The function that effectively converts our keras/qkeras model into actual c++ code is the `hls4ml.converters.convert_from_keras_model` method. So let's open the file `~/ETHW/hls4ml/converters/__init__.py`. We can see the definition for `convert_from_keras_model` is there.

In our tutorial (jupyter notebook) we see that we invoke this instruction and pass the config dictionary we generated in the previous step by using the `hls_config` flag, like so:
```python
hls_model = hls4ml.converters.convert_from_keras_model(model,
                                                       hls_config=config,
                                                       output_dir=f'{model_name}/hls4ml_prj',
                                                       part='xcu250-figd2104-2L-e')
```

Now, looking at the definition of this method `convert_from_keras_model` we see that apart from checking some stuff, the bulk of the conversion is actually run at the last line, in the return statement itself, which invokes `keras_to_hls(config)`. This is what we want to check, so let's fetch the definition for that function. Open file `~/ETHW/hls4ml/converters/keras_to_hls.py` and search for `keras_to_hls`.
The part that actually writes out the files (cpp files) is `hls_model.compile()`. We see that our hls_model object is generated when we invoke `convert_from_keras_model`. Thus, let's actually open `~/ETHW/hls4ml/model/hls_model.py` and look at the compile method for HLSModel. In here you will see a `self.write()` method. Follow that path. Inside the `write` method you will see we are calling a `self.config.writer.write_hls`. We can see that the config object (HLSConfig) is defined in the `__init__` method when we create the HLSModel object (`self.config = HLSConfig(config)`), so let's go to the definition of the `HLSConfig` class.
In the `HLSConfig` `__init__` method we see we initialize the writer attribute by calling a function `get_writer`. This function is imported from `hls4ml.writer.get_writer` so let's open file `~/ETHW/hls4ml/writer/__init__.py` and take a look at the definitions of the writers. Here we see that we are registering the class `VivadoWriter` as `Vivado`. Cool. In here we are also importing the `get_writer` method from `~/ETHW/hls4ml/writer/writers.py`. Let's open this file. We can see that get_writer points to `VivadoWriter` which is imported from `~/ETHW/hls4ml/writer/vivado_writer.py`. And finally, here, in this file, is where we have everything we will need to tweak in the `VivadoWriter` to make sure we pull the right templates we modified to add the backprop functionalities. Keep this file open, cause we will need it in a second.
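The lookup chain above boils down to a name-to-class registry: a writer class is registered under a string, and the config later asks for an instance by that string. The sketch below is a simplified stand-in written for illustration only. The function names mirror the ones mentioned above, but the bodies and `DummyVivadoWriter` are assumptions, not the hls4ml source.

```python
# Simplified stand-in for a writer registry (not the hls4ml implementation)
_writers = {}


def register_writer(name, writer_cls):
    """Associate a backend name with a writer class."""
    if name in _writers:
        raise ValueError(f'Writer {name} already registered')
    _writers[name] = writer_cls


def get_writer(name):
    """Return a new instance of the writer registered under name."""
    return _writers[name]()


class DummyVivadoWriter:
    """Hypothetical writer that only reports what it would do."""

    def write_hls(self, model):
        return f'Writing HLS project for {model}'


register_writer('Vivado', DummyVivadoWriter)
writer = get_writer('Vivado')
print(writer.write_hls('myproject'))
```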
Now, let's go back to the `compile` method inside `hls_model.py`. We can see that inside this method most of the work is done by the `write` method, which in turn invokes the `writer` (remember, the `VivadoWriter` object we just saw) and its `write_hls` method. So let's go back to the `VivadoWriter` class definition and search for this `write_hls` method. You should see something like this:
```python
...
    def write_hls(self, model):
        print('Writing HLS project')
        self.write_project_dir(model)
        self.write_project_cpp(model)
        self.write_project_header(model)
        self.write_weights(model)
        self.write_defines(model)
        self.write_parameters(model)
        self.write_test_bench(model)
        self.write_bridge(model)
        self.write_build_script(model)
        self.write_nnet_utils(model)
        self.write_yml(model)
        self.write_tar(model)
        print('Done')
...
```

Let's see what these steps are doing, one by one. We might skip those that are irrelevant for us.
**write_project_dir**:
This method simply creates the folder structure `{}/firmware/weights` in the output directory. We don't need to modify this.

**write_project_cpp**:
This method, on the other hand, is probably going to take us quite some time to analyze. So let's take it step by step to make sure we aren't skipping anything important.

The method starts with something very simple:
- We open a template file for reading (f)
- We open the output cpp code we will generate for writing (fout).

```python
    def write_project_cpp(self, model):
        ###################
        ## myproject.cpp
        ###################

        filedir = os.path.dirname(os.path.abspath(__file__))
        f = open(os.path.join(filedir,'../templates/vivado/firmware/myproject.cpp'),'r')
        fout = open('{}/firmware/{}.cpp'.format(model.config.get_output_dir(), model.config.get_project_name()),'w')

        ...
```
Let's take a look at this template we are picking and see what's defined in there. Open `~/ETHW/hls4ml/templates/vivado/firmware/myproject.cpp`. First of all, this file looks pretty empty for a template, but something interesting can be seen in it. We can see some interesting comments like:
```cpp
//hls-fpga-machine-learning insert header
```

I'm anticipating that these will be fetched by the writer, with the generated code inserted in place. This is a good method to populate template files. Keep this in mind, because in the future we might have to add our own comments to insert parts of our backpropagation structure.
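To make the marker idea concrete, here is a minimal sketch of comment-driven template filling. It is not the hls4ml writer: `fill_template` and the `snippets` mapping are hypothetical names, and the toy template only borrows the marker text seen in `myproject.cpp`.

```python
MARKER = '//hls-fpga-machine-learning insert '


def fill_template(template_lines, snippets):
    """Copy lines through, replacing each marker line that has a snippet registered."""
    out = []
    for line in template_lines:
        stripped = line.strip()
        if stripped.startswith(MARKER):
            key = stripped[len(MARKER):]
            # Marker lines without a registered snippet are kept unchanged
            out.append(snippets.get(key, line))
        else:
            out.append(line)
    return ''.join(out)


template = [
    '#include "myproject.h"\n',
    '//hls-fpga-machine-learning insert header\n',
    'void myproject() {}\n',
]
result = fill_template(template, {'header': '// generated header\n'})
print(result)
```

Adding a new insertion point then only takes a new marker comment in the template and a matching entry on the writer side.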
Let's go back to the `write_project_cpp` method and keep going. After the previous block, we basically fetch the model inputs, outputs and brams.
```python
...
        model_inputs = model.get_input_variables()
        model_outputs = model.get_output_variables()
        model_brams = model.get_bram_variables()

        indent = ' '
...
```

Now, after this, we basically start looping thru each one of the lines in our template file and parsing them. As we expected, depending on what that line contains (or, more specifically, if that line contains a specific comment), we might write some specific definition for our neural network or just copy the original one.
First check: if we find `myproject` in the line, we replace that with the project name (which, to be fair, will be `myproject` most of the time, unless the user changed it). We don't need to change anything here.
```python
...
        for line in f.readlines():
            #Add headers to weights and biases
            if 'myproject' in line:
                newline = line.replace('myproject', model.config.get_project_name())
...
```

Second check: if we find the comment `//hls-fpga-machine-learning insert header` we will insert the header.
```python
...
            elif '//hls-fpga-machine-learning insert header' in line:
                inputs_str = ', '.join([self.variable_definition_cpp(model, i, as_reference=True) for i in model_inputs])
                outputs_str = ', '.join([self.variable_definition_cpp(model, o, as_reference=True) for o in model_outputs])
                brams_str = ', \n'.join([indent + self.variable_definition_cpp(model, b, as_reference=False) for b in model_brams])
                insize_str = ', '.join(['unsigned short &const_size_in_{}'.format(i) for i in range(1, len(model_inputs) + 1)])
                outsize_str = ', '.join(['unsigned short &const_size_out_{}'.format(i) for i in range(1, len(model_outputs) + 1)])

                newline = ''
                newline += indent + inputs_str + ',\n'
                newline += indent + outputs_str + ',\n'
                if len(model_brams) > 0:
                    newline += brams_str + ',\n'
                newline += indent + insize_str + ',\n'
                newline += indent + outsize_str + '\n'
...
```

What does the header consist of? Let's take a closer look at the changes we are applying. Here we can see we are basically defining the ports of our cpp module. For instance, for each model_input we call the `self.variable_definition_cpp` method. Let's find the definition of this method and see what it does:
```python
...
    def variable_definition_cpp(self, model, var, name_suffix='', as_reference=False):
        var_class = var.__class__.__name__
        if var_class == 'ArrayVariable':
            return '{type} {name}{suffix}[{shape}]'.format(type=var.type.name, name=var.cppname, suffix=name_suffix, shape=var.size_cpp())
        elif var_class == 'StreamVariable':
            if as_reference: # Function parameter
                return 'hls::stream<{type}> &{name}{suffix}'.format(type=var.type.name, name=var.cppname, suffix=name_suffix)
            else: # Declaration
                return 'hls::stream<{type}> {name}{suffix}("{name}")'.format(type=var.type.name, name=var.cppname, suffix=name_suffix)
        elif var_class == 'WeightVariable':
            return '{type} {name}{suffix}[{size}]'.format(type=var.type.name, name=var.cppname, suffix=name_suffix, size=var.data_length)
        elif var_class == 'InplaceVariable':
            return None
        else:
            raise Exception('Unknown variable class "{}"'.format(var_class))
...
```

For instance, for the simple linear NN we created with 1 input and 1 output, and 2 weights, the fc1_input layer of the network will be represented as an "ArrayVariable" in the model_inputs var. Its type is `input_t`, which was defined previously. Cppname will be `fc1_input`. Name suffix should be empty. Size_cpp() should return `N_INPUT_1_1`. We might have to go back to the definition of these types, as well as the sizes, but for now, let's skip this. In any case, we see that the definition for this variable should be `input_t fc1_input[N_INPUT_1_1]`, which is exactly what we get if we execute `self.variable_definition_cpp(model, model_inputs[0], as_reference=True)` in our python debugger.
The same thing applies to the outputs and the brams.
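The `ArrayVariable` branch can be tried in isolation, without hls4ml, by feeding the same format string a stand-in object. In the sketch below `array_definition` and the `SimpleNamespace` objects are hypothetical; only the format string is taken from the method shown above.

```python
from types import SimpleNamespace


def array_definition(var, name_suffix=''):
    """Same format string as the ArrayVariable branch of variable_definition_cpp."""
    return '{type} {name}{suffix}[{shape}]'.format(
        type=var.type.name, name=var.cppname, suffix=name_suffix, shape=var.size_cpp())


# Stand-ins carrying only the attributes the format string reads
fc1_input = SimpleNamespace(type=SimpleNamespace(name='input_t'),
                            cppname='fc1_input',
                            size_cpp=lambda: 'N_INPUT_1_1')
layer3_out = SimpleNamespace(type=SimpleNamespace(name='result_t'),
                             cppname='layer3_out',
                             size_cpp=lambda: 'N_LAYER_2')

print(array_definition(fc1_input))                     # input_t fc1_input[N_INPUT_1_1]
print(array_definition(layer3_out, name_suffix='_x'))  # result_t layer3_out_x[N_LAYER_2]
```

The `name_suffix` argument is what lets one variable yield several related declarations that share its type and shape.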
Okay, so the first thing we need to do is to modify the template for myproject.cpp that we opened before. Open it again and go to the part that has the following definition.
```cpp
...
#include "myproject.h"
#include "parameters.h"

void myproject(
...
```

In here, we basically want to add the following comment: `//hls-fpga-machine-learning insert autograd-def`
⚠️⚠️⚠️⚠️⚠️⚠️⚠️ **CHANGE IN hls4ml!!!!!**: `CHANGELOG_a.6` ⚠️⚠️⚠️⚠️⚠️⚠️⚠️
```cpp
...
#include "myproject.h"
#include "parameters.h"

//hls-fpga-machine-learning insert autograd-def
void myproject(
...
```

This is because, if our network needs to be trainable, we will replace this comment with the following block, which, as you can see, simply includes the definition of the losses.
```cpp
// -------------------------- AUTOGRAD --------------------------
// [@manuelbv]: Manually including parameters for autograd
#include "losses/losses_parameters.h"
// --------------------------------------------------------------
```

Great. Now that we have that, let's go back to the `vivado_writer.py` file and add a check for this comment in the `write_project_cpp` method. We will add this right below the `if 'myproject' in line:` check.
⚠️⚠️⚠️⚠️⚠️⚠️⚠️ **CHANGE IN hls4ml!!!!!**: `CHANGELOG_a.7` ⚠️⚠️⚠️⚠️⚠️⚠️⚠️
```python
...
            elif '//hls-fpga-machine-learning insert autograd-def' in line:
                # [@manuelbv]: CHANGELOG_a.7 If this is a trainable network, include autograd losses definition
                if model.config.config['HLSConfig']['Model']['Trainable']:
                    newline = "// -------------------------- AUTOGRAD --------------------------\n"
                    newline += "// [@manuelbv]: Manually including parameters for autograd\n"
                    newline += '#include "losses/losses_parameters.h"\n'
                    newline += "// --------------------------------------------------------------\n"

...
```

The next change we have to make to hls4ml: if this is a trainable network, we must declare two new ports, the loss and the ground truth value. The loss will hold the loss value computed after the forward pass for whatever data we fed to the network thru the inputs. It's of the same type as the result (output of the network), except that its size will always be 1, because it's a single number. We might want to change this in the future, because using the output type for the loss might not be ideal; maybe a predefined fixed-point type would be better. For now, this just makes everything easier. The ground truth is very easy to implement too, because it's basically a clone (type-wise) of the output(s) of the network.
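To see what the resulting port list would look like, the sketch below builds it for the dummy network from plain tuples. `build_ports` is a hypothetical helper, not the writer code. The port order (loss, sizes, ground truth) follows the change described in this section, and the 4-space `indent` default is an assumption.

```python
def build_ports(inputs, outputs, trainable, indent='    '):
    """Build the port list of the top function.

    inputs/outputs are lists of (type_name, cpp_name, size_macro) tuples.
    """
    ports = [f'{t} {n}[{s}]' for t, n, s in inputs + outputs]
    if trainable:
        # One scalar loss per output, passed by reference
        ports += [f'{t} &loss_{n}' for t, n, _ in outputs]
    ports += [f'unsigned short &const_size_in_{i}' for i in range(1, len(inputs) + 1)]
    ports += [f'unsigned short &const_size_out_{i}' for i in range(1, len(outputs) + 1)]
    if trainable:
        # Ground truth mirrors each output, type- and shape-wise
        ports += [f'{t} {n}_ground_truth[{s}]' for t, n, s in outputs]
    return ',\n'.join(indent + p for p in ports)


inputs = [('input_t', 'fc1_input', 'N_INPUT_1_1')]
outputs = [('result_t', 'layer3_out', 'N_LAYER_2')]
print(build_ports(inputs, outputs, trainable=True))
```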
Thus, let's go back to the `elif '//hls-fpga-machine-learning insert header' in line:` check in the `vivado_writer.py` file and let's modify it to the following code:
⚠️⚠️⚠️⚠️⚠️⚠️⚠️ **CHANGE IN hls4ml!!!!!**: `CHANGELOG_a.8` ⚠️⚠️⚠️⚠️⚠️⚠️⚠️
```python
...
            elif '//hls-fpga-machine-learning insert header' in line:
                inputs_str = ', '.join([self.variable_definition_cpp(model, i, as_reference=True) for i in model_inputs])
                outputs_str = ', '.join([self.variable_definition_cpp(model, o, as_reference=True) for o in model_outputs])
                brams_str = ', \n'.join([indent + self.variable_definition_cpp(model, b, as_reference=False) for b in model_brams])

                #[@manuelbv]: CHANGELOG_a.8 If this is a trainable network, then we need to add the loss definitions as pointers
                _autograd_loss_definition = lambda var: '{type} &loss_{name}'.format(type=var.type.name, name=var.cppname)
                loss_str = ', '.join([_autograd_loss_definition(o) for o in model_outputs])
                ground_truth_str = ', '.join([self.variable_definition_cpp(model, o, as_reference=True, name_suffix="_ground_truth") for o in model_outputs])

                insize_str = ', '.join(['unsigned short &const_size_in_{}'.format(i) for i in range(1, len(model_inputs) + 1)])
                outsize_str = ', '.join(['unsigned short &const_size_out_{}'.format(i) for i in range(1, len(model_outputs) + 1)])

                newline = ''
                newline += indent + inputs_str + ',\n'
                newline += indent + outputs_str + ',\n'
                if len(model_brams) > 0:
                    newline += brams_str + ',\n'
                # [@manuelbv]: If model is trainable, add loss + ground truth IOs
                if model.config.config['HLSConfig']['Model']['Trainable']:
                    newline += indent + loss_str + ',\n'
                    newline += indent + insize_str + ',\n'
                    newline += indent + outsize_str + ',\n'
                    newline += indent + ground_truth_str + '\n'
                else:
                    newline += indent + insize_str + ',\n'
                    newline += indent + outsize_str + '\n'
...
```

Let's go back to the template for a second and add a second comment that will allow us to include some definitions for the backprop layer wrappers. We will change the following section:
```cpp
...
#endif

    // ****************************************
    // NETWORK INSTANTIATION
    // ****************************************

    //hls-fpga-machine-learning insert layers
}
...
```

and we will add the following comment: `//hls-fpga-machine-learning autograd-layer-wrappers`
⚠️⚠️⚠️⚠️⚠️⚠️⚠️ **CHANGE IN hls4ml!!!!!**: `CHANGELOG_a.9` ⚠️⚠️⚠️⚠️⚠️⚠️⚠️
```cpp
...
#endif

    // ****************************************
    // NETWORK INSTANTIATION
    // ****************************************

    //hls-fpga-machine-learning insert layers
    //hls-fpga-machine-learning autograd-layer-wrappers
}
...
```

And now I just realized that we forgot to add something to the configuration in `hls4ml/utils/config.py`: the loss definition. Open `~/ETHW/hls4ml/utils/config.py` and go to where we defined `model_config['Trainable'] = trainable` (CHANGELOG_a.5). Let's add the following after that:
⚠️⚠️⚠️⚠️⚠️⚠️⚠️ **CHANGE IN hls4ml!!!!!**: `CHANGELOG_a.10` ⚠️⚠️⚠️⚠️⚠️⚠️⚠️
```python
# [@manuelbv]: CHANGELOG_a.10 Add definition of the losses for future use when instantiating the grads and losses
model_config['Losses'] = list(model.loss)
```
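One thing to watch with `list(model.loss)`: as far as I can tell, Keras stores in `model.loss` whatever was passed to `compile(loss=...)`, and calling `list` on a plain string splits it into characters. If the model was compiled with `loss='mse'`, a later comparison against `"mse"` would then never match. The sketch below shows the effect and one possible normalization; `normalize_losses` is a hypothetical helper, not part of hls4ml or Keras.

```python
def normalize_losses(loss):
    """Return the compiled loss(es) as a list with one entry per output."""
    if isinstance(loss, str) or callable(loss):
        return [loss]                  # single loss shared by the model
    if isinstance(loss, dict):
        return list(loss.values())     # losses keyed by output name
    return list(loss)                  # already a list or tuple


print(list('mse'))                # ['m', 's', 'e']
print(normalize_losses('mse'))    # ['mse']
print(normalize_losses(['mse']))  # ['mse']
```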
Now that we have that comment in the template, let's modify the writer code again and add a new condition that parses this comment and inserts whatever we need. Go back to the `vivado_writer.py` file and add the following code after the `//hls-fpga-machine-learning insert layers` check:
⚠️⚠️⚠️⚠️⚠️⚠️⚠️ **CHANGE IN hls4ml!!!!!**: `CHANGELOG_a.11` ⚠️⚠️⚠️⚠️⚠️⚠️⚠️
```python
...
elif '//hls-fpga-machine-learning autograd-layer-wrappers' in line:
# [@manuelbv]: CHANGELOG_a.11 If this is a trainable network, include the definition of the autograd layer wrappers
if model.config.config['HLSConfig']['Model']['Trainable']:
# [@manuelbv]: Add a comment for later change traceability
newline = ' ' + "// -------------------------------- AUTOGRAD ---------------------------------\n"
newline += ' ' + '// [@manuelbv]: Instantiation of grads and computation of loss for each output\n'
newline += ' ' + "// ---------------------------------------------------------------------------\n"
# [@manuelbv]: Get outputs
outputs = model.get_output_variables()
# [@manuelbv]: Get losses
losses = model.config.config['HLSConfig']['Model']['Losses']
# [@manuelbv]: Loop thru outputs
for no,(o,lo) in enumerate(zip(outputs,losses)):
grad_str = self.variable_definition_cpp(model, o, as_reference=True, name_suffix="_grads")
# [@manuelbv]: Add definition of grad var
newline += ' ' + f'// [@manuelbv]: Definition of the gradient for output {o.cppname}\n'
newline += ' ' + f'{grad_str};\n'
# [@manuelbv]: Now check if ground truth is a valid pointer (valid data), if not, loss will be zero and grad will not be applied
ground_truth_str = f'{o.cppname}_ground_truth'
newline += ' ' + f'if ({ground_truth_str} != nullptr) ' + "{\n"
# [@manuelbv]: Add a placeholder for printing out information in case we want it
newline += ''.join([' ']*2) + f'// [@manuelbv]: Uncomment this for debugging\n'
newline += ''.join([' ']*2) + f'//std::cout << "Ground truth passed to nnet thru output {o.cppname} seems valid, computing loss" << std::endl ;\n'
# [@manuelbv]: Instantiation of the actual loss computation
if lo == "mse":
newline += ''.join([' ']*2) + f'losses::mse<{o.type.name}, mse_config>({o.cppname}, {ground_truth_str}, loss_{o.cppname}, {o.cppname}_grads);\n'
else:
raise ValueError(f"Loss {lo} not implemented yet.")
newline += ' ' + '} else {\n'
newline += ''.join([' ']*2) + f'// [@manuelbv]: Uncomment this for debugging\n'
newline += ''.join([' ']*2) + f'//std::cout << "Ground truth passed to nnet thru output {o.cppname} is invalid (nullptr). Loss=0. Not performing backprop." << std::endl ;\n'
if lo == "mse":
newline += ''.join([' ']*2) + f'losses::mse<{o.type.name}, mse_config>({o.cppname}, {o.cppname}, loss_{o.cppname}, {o.cppname}_grads);\n'
else:
raise ValueError(f"Loss {lo} not implemented yet.")
newline += ' ' + '}\n'
# [@manuelbv] At the end we want something like this:
"""
//[@manuelbv]: Instantiate the loss and grads
result_t grads_layer3_out[N_LAYER_2];
if(layer3_ground_truth != nullptr) {
//std::cout << "Ground truth passed to nnet seems valid, computing loss" << std::endl ;
losses::mse(layer3_out,layer3_ground_truth,loss,grads_layer3_out);
} else {
//std::cout << "Ground truth passed to nnet is nullptr, invalid loss" << std::endl ;
losses::mse(layer3_out,layer3_out,loss,grads_layer3_out);
}
"""
# [@manuelbv]: Now we need to add the actual backpropagation
inputs = model.get_input_variables()
outputs = model.get_output_variables()
grads = []
newline += "\n" + ' ' + "// [@manuelbv]: Backpass\n"
#print("----")
#newline = ""
for layer in reversed(model.get_layers()):
layer_type = layer.attributes['class_name']
layer_name = layer.attributes['name']
vars = layer.get_variables()
layer_config_cpp = layer.config_cpp()
layer_config = dict()
if layer_config_cpp:
layer_config_cpp = layer_config_cpp.split("struct ")[1].split(" :")[0]
print(layer_config_cpp)
if layer_type.lower() == 'activation':
layer_config['activation'] = layer.attributes['activation']
elif layer_type.lower() == 'dense' or layer_type.lower() == 'qdense':
layer_config['WnB'] = [layer.weights['weight'].cppname, layer.weights['bias'].cppname]
for var in vars:
grads.append([layer_name, layer_type, var.cppname, var.type.name, var.size_cpp(), layer_config_cpp, layer_config])
# if var not in inputs and var not in outputs:
# print(var.type.name)
# else:
# print(f"Input/Output: {var.type.name}")
if len(grads) > 1:
# Make declaration with previous grads
prev_grad = grads[-2]
prev_grad_layer_type = prev_grad[1]
prev_grad_outvar_name = prev_grad[2]
prev_grad_outvar_type = prev_grad[3]
prev_grad_config = prev_grad[5]
prev_grad_layer_config = prev_grad[6]
this_grad_name = f"{var.cppname}_grads"
this_grad_size = var.size_cpp()
this_grad_type = var.type.name
this_grad_def = f"{this_grad_type} {this_grad_name}[{this_grad_size}];"
#print(this_grad_def)
newline += ' ' + this_grad_def + '\n'
if prev_grad_layer_type.lower() == 'activation':
if prev_grad_layer_config['activation'] == 'linear':
backpass_def = 'linear_backpass'
else:
raise ValueError(f"Activation {prev_grad_layer_config['activation']} not implemented yet")
backpass = f"nnet::{backpass_def}<{prev_grad_outvar_type}, {this_grad_type}, {prev_grad_config}>({prev_grad_outvar_name}_grads, {this_grad_name}); // {prev_grad[0]}"
elif prev_grad_layer_type.lower() == 'dense' or prev_grad_layer_type.lower() == 'qdense':
print(prev_grad_layer_config)
WnB = prev_grad_layer_config['WnB']
backpass = f"nnet::dense_backpass<{prev_grad_outvar_type}, {this_grad_type}, {prev_grad_config}>({prev_grad_outvar_name}_grads, {this_grad_name}, {var.cppname}, {', '.join(WnB)}); // {prev_grad[0]}"
else:
raise ValueError(f"Unknown type of layer {prev_grad_layer_type.lower()}")
newline += ' ' + backpass + '\n\n'
# print(prev_grad, layer_type,layer_name)
#print("----")
newline += "\n\n"
#print(newline)
"""
// [@manuelbv]: Backpass
// see: https://mattmazur.com/2015/03/17/a-step-by-step-backpropagation-example/

// linear
result_t grads_layer2_out[N_LAYER_2];
nnet::linear_backpass(grads_layer3_out, grads_layer2_out); // fc1_linear

// fc1
result_t grads_fc1_out[N_LAYER_2];
nnet::dense_backpass(grads_layer2_out, grads_fc1_out, fc1_input, w2, b2); // fc1
"""
...
```

Now, if we try to run the previous code, we will face an error telling us that the file `losses/losses_parameters.h` was not found, which is true. That's a custom file we created, but hls4ml is not aware of it. So we need to create a couple of soft links (or simply copy the folders; I prefer links) in the vivado templates folder inside hls4ml, so these dependencies can be used while compiling our model.
```shell
ln -sf /mnt/raid0/asic/projects/NU/ETHW/manuelbv/include/losses ~/ETHW/hls4ml/templates/vivado/losses
ln -sf /mnt/raid0/asic/projects/NU/ETHW/manuelbv/include/autograd ~/ETHW/hls4ml/templates/vivado/autograd
```

Alright, let's keep it moving. We need to do something else before we try to compile this, because right now the dependencies for autograd are not automatically copied. First, let's make sure we include them in the `build_lib.sh` script when creating it. To do so, open `~/ETHW/hls4ml/writer/vivado_writer.py` and go to the `write_build_script` method definition. Find the part that creates the `build_lib.sh` file and add the two lines below (after my comment), which add the `autograd` and `losses` folders to `INCFLAGS` so the compiler knows where to find those headers.
⚠️⚠️⚠️⚠️⚠️⚠️⚠️ **CHANGE IN hls4ml!!!!!**: `CHANGELOG_a.12` ⚠️⚠️⚠️⚠️⚠️⚠️⚠️
```python
...
###################
# build_lib.sh
###################

f = open(os.path.join(filedir,'../templates/vivado/build_lib.sh'),'r')
fout = open('{}/build_lib.sh'.format(model.config.get_output_dir()),'w')

for line in f.readlines():
    line = line.replace('myproject', model.config.get_project_name())
    line = line.replace('mystamp', model.config.get_config_value('Stamp'))
    # [@manuelbv]: CHANGELOG_a.12 -> Make sure we add all dependencies to the output build script, if trainable
    if model.config.config['HLSConfig']['Model']['Trainable']:
        line = line.replace('INCFLAGS="-Ifirmware/ap_types/"', 'INCFLAGS="-Ifirmware/ap_types/ -Ifirmware/autograd/ -Ifirmware/losses/"')

    fout.write(line)
f.close()
fout.close()
...
```

Let's now create a new method to copy the autograd dependencies over to the output firmware folder, in case we need them. In `~/ETHW/hls4ml/writer/vivado_writer.py`, create this method after `write_nnet_utils`:
⚠️⚠️⚠️⚠️⚠️⚠️⚠️ **CHANGE IN hls4ml!!!!!**: `CHANGELOG_a.13` ⚠️⚠️⚠️⚠️⚠️⚠️⚠️
```python
# [@manuelbv]: CHANGELOG_a.13 -> Adding the autograd dependencies to the output folder
def write_autograd_utils(self, model):
    ###################
    ## autograd utilities
    ###################

    filedir = os.path.dirname(os.path.abspath(__file__))
    if model.config.config['HLSConfig']['Model']['Trainable']:
        for ffdep in ["autograd","losses"]:
            srcpath = os.path.join(filedir,f'../templates/vivado/{ffdep}/')
            dstpath = f'{model.config.get_output_dir()}/firmware/{ffdep}/'

            if os.path.exists(dstpath):
                rmtree(dstpath)

            copytree(srcpath, dstpath)
```
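The copy step in `write_autograd_utils` can be tried in isolation before touching hls4ml. The following is a minimal sketch, assuming plain directories stand in for `templates/vivado` and the project's `firmware` folder; `copy_dependencies` is a hypothetical helper written for this illustration, not an hls4ml function.

```python
import os
import tempfile
from shutil import copytree, rmtree


def copy_dependencies(src_root, dst_root, names=("autograd", "losses")):
    """Copy each dependency folder from src_root into dst_root, replacing old copies."""
    copied = []
    for name in names:
        src = os.path.join(src_root, name)
        dst = os.path.join(dst_root, name)
        if os.path.exists(dst):
            # copytree refuses to write into an existing directory by default
            rmtree(dst)
        copytree(src, dst)
        copied.append(dst)
    return copied


# Toy run on temporary directories standing in for the real folders
with tempfile.TemporaryDirectory() as tmp:
    templates = os.path.join(tmp, "templates")
    firmware = os.path.join(tmp, "firmware")
    for name in ("autograd", "losses"):
        os.makedirs(os.path.join(templates, name))
        with open(os.path.join(templates, name, "dummy.h"), "w") as fh:
            fh.write("// placeholder header\n")
    os.makedirs(firmware)
    copy_dependencies(templates, firmware)
    copy_dependencies(templates, firmware)  # second call exercises the rmtree branch
    print(sorted(os.listdir(firmware)))
```

Removing the destination first keeps repeated `write_hls` calls from failing on an output folder that already exists.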
One more change in the `vivado_writer.py` file. We need to invoke this method when we call `write_hls`. The final block should look like this:
⚠️⚠️⚠️⚠️⚠️⚠️⚠️ **CHANGE IN hls4ml!!!!!**: `CHANGELOG_a.14` ⚠️⚠️⚠️⚠️⚠️⚠️⚠️
```python
def write_hls(self, model):
    print('Writing HLS project')
    self.write_project_dir(model)
    self.write_project_cpp(model)
    self.write_project_header(model)
    self.write_weights(model)
    self.write_defines(model)
    self.write_parameters(model)
    self.write_test_bench(model)
    self.write_bridge(model)
    self.write_build_script(model)
    self.write_nnet_utils(model)
    #[@manuelbv]: CHANGELOG_a.14: Added newly defined method to copy autograd/losses definitions to output dir
    self.write_autograd_utils(model)
    self.write_yml(model)
    self.write_tar(model)
    print('Done')
```
Now let's open the `~/ETHW/hls4ml/templates/vivado/firmware/myproject.h` template and add the `//hls-fpga-machine-learning insert autograd-headers` line under `#include "defines.h"` so we can add the header files for backprop.
⚠️⚠️⚠️⚠️⚠️⚠️⚠️ **CHANGE IN hls4ml!!!!!**: `CHANGELOG_a.15` ⚠️⚠️⚠️⚠️⚠️⚠️⚠️
```cpp
...
#include "defines.h"

//hls-fpga-machine-learning insert autograd-headers
...
```

Inside `vivado_writer.py`, in the `write_project_header` function, let's add another check that searches for `//hls-fpga-machine-learning insert autograd-headers`. If we find it, we parse the different types of losses and layers in our design and add the headers for the required backprop wrappers.
⚠️⚠️⚠️⚠️⚠️⚠️⚠️ **CHANGE IN hls4ml!!!!!**: `CHANGELOG_a.16` ⚠️⚠️⚠️⚠️⚠️⚠️⚠️
```python
...
elif '//hls-fpga-machine-learning insert autograd-headers' in line:
    #[@manuelbv]: CHANGELOG_a.16 If we find the "//hls-fpga-machine-learning insert autograd-headers" comment, and this is a trainable network
    # then include the autograd headers.
    newline = ''
    if model.config.config['HLSConfig']['Model']['Trainable']:
        newline = '// -------------------------- AUTOGRAD --------------------------\n'
        newline += '// [@manuelbv]: Manually including definitions for autograd\n'
        newline += '#include "autograd/autograd_defines.h"\n\n'
        # [@manuelbv]: Parse the different types of losses used
        losses = model.config.config['HLSConfig']['Model']['Losses']
        losses = list(np.unique(losses))
        if len(losses) > 0:
            newline += '// [@manuelbv]: Manually importing losses\n'
            for l in losses:
                if l == 'mse':
                    newline += '#include "losses/mse.h"\n'
                else:
                    raise ValueError(f"Unknown loss {l}")
            newline += "\n"
        # [@manuelbv]: Import backprop implementations for diff layer types
        newline += "// [@manuelbv]: Import backprop implementations\n"

        inputs = model.get_input_variables()
        outputs = model.get_output_variables()
        # [@manuelbv]: List with all types of layers we are using in the model
        unique_layer_types = ['activation']
        for layer in model.get_layers():
            vars = layer.get_variables()
            for var in vars:
                if var not in inputs and var not in outputs:
                    layer_type = layer.attributes['class_name']
                    if layer_type not in unique_layer_types:
                        unique_layer_types.append(layer_type)
        for ut in unique_layer_types:
            if ut.lower() == "qdense" or ut.lower() == "dense":
                newline += '#include "autograd/nnet_dense_backprop.h"\n'
            elif ut.lower() == "activation" or ut.lower() == "relu":
                newline += '#include "autograd/nnet_activation_backprop.h"\n'
            else:
                raise ValueError(f"Unknown type of layer {ut}")
        # [@manuelbv]: Finally, close
        newline += "// --------------------------------------------------------------\n"
...
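```

Still in `write_project_header`, modify the `//hls-fpga-machine-learning insert header` check so the loss and ground truth IO ports are added to the header definition:
⚠️⚠️⚠️⚠️⚠️⚠️⚠️ **CHANGE IN hls4ml!!!!!**: `CHANGELOG_a.17` ⚠️⚠️⚠️⚠️⚠️⚠️⚠️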
```python
...
elif '//hls-fpga-machine-learning insert header' in line:
    inputs_str = ', '.join([self.variable_definition_cpp(model, i, as_reference=True) for i in model_inputs])
    outputs_str = ', '.join([self.variable_definition_cpp(model, o, as_reference=True) for o in model_outputs])
    brams_str = ', \n'.join([indent + self.variable_definition_cpp(model, b, as_reference=False) for b in model_brams])
    #[@manuelbv]: CHANGELOG_a.17 If this is a trainable network, then add loss + ground truth IOs in header
    _autograd_loss_definition = lambda var: '{type} &loss_{name}'.format(type=var.type.name, name=var.cppname)
    loss_str = ', '.join([_autograd_loss_definition(o) for o in model_outputs])
    ground_truth_str = ', '.join([self.variable_definition_cpp(model, o, as_reference=True, name_suffix="_ground_truth") for o in model_outputs])
    insize_str = ', '.join(['unsigned short &const_size_in_{}'.format(i) for i in range(1, len(model_inputs) + 1)])
    outsize_str = ', '.join(['unsigned short &const_size_out_{}'.format(o) for o in range(1, len(model_outputs) + 1)])

    newline = ''
    newline += indent + inputs_str + ',\n'
    newline += indent + outputs_str + ',\n'
    if len(model_brams) > 0:
        newline += brams_str + ',\n'
    # [@manuelbv]: If model is trainable, add loss + ground truth IOs
    if model.config.config['HLSConfig']['Model']['Trainable']:
        newline += indent + loss_str + ',\n'
        newline += indent + insize_str + ',\n'
        newline += indent + outsize_str + ',\n'
        newline += indent + ground_truth_str + ' = nullptr\n'
    else:
        newline += indent + insize_str + ',\n'
        newline += indent + outsize_str + '\n'
...
```

Let's continue modifying the templates. Two other files are generated and used when compiling the model: `myproject_bridge.cpp` and `myproject_test.cpp`. Let's modify them to incorporate the new definitions for the trainable networks, starting with the bridge.
Open the file `~/ETHW/hls4ml/templates/vivado/myproject_bridge.cpp` and take a look at it. Now go back to the `vivado_writer.py` file and go to the function `write_bridge` definition. The changes we need to apply here are pretty similar to the ones we've applied before to create `myproject.cpp`. Take a look at the check statement `elif '//hls-fpga-machine-learning insert header' in line:`. We will modify this to look like the following:
⚠️⚠️⚠️⚠️⚠️⚠️⚠️ **CHANGE IN hls4ml!!!!!**: `CHANGELOG_a.18` ⚠️⚠️⚠️⚠️⚠️⚠️⚠️
```python
...
elif '//hls-fpga-machine-learning insert header' in line:
    dtype = line.split('#', 1)[1].strip()
    inputs_str = ', '.join(['{type} {name}[{shape}]'.format(type=dtype, name=i.cppname, shape=i.size_cpp()) for i in model_inputs])
    outputs_str = ', '.join(['{type} {name}[{shape}]'.format(type=dtype, name=o.cppname, shape=o.size_cpp()) for o in model_outputs])
    #[@manuelbv]: CHANGELOG_a.18 If this is a trainable network, then add loss + ground truth IOs in header
    _autograd_loss_definition = lambda var: '{type} &loss_{name}'.format(type=dtype, name=var.cppname)
    loss_str = ', '.join([_autograd_loss_definition(o) for o in model_outputs])
    ground_truth_str = ', '.join([f'{dtype} {o.cppname}_ground_truth[{o.size_cpp()}]' for o in model_outputs])
    insize_str = ', '.join(['unsigned short &const_size_in_{}'.format(i) for i in range(1, len(model_inputs) + 1)])
    outsize_str = ', '.join(['unsigned short &const_size_out_{}'.format(o) for o in range(1, len(model_outputs) + 1)])

    newline = ''
    newline += indent + inputs_str + ',\n'
    newline += indent + outputs_str + ',\n'

    # [@manuelbv]: If model is trainable, add loss + ground truth IOs
    if model.config.config['HLSConfig']['Model']['Trainable']:
        newline += indent + loss_str + ',\n'
        newline += indent + insize_str + ',\n'
        newline += indent + outsize_str + ',\n'
        newline += indent + ground_truth_str + '\n'
    else:
        newline += indent + insize_str + ',\n'
        newline += indent + outsize_str + '\n'
...
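```

Next, still in `write_bridge`, the `//hls-fpga-machine-learning insert wrapper` check needs the loss and ground truth variables when instantiating the top module:
⚠️⚠️⚠️⚠️⚠️⚠️⚠️ **CHANGE IN hls4ml!!!!!**: `CHANGELOG_a.19` ⚠️⚠️⚠️⚠️⚠️⚠️⚠️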
```python
...
elif '//hls-fpga-machine-learning insert wrapper' in line:
    dtype = line.split('#', 1)[1].strip()
    newline = ''
    for i in model_inputs:
        newline += indent + '{var};\n'.format(var=self.variable_definition_cpp(model, i, name_suffix='_ap'))
        newline += indent + 'nnet::convert_data<{}, {}, {}>({}, {}_ap);\n'.format(dtype, i.type.name, i.size_cpp(), i.cppname, i.cppname)
    newline += '\n'
    for o in model_outputs:
        newline += indent + '{var};\n'.format(var=self.variable_definition_cpp(model, o, name_suffix='_ap'))
        if model.config.config['HLSConfig']['Model']['Trainable']:
            newline += indent + '{type} loss_{name}_{name_suffix};\n'.format(type=o.type.name, name=o.cppname, name_suffix="ap")
            newline += indent + '{type} {name}_ground_truth_{name_suffix}[{thisshape}];\n'.format(type=o.type.name, name=o.cppname, name_suffix="ap", thisshape=o.size_cpp())
    newline += '\n'

    input_size_vars = ','.join(['const_size_in_{}'.format(i) for i in range(1, len(model.get_input_variables()) + 1)])
    output_size_vars = ','.join(['const_size_out_{}'.format(o) for o in range(1, len(model.get_output_variables()) + 1)])
    input_vars = ','.join([i.cppname + '_ap' for i in model.get_input_variables()])
    bram_vars =','.join([b.cppname for b in model.get_bram_variables()])
    output_vars = ','.join([o.cppname + '_ap' for o in model.get_output_variables()])
    # Concatenate the input, output, and bram variables. Filter out empty/null values
    all_vars = ','.join(filter(None, [input_vars, output_vars, bram_vars]))

    #[@manuelbv]: CHANGELOG_a.19 If this is a trainable network, then add loss + ground truth IOs in instantiation of top module
    _autograd_loss_definition = lambda var: 'loss_{name}_{name_suffix}'.format(name=var.cppname, name_suffix="ap")
    loss_str = ', '.join([_autograd_loss_definition(o) for o in model_outputs])
    ground_truth_str = ', '.join([f'{o.cppname}_ground_truth_ap' for o in model_outputs])
    if model.config.config['HLSConfig']['Model']['Trainable']:
        top_level = indent + '{}({},{},{},{},{});\n'.format(model.config.get_project_name(), all_vars, loss_str, input_size_vars, output_size_vars, ground_truth_str)
    else:
        top_level = indent + '{}({},{},{});\n'.format(model.config.get_project_name(), all_vars, input_size_vars, output_size_vars)
    newline += top_level

    newline += '\n'
    for o in model_outputs:
        newline += indent + 'nnet::convert_data<{}, {}, {}>({}_ap, {});\n'.format(o.type.name, dtype, o.size_cpp(), o.cppname, o.cppname)
        if model.config.config['HLSConfig']['Model']['Trainable']:
            newline += indent + 'nnet::convert_data<{}, {}, {}>({}_ground_truth_ap, {}_ground_truth);\n'.format(o.type.name, dtype, o.size_cpp(), o.cppname, o.cppname)
            newline += indent + 'nnet::convert_data<{}, {}, 1>(&loss_{}_ap, &loss_{});\n'.format(o.type.name, dtype, o.cppname, o.cppname)
...
```

Let's now modify the testbench that gets generated by hls4ml so we can add the parts that train the network automatically. I'll do this in one go: instead of listing individual changes, here is what your `~/ETHW/hls4ml/templates/vivado/myproject_test.cpp` template needs to look like:
⚠️⚠️⚠️⚠️⚠️⚠️⚠️ **CHANGE IN hls4ml!!!!!**: `CHANGELOG_a.20` ⚠️⚠️⚠️⚠️⚠️⚠️⚠️
```cpp
//
// rfnoc-hls-neuralnet: Vivado HLS code for neural-net building blocks
//
// Copyright (C) 2017 EJ Kreinar
//
// This program is free software: you can redistribute it and/or modify
// it under the terms of the GNU General Public License as published by
// the Free Software Foundation, either version 3 of the License, or
// (at your option) any later version.
//
// This program is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU General Public License for more details.
//
// You should have received a copy of the GNU General Public License
// along with this program. If not, see <http://www.gnu.org/licenses/>.
//
#include <fstream>
#include <iostream>
#include <algorithm>
#include <vector>
#include <map>
#include <stdio.h>
#include <stdlib.h>
#include <math.h>

#include "firmware/myproject.h"
#include "firmware/nnet_utils/nnet_helpers.h"

//hls-fpga-machine-learning insert bram
//hls-fpga-machine-learning insert autograd-helpers-include
#define CHECKPOINT 5000
namespace nnet {
bool trace_enabled = true;
std::map<std::string, void *> *trace_outputs = NULL;
size_t trace_type_size = sizeof(double);
}

int main(int argc, char **argv)
{
//load input data from text file
std::ifstream fin("tb_data/tb_input_features.dat");
//load predictions from text file
std::ifstream fpr("tb_data/tb_output_predictions.dat");

#ifdef RTL_SIM
std::string RESULTS_LOG = "tb_data/rtl_cosim_results.log";
#else
std::string RESULTS_LOG = "tb_data/csim_results.log";
#endif
std::ofstream fout(RESULTS_LOG);

//hls-fpga-machine-learning insert autograd-output-file-declaration
std::string iline;
std::string pline;
int e = 0;

if (fin.is_open() && fpr.is_open()) {
while ( std::getline(fin,iline) && std::getline (fpr,pline) ) {
if (e % CHECKPOINT == 0) std::cout << "Processing input " << e << std::endl;
char* cstr=const_cast<char*>(iline.c_str());
char* current;
std::vector<float> in;
current=strtok(cstr," ");
while(current!=NULL) {
in.push_back(atof(current));
current=strtok(NULL," ");
}
cstr=const_cast<char*>(pline.c_str());
std::vector<float> pr;
current=strtok(cstr," ");
while(current!=NULL) {
pr.push_back(atof(current));
current=strtok(NULL," ");
}

//hls-fpga-machine-learning insert data
//hls-fpga-machine-learning insert top-level-function
if (e % CHECKPOINT == 0) {
//hls-fpga-machine-learning insert autograd_custom_printing
std::cout << "Predictions" << std::endl;
//hls-fpga-machine-learning insert predictions
std::cout << "Quantized predictions" << std::endl;
//hls-fpga-machine-learning insert quantized
}
e++;

//hls-fpga-machine-learning insert tb-output
}
fin.close();
fpr.close();
//hls-fpga-machine-learning insert autograd-output-file-closure
} else {
std::cout << "INFO: Unable to open input/predictions file, using default input." << std::endl;

//hls-fpga-machine-learning insert zero
//hls-fpga-machine-learning insert top-level-function
//hls-fpga-machine-learning insert output
//hls-fpga-machine-learning insert tb-output
}
fout.close();
std::cout << "INFO: Saved inference results to file: " << RESULTS_LOG << std::endl;

return 0;
}
```
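The test bench above reads `tb_data/tb_input_features.dat` and `tb_data/tb_output_predictions.dat` line by line and splits each line on single spaces. Below is a minimal sketch of writing and re-reading files in that layout, which can help when preparing dummy data by hand. `write_tb_file` and `read_tb_file` are hypothetical helpers for illustration only, and the sample values are made up.

```python
import os
import tempfile


def write_tb_file(path, rows):
    """Write one sample per line, values separated by single spaces."""
    with open(path, "w") as fh:
        for row in rows:
            fh.write(" ".join(repr(float(v)) for v in row) + "\n")


def read_tb_file(path):
    """Parse the file like the test bench does: split on spaces, convert to float."""
    with open(path) as fh:
        return [[float(tok) for tok in line.split(" ") if tok.strip()] for line in fh]


# Toy round trip with two made-up samples of three features each
with tempfile.TemporaryDirectory() as tmp:
    features_path = os.path.join(tmp, "tb_input_features.dat")
    write_tb_file(features_path, [[0.5, 1.0, -2.0], [3.0, 4.0, 5.5]])
    print(read_tb_file(features_path))
```

Both files need the same number of lines, since the `while` loop in the test bench stops as soon as either `std::getline` call fails.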
By now you know the procedure. This is what the `write_test_bench` function inside the `vivado_writer.py` file needs to look like:
⚠️⚠️⚠️⚠️⚠️⚠️⚠️ **CHANGE IN hls4ml!!!!!**: `CHANGELOG_a.21` ⚠️⚠️⚠️⚠️⚠️⚠️⚠️
```python
...
elif '//hls-fpga-machine-learning insert bram' in line:
newline = line
for bram in model.get_bram_variables():
newline += '#include \"firmware/weights/{}.h\"\n'.format(bram.cppname)
# [@manuelbv]: We added the following extra check to add the helpers for backprop
elif '//hls-fpga-machine-learning insert autograd-helpers-include' in line:
newline = line
newline += '#include "firmware/autograd/trainer_helpers.h"\n'
...
```
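All the writer changes in this section follow the same pattern: read the template line by line, and when a marker comment is found, emit extra code next to it. The following is a minimal, self-contained sketch of that pattern for the `autograd-helpers-include` marker. `expand_template` is a hypothetical stand-in for the real writer loop, and gating the include on a `trainable` flag is an assumption of this sketch (the snippet above adds it unconditionally).

```python
MARKER = "//hls-fpga-machine-learning insert autograd-helpers-include"


def expand_template(template_lines, trainable):
    """Expand the marker comment in a template, mimicking the writer loop."""
    out = []
    for line in template_lines:
        if MARKER in line:
            # Keep the marker itself, then append the generated include
            newline = line
            if trainable:
                newline += '#include "firmware/autograd/trainer_helpers.h"\n'
        else:
            # Any other line is copied through untouched
            newline = line
        out.append(newline)
    return "".join(out)


template = [
    '#include "firmware/myproject.h"\n',
    MARKER + "\n",
    "#define CHECKPOINT 5000\n",
]
print(expand_template(template, trainable=True))
```

Keeping the marker in the output, as the original writer does, leaves a trace in the generated file of where the inserted code came from.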
## Complete CHANGELOG: hls4ml
This list should contain ALL changes I made to hls4ml. In case I forgot something: every time I change something, I add a comment with the `#[@manuelbv]` tag before it, so if you grep for that tag in the hls4ml directory, you should see every change I made.
* `CHANGELOG_a.1`: I added the flag "trainable = False" to allow the user to implement trainable layers in `~/ETHW/hls4ml/hls4ml/utils/config.py`, on method `config_from_keras_model`.
* `CHANGELOG_a.2`: Inside file `~/ETHW/hls4ml/hls4ml/utils/config.py`, on method `config_from_keras_model`, inside the condition `if layer['class_name'] in dense_layers + conv_layers:` I am passing the `trainable` flag to the configuration of each specific layer, so that at the end the configuration dictionary contains the trainable flag.
* `CHANGELOG_a.3`: The same as `_a.2` but for activation layers.
* `CHANGELOG_a.4`: The same as `_a.2` and `_a.3` but for qkeras_layers.
* `CHANGELOG_a.5`: Inside file `~/ETHW/hls4ml/hls4ml/utils/config.py`, on method `config_from_keras_model`, I'm adding the trainable flag also to the global configuration of the whole network itself (apart from layer by layer).
* `CHANGELOG_a.6`: Change the vivado template for the top myproject.cpp definition in `~/ETHW/hls4ml/templates/vivado/firmware/myproject.cpp` and add the `//hls-fpga-machine-learning insert autograd-def` comment so we can later on add the definition of the losses.
* `CHANGELOG_a.7`: Added checking for `//hls-fpga-machine-learning insert autograd-def` in `~/ETHW/hls4ml/writer/vivado_writer.py` file. If trainable, then we basically include the losses header definition to the final cpp file.
* `CHANGELOG_a.8`: Modified the `//hls-fpga-machine-learning insert header` check statement in `~/ETHW/hls4ml/writer/vivado_writer.py` file to make sure we are adding the loss + ground truth IO definitions to the top module definition.
* `CHANGELOG_a.9`: Change the vivado template for the top myproject.cpp definition in `~/ETHW/hls4ml/templates/vivado/firmware/myproject.cpp` and add the `//hls-fpga-machine-learning autograd-layer-wrappers` comment so we can later on add the definition of the auto-grad layer wrappers.
* `CHANGELOG_a.10`: I added the definition of the losses in `~/ETHW/hls4ml/hls4ml/utils/config.py`, on method `config_from_keras_model`.
* `CHANGELOG_a.11`: Implementation of the autograd in the main myproject.cpp output file. This is defined in the `//hls-fpga-machine-learning autograd-layer-wrappers` condition in `~/ETHW/hls4ml/writer/vivado_writer.py`. This is basically instantiating the loss layer + backprop layers (the propagation of the gradient from last layer back).
* `CHANGELOG_a.12`: Modified the `~/ETHW/hls4ml/writer/vivado_writer.py`, method `write_build_script`, to make sure we modify the build_lib.sh script to include the `autograd` and `losses` libraries when compiling.
* `CHANGELOG_a.13`: Created a new method `write_autograd_utils` in `~/ETHW/hls4ml/writer/vivado_writer.py`, to make sure we copy the autograd/losses dependencies to the project output dir.
* `CHANGELOG_a.14`: In the `write_hls` method, we need to invoke the `write_autograd_utils` we created in `~/ETHW/hls4ml/writer/vivado_writer.py`. Let's do this after `write_nnet_utils` so the autograd/losses dependencies are copied out.
* `CHANGELOG_a.15`: Modify the `~/ETHW/hls4ml/templates/vivado/firmware/myproject.h` template and add the `hls-fpga-machine-learning insert autograd-headers` line under `#include "defines.h"` so we can add the header files for backprop.
* `CHANGELOG_a.16`: Inside the `vivado_writer`, function `write_project_header`, let's add another check and try to search for `//hls-fpga-machine-learning insert autograd-headers`. If we find it, then we parse the different types of losses and layers in our design, and add the headers for the required backprop wrappers.
* `CHANGELOG_a.17`: Modified the `//hls-fpga-machine-learning insert header` check statement in `~/ETHW/hls4ml/writer/vivado_writer.py` in the `write_project_header` so we add the IO ports for Loss & ground truth to the header definition.
* `CHANGELOG_a.18`: Modified the `elif '//hls-fpga-machine-learning insert header' in line:` check statement in `~/ETHW/hls4ml/writer/vivado_writer.py` in the `write_bridge` to add the loss and ground truth IO defs.
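* `CHANGELOG_a.19`: Modified the `elif '//hls-fpga-machine-learning insert wrapper' in line:` check statement in `~/ETHW/hls4ml/writer/vivado_writer.py` in the `write_bridge` to add the loss and ground truth IOs to the instantiation of the top module.
* `CHANGELOG_a.20`: Added `//hls-fpga-machine-learning insert autograd-helpers-include` to the `myproject_test.cpp` template in hls4ml, so we can add the include headers to the testbench automatically generated by hls4ml.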
* `CHANGELOG_a.21`: Added the check for `//hls-fpga-machine-learning insert autograd-helpers-include` in `vivado_writer.py`.
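
To cross-check this list against the code, the tagged comments can be collected with a small script instead of a manual grep. This is a minimal sketch, assuming the changes live in `.py`, `.cpp`, `.h` and `.sh` files; `find_tagged_lines` is a hypothetical helper, and the demo runs on a temporary folder rather than on a real hls4ml checkout.

```python
import os
import tempfile

TAG = "[@manuelbv]"


def find_tagged_lines(root, tag=TAG, extensions=(".py", ".cpp", ".h", ".sh")):
    """Return (path, line number, text) for every line under root containing tag."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in sorted(filenames):
            if not filename.endswith(extensions):
                continue
            path = os.path.join(dirpath, filename)
            with open(path, encoding="utf-8", errors="replace") as fh:
                for lineno, line in enumerate(fh, start=1):
                    if tag in line:
                        hits.append((path, lineno, line.strip()))
    return hits


# Toy run on a temporary tree; point `root` at your hls4ml checkout instead
with tempfile.TemporaryDirectory() as root:
    with open(os.path.join(root, "vivado_writer.py"), "w", encoding="utf-8") as fh:
        fh.write("x = 1\n# [@manuelbv]: CHANGELOG_a.13 demo line\n")
    for path, lineno, text in find_tagged_lines(root):
        print(f"{os.path.basename(path)}:{lineno}: {text}")
```

Searching for the tag without the leading `#` also catches the `//`-style comments in the C++ templates.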