https://github.com/impetus-udes/rule4ml

Resource Utilization and Latency Estimation for ML on FPGA.

fpga hls keras machine-learning neural-network onnx prediction python pytorch regression-models

Host: GitHub
URL: https://github.com/impetus-udes/rule4ml
Owner: IMPETUS-UdeS
License: gpl-3.0
Created: 2024-07-03T13:57:48.000Z (7 months ago)
Default Branch: main
Last Pushed: 2025-01-13T20:44:42.000Z (17 days ago)
Last Synced: 2025-01-13T21:31:45.264Z (17 days ago)
Topics: fpga, hls, keras, machine-learning, neural-network, onnx, prediction, python, pytorch, regression-models
Language: Python
Homepage:
Size: 2.82 MB
Stars: 5
Watchers: 2
Forks: 1
Open Issues: 0
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE


[![License](https://img.shields.io/badge/License-GPL_3.0-red.svg)](https://opensource.org/license/gpl-3-0)

# rule4ml: Resource Utilization and Latency Estimation for ML

`rule4ml` is a tool designed for pre-synthesis estimation of FPGA resource utilization and inference latency for machine learning models.

## Installation

`rule4ml` releases are uploaded to the Python Package Index for easy installation via `pip`.

```bash
pip install rule4ml
```

This installs only the [base package](https://github.com/IMPETUS-UdeS/rule4ml/tree/main/rule4ml) and its dependencies for resource and latency prediction. The [data_gen](https://github.com/IMPETUS-UdeS/rule4ml/tree/main/data_gen/) scripts and the [Jupyter notebooks](https://github.com/IMPETUS-UdeS/rule4ml/tree/main/notebooks) must be cloned from the repo if needed. The data generation dependencies are listed separately in [data_gen/requirements.txt](https://github.com/IMPETUS-UdeS/rule4ml/tree/main/data_gen/requirements.txt).

## Getting Started

### Tutorial
To get started with `rule4ml`, please refer to the detailed Jupyter Notebook [tutorial](https://github.com/IMPETUS-UdeS/rule4ml/tree/main/notebooks/tutorial.ipynb). This tutorial covers:

- Using pre-trained estimators for resources and latency predictions.
- Generating synthetic datasets.
- Training and testing your own predictors.

### Usage
Here's a quick example of how to use `rule4ml` to estimate resources and latency for a given model:

```python
import keras
from keras.layers import Input, Dense, Activation

from rule4ml.models.estimators import MultiModelEstimator

# Example of a simple keras Model
input_size = 16
inputs = Input(shape=(input_size,))
x = Dense(32, activation="relu")(inputs)
x = Dense(32, activation="relu")(x)
x = Dense(32, activation="relu")(x)
outputs = Dense(5, activation="softmax")(x)

model_to_predict = keras.Model(inputs=inputs, outputs=outputs, name="Jet Classifier")
model_to_predict.build((None, input_size)) # building keras models is required

# Loading default predictors
estimator = MultiModelEstimator()
estimator.load_default_models()

# MultiModelEstimator predictions are formatted as a pandas DataFrame
prediction_df = estimator.predict(model_to_predict)

# Further formatting can be applied to organize the DataFrame
if not prediction_df.empty:
    prediction_df = prediction_df.groupby(
        ["Model", "Board", "Strategy", "Precision", "Reuse Factor"], observed=True
    ).mean()  # each row is unique in the groupby, mean() is only called to convert DataFrameGroupBy

# Outside of Jupyter notebooks, we recommend saving the DataFrame as HTML for better readability
prediction_df.to_html("keras_example.html")
```

**keras_example.html** (truncated)

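| Model | Board | Strategy | Precision | Reuse Factor | BRAM (%) | DSP (%) | FF (%) | LUT (%) | CYCLES |
|---|---|---|---|---|---|---|---|---|---|
| Jet Classifier | pynq-z2 | Latency | `ap_fixed<2, 1>` | 1 | 2.77 | 0.89 | 2.63 | 30.02 | 54.68 |
| | | | | 2 | 2.75 | 0.86 | 2.62 | 29.91 | 55.84 |
| | | | | 4 | 2.70 | 0.79 | 2.58 | 29.80 | 55.78 |
| | | | | 8 | 2.97 | 0.67 | 2.49 | 29.79 | 68.84 |
| | | | | 16 | 2.97 | 0.63 | 2.50 | 30.24 | 75.38 |
| | | | | 32 | 2.26 | 0.74 | 2.43 | 30.90 | 76.19 |
| | | | | 64 | 0.83 | 0.47 | 2.19 | 32.89 | 112.04 |
| | | | `ap_fixed<8, 3>` | 1 | 2.63 | 1.58 | 13.91 | 115.89 | 53.96 |
| | | | | 2 | 2.63 | 1.50 | 13.63 | 111.75 | 54.70 |
| | | | | 4 | 2.59 | 1.25 | 13.07 | 108.52 | 56.16 |
| | | | | 8 | 2.76 | 1.41 | 12.22 | 108.01 | 53.07 |
| | | | | 16 | 3.42 | 1.96 | 11.98 | 104.58 | 64.71 |
| | | | | 32 | 2.99 | 1.93 | 12.74 | 94.71 | 83.06 |
| | | | | 64 | 0.56 | 1.70 | 14.74 | 92.78 | 104.88 |
| | | | `ap_fixed<16, 6>` | 1 | 1.78 | 199.86 | 45.96 | 184.86 | 66.59 |
| | | | | 2 | 2.30 | 198.30 | 45.71 | 190.51 | 68.14 |
| | | | | 4 | 2.38 | 198.50 | 45.95 | 195.05 | 73.15 |
| | | | | 8 | 1.48 | 175.18 | 46.42 | 188.65 | 95.70 |
| | | | | 16 | 2.90 | 83.85 | 48.13 | 184.96 | 101.44 |
| | | | | 32 | 4.43 | 51.04 | 51.83 | 193.38 | 141.07 |
| | | | | 64 | 0.75 | 30.32 | 55.36 | 193.26 | 229.37 |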
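
Because the predictions are returned as a pandas DataFrame, ordinary pandas filtering can be used to shortlist configurations. The following is a minimal sketch, not part of `rule4ml`: it copies four rows of the table above by hand instead of calling the estimator, and the helper name `fastest_fitting` and the 100% threshold are illustrative assumptions.

```python
import pandas as pd

# Four rows copied by hand from the example table (not produced by rule4ml here).
rows = [
    {"Precision": "ap_fixed<2, 1>", "Reuse Factor": 1,
     "BRAM (%)": 2.77, "DSP (%)": 0.89, "FF (%)": 2.63, "LUT (%)": 30.02, "CYCLES": 54.68},
    {"Precision": "ap_fixed<8, 3>", "Reuse Factor": 1,
     "BRAM (%)": 2.63, "DSP (%)": 1.58, "FF (%)": 13.91, "LUT (%)": 115.89, "CYCLES": 53.96},
    {"Precision": "ap_fixed<8, 3>", "Reuse Factor": 64,
     "BRAM (%)": 0.56, "DSP (%)": 1.70, "FF (%)": 14.74, "LUT (%)": 92.78, "CYCLES": 104.88},
    {"Precision": "ap_fixed<16, 6>", "Reuse Factor": 1,
     "BRAM (%)": 1.78, "DSP (%)": 199.86, "FF (%)": 45.96, "LUT (%)": 184.86, "CYCLES": 66.59},
]
df = pd.DataFrame(rows).set_index(["Precision", "Reuse Factor"])

RESOURCE_COLUMNS = ["BRAM (%)", "DSP (%)", "FF (%)", "LUT (%)"]


def fastest_fitting(predictions: pd.DataFrame, limit: float = 100.0) -> pd.DataFrame:
    """Keep rows whose predicted resources all stay within `limit` percent, fastest first."""
    fits = (predictions[RESOURCE_COLUMNS] <= limit).all(axis=1)
    return predictions[fits].sort_values("CYCLES")


candidates = fastest_fitting(df)
print(candidates)
```

The sketch assumes that a predicted utilization above 100% means the design is not expected to fit on the board, so only the rows with every resource at or below the limit are kept.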

## Datasets
Training accurate predictors requires large datasets of synthesized neural networks. We used [hls4ml](https://github.com/fastmachinelearning/hls4ml) to synthesize neural networks generated with parameters randomly sampled from predefined ranges (defaults of data classes in the code). Our models' training data is publicly available at [https://borealisdata.ca/dataverse/rule4ml](https://borealisdata.ca/dataverse/rule4ml).
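
As an illustration of sampling network parameters from predefined ranges, here is a minimal sketch. The `NetworkRanges` data class, its fields, and its default ranges are hypothetical and are not the ones used in `data_gen`; the sketch only shows the general idea of keeping the ranges as data class defaults and drawing a random architecture from them.

```python
import random
from dataclasses import dataclass


@dataclass
class NetworkRanges:
    """Hypothetical sampling ranges; NOT rule4ml's actual defaults."""
    num_layers: tuple = (2, 6)      # inclusive min/max number of dense layers
    units: tuple = (8, 128)         # inclusive min/max units per layer
    activations: tuple = ("relu", "tanh", "sigmoid")


def sample_network(ranges: NetworkRanges, rng: random.Random) -> list:
    """Draw one random dense-network description as a list of layer dicts."""
    depth = rng.randint(*ranges.num_layers)  # randint is inclusive on both ends
    return [
        {"units": rng.randint(*ranges.units), "activation": rng.choice(ranges.activations)}
        for _ in range(depth)
    ]


rng = random.Random(0)  # fixed seed so the draw is reproducible
network = sample_network(NetworkRanges(), rng)
print(network)
```

Each sampled description would then still need to be built as a real model and synthesized (with hls4ml, in the authors' case) to obtain a labeled training example.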

## Limitations
In their current iteration, the predictors accept [Keras](https://keras.io/about/) or [PyTorch](https://pytorch.org/tutorials/beginner/basics/quickstart_tutorial.html) models and estimate FPGA resource utilization (**BRAM**, **DSP**, **FF**, **LUT**) and latency (**Clock Cycles**) for various synthesis configurations. However, the models used for training are limited to specific layers: **Dense/Linear**, **ReLU**, **Tanh**, **Sigmoid**, **Softmax**, **BatchNorm**, **Add**, **Concatenate**, and **Dropout**. They are also constrained by synthesis parameters, notably **clock_period** (10 ns) and **io_type** (io_parallel). Inputs outside these configurations may result in inaccurate predictions.
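
Before trusting a prediction, it can help to check whether a model stays within the layer list above. The following is a minimal sketch and not part of `rule4ml`: the set of class names is an assumed mapping of that list onto common Keras and PyTorch class names, and the stand-in classes exist only so the example runs without either framework installed.

```python
# Assumed mapping of the supported-layer list onto common Keras / PyTorch class names.
SUPPORTED_LAYER_NAMES = {
    "InputLayer", "Dense", "Linear", "Activation", "ReLU", "Tanh", "Sigmoid", "Softmax",
    "BatchNormalization", "BatchNorm1d", "Add", "Concatenate", "Dropout",
}


def unsupported_layers(layers) -> list:
    """Return the sorted class names of layers that fall outside the supported set."""
    return sorted({type(layer).__name__ for layer in layers} - SUPPORTED_LAYER_NAMES)


# Stand-in classes so the sketch runs without Keras or PyTorch installed.
class Dense:
    pass


class Conv2D:
    pass


print(unsupported_layers([Dense(), Conv2D(), Dense()]))  # prints ['Conv2D']
```

For a Keras model, `model.layers` can be passed directly. The check looks only at class names, so it does not inspect activation functions passed as layer arguments, nor the synthesis parameters.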

## License
This project is licensed under the GPL-3.0 License. See the [LICENSE](https://github.com/IMPETUS-UdeS/rule4ml/tree/main/LICENSE) file for details.