https://github.com/impetus-udes/rule4ml
Resource Utilization and Latency Estimation for ML on FPGA.
fpga hls keras machine-learning neural-network onnx prediction python pytorch regression-models resource-utilization surrogate-models vitis vivado
- Host: GitHub
- URL: https://github.com/impetus-udes/rule4ml
- Owner: IMPETUS-UdeS
- License: gpl-3.0
- Created: 2024-07-03T13:57:48.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-09-23T05:25:15.000Z (about 1 month ago)
- Last Synced: 2025-09-23T12:01:01.753Z (about 1 month ago)
- Topics: fpga, hls, keras, machine-learning, neural-network, onnx, prediction, python, pytorch, regression-models, resource-utilization, surrogate-models, vitis, vivado
- Language: Python
- Homepage:
- Size: 8.11 MB
- Stars: 15
- Watchers: 3
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
# rule4ml: Resource Utilization and Latency Estimation for ML
`rule4ml` is a tool designed for pre-synthesis estimation of FPGA resource utilization and inference latency for machine learning models.
## Installation
`rule4ml` releases are uploaded to the Python Package Index for easy installation via `pip`.
```bash
pip install rule4ml
```

This installs only the [base package](https://github.com/IMPETUS-UdeS/rule4ml/tree/main/rule4ml) and its dependencies for resource and latency prediction. The [data_gen](https://github.com/IMPETUS-UdeS/rule4ml/tree/main/data_gen/) scripts and the [Jupyter notebooks](https://github.com/IMPETUS-UdeS/rule4ml/tree/main/notebooks) must be cloned from the repo if needed. The data generation dependencies are listed separately in [data_gen/requirements.txt](https://github.com/IMPETUS-UdeS/rule4ml/tree/main/data_gen/requirements.txt).
## Getting Started
### Tutorial
To get started with `rule4ml`, please refer to the detailed Jupyter Notebook [tutorial](https://github.com/IMPETUS-UdeS/rule4ml/tree/main/notebooks/tutorial.ipynb). This tutorial covers:
- Using pre-trained estimators for resource and latency predictions.
- Generating synthetic datasets.
- Training and testing your own predictors.

### Usage
Here's a quick example of how to use `rule4ml` to estimate resources and latency for a given model:

```python
import keras
from keras.layers import Dense, Input

from rule4ml.models.estimators import MultiModelEstimator

# Example of a simple keras Model
input_size = 16
inputs = Input(shape=(input_size,))
x = Dense(32, activation="relu")(inputs)
x = Dense(32, activation="relu")(x)
x = Dense(32, activation="relu")(x)
outputs = Dense(5, activation="softmax")(x)

model_to_predict = keras.Model(inputs=inputs, outputs=outputs, name="Jet Classifier")
model_to_predict.build((None, input_size))  # building keras models is required

# Loading default predictors
estimator = MultiModelEstimator()
estimator.load_default_models()

# MultiModelEstimator predictions are formatted as a pandas DataFrame
prediction_df = estimator.predict(model_to_predict)

# Further formatting can be applied to organize the DataFrame
if not prediction_df.empty:
    prediction_df = prediction_df.groupby(
        ["Model", "Board", "Strategy", "Precision", "Reuse Factor"], observed=True
    ).mean()  # each row is unique in the groupby, mean() is only called to convert DataFrameGroupBy

# Outside of Jupyter notebooks, we recommend saving the DataFrame as HTML for better readability
prediction_df.to_html("keras_example.html")
```

**keras_example.html** (truncated)
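
| Model | Board | Strategy | Precision | Reuse Factor | BRAM (%) | DSP (%) | FF (%) | LUT (%) | CYCLES |
|---|---|---|---|---|---|---|---|---|---|
| Jet Classifier | pynq-z2 | Latency | `ap_fixed<2, 1>` | 1 | 2.77 | 0.89 | 2.63 | 30.02 | 54.68 |
| | | | | 2 | 2.75 | 0.86 | 2.62 | 29.91 | 55.84 |
| | | | | 4 | 2.70 | 0.79 | 2.58 | 29.80 | 55.78 |
| | | | | 8 | 2.97 | 0.67 | 2.49 | 29.79 | 68.84 |
| | | | | 16 | 2.97 | 0.63 | 2.50 | 30.24 | 75.38 |
| | | | | 32 | 2.26 | 0.74 | 2.43 | 30.90 | 76.19 |
| | | | | 64 | 0.83 | 0.47 | 2.19 | 32.89 | 112.04 |
| | | | `ap_fixed<8, 3>` | 1 | 2.63 | 1.58 | 13.91 | 115.89 | 53.96 |
| | | | | 2 | 2.63 | 1.50 | 13.63 | 111.75 | 54.70 |
| | | | | 4 | 2.59 | 1.25 | 13.07 | 108.52 | 56.16 |
| | | | | 8 | 2.76 | 1.41 | 12.22 | 108.01 | 53.07 |
| | | | | 16 | 3.42 | 1.96 | 11.98 | 104.58 | 64.71 |
| | | | | 32 | 2.99 | 1.93 | 12.74 | 94.71 | 83.06 |
| | | | | 64 | 0.56 | 1.70 | 14.74 | 92.78 | 104.88 |
| | | | `ap_fixed<16, 6>` | 1 | 1.78 | 199.86 | 45.96 | 184.86 | 66.59 |
| | | | | 2 | 2.30 | 198.30 | 45.71 | 190.51 | 68.14 |
| | | | | 4 | 2.38 | 198.50 | 45.95 | 195.05 | 73.15 |
| | | | | 8 | 1.48 | 175.18 | 46.42 | 188.65 | 95.70 |
| | | | | 16 | 2.90 | 83.85 | 48.13 | 184.96 | 101.44 |
| | | | | 32 | 4.43 | 51.04 | 51.83 | 193.38 | 141.07 |
| | | | | 64 | 0.75 | 30.32 | 55.36 | 193.26 | 229.37 |
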
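
Several rows in the example output report estimates above 100% for at least one resource (for instance LUT at `ap_fixed<8, 3>` with a reuse factor of 1), so a natural follow-up is to keep only the configurations predicted to fit on the board and rank them by latency. The sketch below is not part of `rule4ml`: it rebuilds a handful of rows from the example output by hand so that it runs without the package, and the helper name `fitting_configs` and the 100% threshold are assumptions made for illustration.

```python
import pandas as pd

RESOURCE_COLS = ["BRAM (%)", "DSP (%)", "FF (%)", "LUT (%)"]

# A few rows copied from the example output (pynq-z2, Latency strategy).
rows = [
    ("ap_fixed<2, 1>", 1, 2.77, 0.89, 2.63, 30.02, 54.68),
    ("ap_fixed<2, 1>", 64, 0.83, 0.47, 2.19, 32.89, 112.04),
    ("ap_fixed<8, 3>", 1, 2.63, 1.58, 13.91, 115.89, 53.96),
    ("ap_fixed<8, 3>", 32, 2.99, 1.93, 12.74, 94.71, 83.06),
    ("ap_fixed<8, 3>", 64, 0.56, 1.70, 14.74, 92.78, 104.88),
    ("ap_fixed<16, 6>", 64, 0.75, 30.32, 55.36, 193.26, 229.37),
]
df = pd.DataFrame(rows, columns=["Precision", "Reuse Factor", *RESOURCE_COLS, "CYCLES"])


def fitting_configs(predictions, max_util=100.0):
    """Keep rows where every resource estimate is <= max_util, sorted by cycles.

    Hypothetical helper: the threshold is an illustrative assumption.
    """
    fits = (predictions[RESOURCE_COLS] <= max_util).all(axis=1)
    return predictions[fits].sort_values("CYCLES").reset_index(drop=True)


candidates = fitting_configs(df)
print(candidates[["Precision", "Reuse Factor", "LUT (%)", "CYCLES"]])
```

When starting from the grouped `prediction_df` of the usage example, calling `prediction_df.reset_index()` first turns the index levels (`Precision`, `Reuse Factor`, and so on) back into ordinary columns, which is the layout this helper expects.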
## Datasets
Training accurate predictors requires large datasets of synthesized neural networks. We used [hls4ml](https://github.com/fastmachinelearning/hls4ml) to synthesize neural networks generated with parameters randomly sampled from predefined ranges (defaults of data classes in the code). Our models' training data is publicly available at [https://borealisdata.ca/dataverse/rule4ml](https://borealisdata.ca/dataverse/rule4ml).

## Limitations
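
The predictors cover only a fixed set of layer types, listed in this section, so it can help to screen a model before asking for estimates. The following is a minimal sketch under stated assumptions, not a `rule4ml` feature: the helper `unsupported_layers` is hypothetical, and the Keras/PyTorch class names in `SUPPORTED_LAYER_NAMES` are a guess at how the listed layers are spelled in each framework.

```python
# Class names assumed to correspond to the supported layers listed in this section.
# These spellings are an illustrative assumption, not taken from rule4ml itself.
SUPPORTED_LAYER_NAMES = {
    "InputLayer", "Dense", "Linear", "Activation",
    "ReLU", "Tanh", "Sigmoid", "Softmax",
    "BatchNormalization", "BatchNorm1d",
    "Add", "Concatenate", "Dropout",
}


def unsupported_layers(layer_names):
    """Return class names outside the supported set, in first-seen order, without duplicates."""
    flagged = []
    for name in layer_names:
        if name not in SUPPORTED_LAYER_NAMES and name not in flagged:
            flagged.append(name)
    return flagged


# For a Keras model, the names could be collected with:
#   [type(layer).__name__ for layer in model.layers]
example = ["InputLayer", "Dense", "ReLU", "Conv2D", "Flatten", "Dense", "Conv2D", "Softmax"]
print(unsupported_layers(example))
```

A generic `Activation` layer passes this name check even when it wraps an activation function outside the supported list, so the function it wraps would still need to be inspected separately.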
In their current iteration, the predictors can process [Keras](https://keras.io/about/) or [PyTorch](https://pytorch.org/tutorials/beginner/basics/quickstart_tutorial.html) models to generate FPGA resource (**BRAM**, **DSP**, **FF**, **LUT**) and latency (**Clock Cycles**) estimates for various synthesis configurations. However, the trained models support only specific layers: **Dense/Linear**, **ReLU**, **Tanh**, **Sigmoid**, **Softmax**, **BatchNorm**, **Add**, **Concatenate**, and **Dropout**. They are also constrained to fixed synthesis parameters, notably **clock_period** (10 ns) and **io_type** (io_parallel). Inputs outside these configurations may result in inaccurate predictions.

## License
This project is licensed under the GPL-3.0 License. See the [LICENSE](https://github.com/IMPETUS-UdeS/rule4ml/tree/main/LICENSE) file for details.