{"id":20465489,"url":"https://github.com/ucb-bar/baremetal-nn","last_synced_at":"2026-03-17T02:43:22.590Z","repository":{"id":200232831,"uuid":"692600166","full_name":"ucb-bar/Baremetal-NN","owner":"ucb-bar","description":"Tool for converting PyTorch models into raw C codes with minimal dependency and some performance optimizations.","archived":false,"fork":false,"pushed_at":"2025-04-09T17:24:21.000Z","size":82084,"stargazers_count":30,"open_issues_count":2,"forks_count":5,"subscribers_count":13,"default_branch":"main","last_synced_at":"2025-04-09T18:28:23.354Z","etag":null,"topics":["c","neural-network","pytorch","risc-v"],"latest_commit_sha":null,"homepage":"https://ucb-bar.github.io/Baremetal-NN/","language":"C","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ucb-bar.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-09-17T01:40:39.000Z","updated_at":"2025-04-09T17:24:25.000Z","dependencies_parsed_at":"2024-11-15T13:29:22.973Z","dependency_job_id":null,"html_url":"https://github.com/ucb-bar/Baremetal-NN","commit_stats":null,"previous_names":["ucb-bar/baremetal-nn"],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ucb-bar%2FBaremetal-NN","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ucb-bar%2FBaremetal-NN/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ucb-bar%2FBaremetal-NN/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ucb-bar%2FBaremetal-NN/manifests
","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ucb-bar","download_url":"https://codeload.github.com/ucb-bar/Baremetal-NN/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248687705,"owners_count":21145755,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["c","neural-network","pytorch","risc-v"],"created_at":"2024-11-15T13:18:47.428Z","updated_at":"2026-03-17T02:43:22.584Z","avatar_url":"https://github.com/ucb-bar.png","language":"C","funding_links":[],"categories":[],"sub_categories":[],"readme":"![](docs/cover.png)\n\n# Baremetal-NN\n\n![CI-status](https://img.shields.io/github/actions/workflow/status/ucb-bar/Baremetal-NN/run-tests.yaml?branch=main\u0026style=flat-square\u0026label=CI\u0026logo=githubactions\u0026logoColor=fff)\n![API-Docs-status](https://img.shields.io/github/actions/workflow/status/ucb-bar/Baremetal-NN/build-docs.yaml?branch=main\u0026style=flat-square\u0026label=Docs\u0026logo=googledocs\u0026logoColor=fff)\n[![License](https://img.shields.io/badge/license-MIT-yellow.svg?style=flat-square\u0026label=License)](https://opensource.org/license/apache-2-0)\n\nBaremetal-NN is a tool for converting PyTorch models into raw C codes that can be executed standalone in a baremetal runtime on research chips. 
\n\n![](docs/overview.png)\n\n\u003e Note:\n\u003e After a discussion with [@iansseijelly](https://github.com/iansseijelly), we decided to switch to the simpler approach of assuming that arrays are contiguous and therefore using the shape directly to index into elements, instead of the more generic strided access. The previous strided implementation can be accessed on the [\"strided\"](https://github.com/ucb-bar/Baremetal-NN/tree/strided) branch.\n\n## Getting Started\n\nRefer to the [API Doc](https://ucb-bar.github.io/Baremetal-NN/nn_8h.html) for an overview of the available datatypes and functions.\n\n## Run Test\n\n### Building for x86\n\nFirst, we clean any previous builds.\n\n```bash\nrm -rf ./build/\n```\n\n```bash\ncmake -S ./ -B ./build/ -D CMAKE_BUILD_TYPE=Debug\ncmake --build ./build/ --target tests\n./build/tests/tests\n```\n\n### Building for RISC-V\n\nFirst, we clean any previous builds.\n\n```bash\nrm -rf ./build/\n```\n\n```bash\n# make sure $RISCV is set\ncmake -S ./ -B ./build/ -D CMAKE_BUILD_TYPE=Debug -D CMAKE_TOOLCHAIN_FILE=./riscv-gcc.cmake\ncmake --build ./build/ --target tests\nspike ./build/tests/tests.elf\n```\n\n### Building for RISC-V with Vector Support\n\nFirst, we clean any previous builds.\n\n```bash\nrm -rf ./build/\n```\n\n```bash\n# make sure $RISCV is set\ncmake -S ./ -B ./build/ -D CMAKE_BUILD_TYPE=Debug -D CMAKE_TOOLCHAIN_FILE=./riscv-gcc.cmake -D CONFIG_BACKEND_RISCV_V=ON\ncmake --build ./build/ --target tests\nspike --isa=rv64gcv_zicntr_zfh ./build/tests/tests.elf\n```\n\nRunning with FP16 support\n\n```bash\ncmake -S ./ -B ./build/ -D CMAKE_BUILD_TYPE=Debug -D CMAKE_TOOLCHAIN_FILE=./riscv-gcc.cmake -D CONFIG_BACKEND_RISCV_V=ON -D CONFIG_BACKEND_RISCV_ZVFH=ON\ncmake --build ./build/ --target tests\nspike --isa=rv64gcv_zicntr_zfh_zvfh ./build/tests/tests.elf\n```\n\nRunning with FP16 support with GCC\u003c14.0\n\nGCC\u003c14.0 does not support the FP16 intrinsics, so we need to use the assembly implementation. 
(TO BE FIXED)\n\n```bash\ncmake -S ./ -B ./build/ -D CMAKE_BUILD_TYPE=Debug -D CMAKE_TOOLCHAIN_FILE=./riscv-gcc.cmake -D CONFIG_BACKEND_RISCV_V=ON -D RISCV_ZVFH=ON -D RISCV_V_ASM=ON\ncmake --build ./build/ --target tests\nspike --isa=rv64gcv_zicntr_zfh_zvfh ./build/tests/tests.elf\n```\n\n### Building for RISC-V with Gemmini (Not working for now)\n\nFirst, we clean any previous builds.\n\n```bash\nrm -rf ./build/\n```\n\n```bash\ncmake -S ./ -B ./build/ -D CMAKE_BUILD_TYPE=Debug -D CMAKE_TOOLCHAIN_FILE=./riscv-gcc.cmake -D GEMMINI=ON\ncmake --build ./build/ --target all\nspike --extension=gemmini ./build/tests/tests.elf\n```\n\n### Building for K230 board\n\nFirst, we clean any previous builds.\n\n```bash\nrm -rf ./build/\n```\n\n```bash\ncmake -S ./ -B ./build/ -G \"Unix Makefiles\" -D CMAKE_TOOLCHAIN_FILE=./k230-gcc.cmake -D CMAKE_BUILD_TYPE=Debug -D RISCV_V=ON -D RISCV_V_ASM=ON\ncmake --build ./build/ --target all\n```\n\n### Cleaning build files\n\n```\ncmake --build ./build/ --target clean\n```\n\n### Cleaning CMake files\n\n```\nrm -rf ./build/\n```\n\n\n## Supported config flags\n\nCONFIG_DTYPE_ENABLE_F16: enable F16 support.\n\nCONFIG_DTYPE_ENABLE_I32: enable I32 support.\n\nCONFIG_BACKEND_RISCV_V: use the RISC-V Vector backend.\n\nCONFIG_BACKEND_RISCV_ZVFH: use the RISC-V Zvfh (vector half-precision floating-point) extension for the FP16 operations.\n\nCONFIG_DEBUG_RISCV_V_USE_REDOSUM: use REDOSUM for the reduction operation in RVV. 
By default, it uses REDUSUM.\n\n\n## Support matrix of backends and operators\n\n| Operator                | Variants  | Scalar CPU | RISC-V Vector | Gemmini |\n| ----------------------- | --------- | ---------- | ------------- | ------- |\n| min                     | F16       | ✅         | ✅ (ZVFH)    |         |\n|                         | F32       | ✅         | ✅           |         |\n| max                     | F16       | ✅         | ✅ (ZVFH)    |         |\n|                         | F32       | ✅         | ✅           |         |\n| add                     | I32       | ✅         | 🔜           |         |\n|                         | F16       | ✅         | ✅ (ZVFH)    |         |\n|                         | F32       | ✅         | ✅           |         |\n| addscalar               | F16       | ✅         | ✅ (ZVFH)    |         |\n|                         | F32       | ✅         | ✅           |         |\n| mul                     | F16       | ✅         | ❌ (ZVFH)    |         |\n|                         | F32       | ✅         | ❌           |         |\n| mulscalar               | F16       | ✅         | ❌ (ZVFH)    |         |\n|                         | F32       | ✅         | ❌           |         |\n| matmul (mm)             | F16       | ✅         | ✅ (ZVFH)    |         |\n|                         | F32       | ✅         | ✅           |         |\n| addmatmul (addmm)       | I32       | ✅         | 🔜           |         |\n|                         | F16       | ✅         | ✅ (ZVFH)    |         |\n|                         | F32       | ✅         | ✅           |         |\n| linear                  | F16       | ✅         | ✅ (ZVFH)    |         |\n|                         | F32       | ✅         | ✅           |         |\n| elu                     | F16       | ✅         |              |         |\n|                         | F32       | ✅         |              |         |\n| relu                    | F16       | ✅         | ✅ (ZVFH)   
 |         |\n|                         | F32       | ✅         | ✅           |         |\n| tanh                    | F16       | ✅         |              |         |\n|                         | F32       | ✅         |              |         |\n| softmax                 | F16       |            |              |         |\n|                         | F32       | ✅         |              |         |\n| scaled_dot_product_attention | F16       |            |              |         |\n|                         | F32       | ⚠️         |              |         |\n\n✅: supported\n\n⚠️: partially supported, failing on some tests\n\n❌: not supported\n\n🔜: planned\n\n\n## Convert the model\n\nFirst, we need to install the Baremetal-NN converter Python library.\n\n```bash\n# Install from PyPI\npip install baremetal-nn\n\n# Install locally\npip install -e ./baremetal-nn/\n```\n\n\n\nTo export a PyTorch model, we will use `TracedModule` from this converter library. Assuming the PyTorch model is named `model`, we will wrap it with `TracedModule`.\n\n```python\nfrom baremetal_nn import TracedModule\n\nm = TracedModule(model)\n```\n\nThen, we need to perform at least one round of inference to let the tool trace the entire forward flow and record the dimension and shape of each layer.\n\n```python\nexample_output = m.forward(example_input)\n```\n\nThe output content is not used. 
It is still a good idea to examine the output value to make sure that our model still functions correctly.\n\nFinally, we can convert the model to C files.\n\n```python\nm.convert(\n  output_directory=\"./\",\n  model_name=\"model\"\n)\n```\n\nWe should get `model.h` and `model.bin` files under the specified output directory.\n\nMore examples can be found in the `examples/` folder.\n\n\n\n## Memory layout\n\nBaremetal-NN uses the NHWC memory layout and supports tensors of up to 4 dimensions.\n\n**N**: batch, **H**: height, **W**: width, **C**: channels\n\n\n## Stats\n\n### Star History\n\n![](https://api.star-history.com/svg?repos=ucb-bar/Baremetal-NN\u0026type=Date\u0026theme=dark)\n\n\n## Acknowledgement\n\nIf you find this code useful, we would appreciate it if you would cite it as follows:\n\n```\n@software{baremetal-nn,\n  author = {Yufeng Chi},\n  title = {{Baremetal-NN: A tool for running PyTorch models in resource-constrained embedded environments.}},\n  url = {https://github.com/ucb-bar/Baremetal-NN},\n  year = {2024},\n  version = {0.2.0}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fucb-bar%2Fbaremetal-nn","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fucb-bar%2Fbaremetal-nn","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fucb-bar%2Fbaremetal-nn/lists"}