{"id":28710821,"url":"https://github.com/arm-software/cmsis-nn","last_synced_at":"2025-06-14T21:08:21.591Z","repository":{"id":63362200,"uuid":"498779556","full_name":"ARM-software/CMSIS-NN","owner":"ARM-software","description":"CMSIS-NN Library","archived":false,"fork":false,"pushed_at":"2025-05-19T08:11:50.000Z","size":7344,"stargazers_count":273,"open_issues_count":2,"forks_count":74,"subscribers_count":14,"default_branch":"main","last_synced_at":"2025-05-19T09:27:14.016Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"https://arm-software.github.io/CMSIS-NN","language":"C","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ARM-software.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2022-06-01T14:52:41.000Z","updated_at":"2025-05-19T08:10:58.000Z","dependencies_parsed_at":"2023-10-16T23:59:21.148Z","dependency_job_id":"998fba03-d059-46f0-afe1-94c1ebeb68f6","html_url":"https://github.com/ARM-software/CMSIS-NN","commit_stats":null,"previous_names":[],"tags_count":12,"template":false,"template_full_name":null,"purl":"pkg:github/ARM-software/CMSIS-NN","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ARM-software%2FCMSIS-NN","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ARM-software%2FCMSIS-NN/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ARM-software%2FCMSIS-NN/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ARM-software%2FCMSIS-NN/manifests","owner_url":"https://repos.ecosyst
e.ms/api/v1/hosts/GitHub/owners/ARM-software","download_url":"https://codeload.github.com/ARM-software/CMSIS-NN/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ARM-software%2FCMSIS-NN/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":259884525,"owners_count":22926446,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-06-14T21:08:19.932Z","updated_at":"2025-06-14T21:08:21.564Z","avatar_url":"https://github.com/ARM-software.png","language":"C","readme":"# CMSIS NN\nThe CMSIS NN software library is a collection of efficient neural network kernels developed to maximize the\nperformance and minimize the memory footprint of neural networks on Arm Cortex-M processors.\n\n## Supported Framework\nThe library follows the [int8](https://www.tensorflow.org/lite/performance/quantization_spec) and int16 quantization specification of TensorFlow Lite for Microcontrollers.\nThis means CMSIS-NN is bit-exact with TensorFlow Lite reference kernels. In some cases the TFL and TFLM reference kernels are not bit-exact with each other. In that case CMSIS-NN follows the TFLM reference kernels. The unit test readme provides an [overview](https://github.com/ARM-software/CMSIS-NN/blob/main/Tests/UnitTest/README.md#tests-depending-on-tflm-interpreter).\n\n## Branches and Tags\nThere is a single branch called 'main'.\nTags are created during a release. Two releases are planned per year. The releases can be found\n[here](https://github.com/ARM-software/CMSIS-NN/releases).\n\n## Current Operator Support\nIn general, optimizations are written for an architecture feature, which falls into one of the following categories.\nThe right implementation is picked based on the processor or architecture feature flags provided to the compiler.\n### Pure C\nThere is always a pure C implementation for an operator. This is used for processors like Arm Cortex-M0 or Cortex-M3.\n### DSP Extension\nProcessors with the DSP extension use Single Instruction Multiple Data (SIMD) instructions for optimization. Examples are\nCortex-M4 or a Cortex-M33 configured with the optional DSP extension.\n\n### MVE Extension\nProcessors with Arm Helium Technology use the Arm M-profile Vector Extension (MVE) instructions for optimization.\nExamples are Cortex-M55 or Cortex-M85 configured with MVE.\n\n| Operator        | C \u003cbr\u003e int8 | C\u003cbr\u003eint16 | C\u003cbr\u003eint4* | DSP\u003cbr\u003eint8 | DSP\u003cbr\u003eint16 | DSP\u003cbr\u003eint4* | MVE\u003cbr\u003eint8 | MVE\u003cbr\u003eint16 | MVE\u003cbr\u003eint4* |\n| --------------- | ----------- | ---------- |------------|-------------| -------------|--------------|-------------| -------------|--------------|\n| Conv2D          | Yes         | Yes        | Yes        | Yes         | Yes          | Yes          | Yes         | Yes          | Yes          |\n| DepthwiseConv2D | Yes         | Yes        | Yes        | Yes         | Yes          | Yes          | Yes         | Yes          | Yes          |\n| TransposeConv2D | Yes         | No         | No         | Yes         | No           | No           | Yes         | No           | No           |\n| Fully Connected | Yes         | Yes        | Yes        | Yes         | Yes          | Yes          | Yes         | Yes          | Yes          |\n| Batch Matmul    | Yes         | Yes        | No         | Yes         | Yes          | No           | Yes   
      | Yes          | No           |\n| Add             | Yes         | Yes        | N/A        | Yes         | Yes          | N/A          | Yes         | Yes          | N/A          |\n| Minimum         | Yes         | No         | N/A        | No          | No           | N/A          | Yes         | No           | N/A          |\n| Maximum         | Yes         | No         | N/A        | No          | No           | N/A          | Yes         | No           | N/A          |\n| Mul             | Yes         | Yes        | N/A        | Yes         | Yes          | N/A          | Yes         | Yes          | N/A          |\n| MaxPooling      | Yes         | Yes        | N/A        | Yes         | Yes          | N/A          | Yes         | Yes          | N/A          |\n| AvgPooling      | Yes         | Yes        | N/A        | Yes         | Yes          | N/A          | Yes         | Yes          | N/A          |\n| Softmax         | Yes         | Yes        | N/A        | Yes         | Yes          | N/A          | Yes         | No           | N/A          |\n| LSTM            | Yes         | Yes        | No         | Yes         | Yes          | No           | Yes         | Yes          | No           |\n| SVDF            | Yes         | No         | No         | Yes         | No           | No           | Yes         | No           | No           |\n| Pad             | Yes         | No         | N/A        | No          | No           | N/A          | Yes         | No           | N/A          |\n| Transpose       | Yes         | No         | N/A        | No          | No           | N/A          | Yes         | No           | N/A          |\n\n* int4 weights + int8 activations\n\n## Contribution Guideline\nFirst, thank you for contributing. Here are some guidelines and good-to-know information to get started.\n\n### Coding Guideline\nBy default, follow the style used in the file. You'll soon start noticing a pattern:\n* Variable and function names are lower case with an underscore separator.\n* Hungarian notation is not used. Well, almost.\n* If the variable names don't convey the action, then add comments.\n\n### New Files\nMost places follow a one-function-per-file convention. In those cases, the file name must match the function name. Connect\nthe function to an appropriate Doxygen group as well.\n\n### Doxygen\nFunction prototypes must have a detailed comment header in Doxygen format. You can execute the Doxygen documentation generation\nscript in the Documentation/Doxygen folder to check that no errors are introduced.\n\n### Unit Tests\nFor any new features and bug fixes, new unit tests are needed. Improvements have to be verified by unit tests. If you do\nnot have the means to execute the tests, you can still make the PR and comment that you need help in completing/executing\nthe unit tests.\n\n### Version \u0026 Date\nEach file has a version number and a date field that must be updated when making any change to that file. The versioning\nfollows the Semantic Versioning 2.0.0 format. For details, see https://semver.org/\n\n## Building CMSIS-NN as a library\nIt is recommended to use the toolchain files from the [Arm Ethos-U Core Platform](https://review.mlplatform.org/admin/repos/ml/ethos-u/ethos-u-core-platform) project. These support TARGET_CPU, which is a required argument. Note that if TARGET_CPU is not specified, these toolchains will set some default. The format must be TARGET_CPU=cortex-mXX; see the examples below.\n\nHere is an example:\n\n```\ncd \u003c/path/to/CMSIS_NN\u003e\nmkdir build\ncd build\ncmake .. -DCMAKE_TOOLCHAIN_FILE=\u003c/path/to/ethos-u-core-platform\u003e/cmake/toolchain/arm-none-eabi-gcc.cmake -DTARGET_CPU=cortex-m55\nmake\n```\n\nSome more examples:\n\n```\ncmake .. -DCMAKE_TOOLCHAIN_FILE=\u003c/path/to/ethos-u-core-platform\u003e/cmake/toolchain/armclang.cmake -DTARGET_CPU=cortex-m55\ncmake .. -DCMAKE_TOOLCHAIN_FILE=\u003c/path/to/ethos-u-core-platform\u003e/cmake/toolchain/arm-none-eabi-gcc.cmake -DTARGET_CPU=cortex-m7\ncmake .. -DCMAKE_TOOLCHAIN_FILE=\u003c/path/to/ethos-u-core-platform\u003e/cmake/toolchain/armclang.cmake -DTARGET_CPU=cortex-m3\n```\n\n### Compiler Options\nThe default optimization level is -Ofast. This can be overridden on the CMake command line by using \u003cnobr\u003e*\"-DCMSIS_OPTIMIZATION_LEVEL\"*\u003c/nobr\u003e. Please change according to project needs.\nJust bear in mind this can impact performance. With only optimization level -O0, *ARM_MATH_AUTOVECTORIZE* needs to be defined for processors with Helium\nTechnology.\n\nThe compiler option *'-fomit-frame-pointer'* is enabled by default at -O and higher. When no optimization level is specified,\nyou may need to specify '-fomit-frame-pointer'.\n\nWith the compiler option *'-fno-builtin'*, optimized implementations of e.g. memcpy and memset, which are heavily used by CMSIS-NN, are not utilized. This can significantly degrade performance, so the option should be avoided. The compiler option *'-ffreestanding'* should also be avoided as it enables '-fno-builtin' implicitly.\n\nFor processors with the DSP extension, int4 and int8 convolutions make use of the restrict keyword for the output pointer. This can allow the compiler to make optimizations, but the actual performance result depends on the Arm(R) Cortex(R)-M processor, the compiler and the model. This optimization can be enabled by providing the compiler with a definition of `OPTIONAL_RESTRICT_KEYWORD=__restrict`. In general, Arm Cortex-M7 will benefit from this. Arm Cortex-M4 and Cortex-M33, by contrast, will generally not benefit from it, but it may still bring an uplift depending on the model and compiler. It is recommended to enable this for Cortex-M7.\n\nFurther compile-time options:\n\n| Name | Explanation | Affects headers(*) |\n|------|-----|-----|\n| CMSIS_NN_USE_SINGLE_ROUNDING | Use single instead of double rounding in requantization. This may affect the output. | Yes |\n| CMSIS_NN_USE_REQUANTIZE_INLINE_ASSEMBLY | Use inline assembly for `arm_nn_requantize`. This code branch is faster on Cortex-M4, but slower on others. Results should be bit-identical, but the option has been observed to cause differences with Arm Compiler and Cortex-M7. | Yes |\n\n(*) If you enable an option that affects headers, also enable the equivalent option in TFL/TFLM.\n\n\n### Supported Compilers\n* CMSIS-NN is tested on Arm Compiler 6 and on Arm GNU Toolchain.\n* The IAR compiler is not tested, and there can be compilation and/or performance issues.\n* Compilation for host is not supported out of the box. It should be possible to use the C implementation and compile for host with minor stubbing effort.\n\n## Inclusive Language\nThis product conforms to Arm’s inclusive language policy and, to the best of our knowledge, does not contain any non-inclusive language. If you find something that concerns you, email terms@arm.com.\n\n## Support / Contact\n\nFor any questions or to reach the CMSIS-NN team, please create a new issue in https://github.com/ARM-software/CMSIS-NN/issues\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Farm-software%2Fcmsis-nn","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Farm-software%2Fcmsis-nn","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Farm-software%2Fcmsis-nn/lists"}