https://github.com/ARM-software/ComputeLibrary
  
  
    The Compute Library is a set of computer vision and machine learning functions optimised for both Arm CPUs and GPUs using SIMD technologies. 
aarch64 android arm armv7 armv8 computer-vision cpp linux machine-learning neon neural-network opencl simd sve
- Host: GitHub
- URL: https://github.com/ARM-software/ComputeLibrary
- Owner: ARM-software
- License: mit
- Created: 2017-03-10T14:51:43.000Z (over 8 years ago)
- Default Branch: main
- Last Pushed: 2024-08-28T12:51:06.000Z (about 1 year ago)
- Last Synced: 2024-08-28T14:15:02.967Z (about 1 year ago)
- Topics: aarch64, android, arm, armv7, armv8, computer-vision, cpp, linux, machine-learning, neon, neural-network, opencl, simd, sve
- Language: C++
- Homepage:
- Size: 830 MB
- Stars: 2,790
- Watchers: 232
- Forks: 771
- Open Issues: 21
- Metadata Files:
    - Readme: README.md
    - Contributing: CONTRIBUTING.md
    - License: LICENSE
    - Security: SECURITY.md
    - Support: support/AclRequires.h
 
Awesome Lists containing this project
- awesome-simd - ComputeLibrary - C++: Library for Computer Vision and Machine Learning (ARM only) (Image processing)
- awesome-edge-machine-learning - https://github.com/ARM-software/ComputeLibrary
- awesome-list - Compute Library - A set of computer vision and machine learning functions optimised for both Arm CPUs and GPUs using SIMD technologies. (Deep Learning Framework / Deployment & Distribution)
- awesome-gemm - ARM Compute Library: Optimized for ARM platforms - 2.0/MIT) (Libraries 🗂️ / Cross-Platform Libraries 🌍)
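- Awesome-Embeded-AI - ARM Compute Library - A set of optimized basic neural network operators. You can use the ARM Compute Library API to build neural network algorithms directly yourself, or use ARM NN, introduced here, to convert and run models trained with mainstream neural network frameworks. ARM NN is a key ARM project under the "[AI and Machine Learning](https://developer.arm.com/solutions/machine-learning-on-arm)" theme for embedded Cortex-A devices. (Microprocessor MPU side / ARM NN)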
README
          
> **⚠ Deprecation Notice**
> 24.01 announcement: NCHW data format specific optimizations will gradually be removed from the code base in
> future releases. The implication of this is that the user is expected to translate NCHW models into NHWC in
> order to benefit from the optimizations.
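
To illustrate what translating from NCHW to NHWC means at the data level, here is a minimal sketch in plain C++. It assumes a dense, contiguous `float` buffer with no padding. The helper name `nchw_to_nhwc` is hypothetical and is not part of the Compute Library API. A real model conversion also has to permute weights and update layer metadata, which this sketch does not cover.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (NOT a Compute Library function): copies a dense NCHW
// buffer into a newly allocated dense NHWC buffer.
//   n = batch size, c = channels, h = height, w = width
std::vector<float> nchw_to_nhwc(const std::vector<float>& src,
                                std::size_t n, std::size_t c,
                                std::size_t h, std::size_t w)
{
    // The input is assumed to hold exactly n*c*h*w elements.
    assert(src.size() == n * c * h * w);

    std::vector<float> dst(src.size());
    for (std::size_t in = 0; in < n; ++in)
        for (std::size_t ic = 0; ic < c; ++ic)
            for (std::size_t ih = 0; ih < h; ++ih)
                for (std::size_t iw = 0; iw < w; ++iw)
                {
                    // NCHW: width is the fastest-varying dimension.
                    const std::size_t src_idx = ((in * c + ic) * h + ih) * w + iw;
                    // NHWC: channel is the fastest-varying dimension.
                    const std::size_t dst_idx = ((in * h + ih) * w + iw) * c + ic;
                    dst[dst_idx] = src[src_idx];
                }
    return dst;
}
```

In NHWC, all channel values for one pixel are adjacent in memory, whereas in NCHW each channel is stored as a separate plane.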
 
# Compute Library 
The Compute Library is a collection of low-level machine learning functions optimized for Arm® Cortex®-A and Arm® Neoverse™ CPU architectures and Arm® Mali™ GPU architectures.
The library provides superior performance to other open source alternatives and immediate support for new Arm® technologies such as SVE2.
Key Features:
- Open source software available under a permissive MIT license
- Over 100 machine learning functions for CPU and GPU
- Multiple convolution algorithms (GeMM, Winograd, FFT, Direct and indirect-GeMM)
- Support for multiple data types: FP32, FP16, INT8, UINT8, BFLOAT16
- Micro-architecture optimization for key ML primitives
- Highly configurable build options enabling lightweight binaries
- Advanced optimization techniques such as kernel fusion, fast math enablement and texture utilization
- Device and workload specific tuning using OpenCL tuner and GeMM optimized heuristics
| Repository  | Link                                                             |
| ----------- | ---------------------------------------------------------------- |
| Release     | https://github.com/arm-software/ComputeLibrary                   |
| Development | https://review.mlplatform.org/#/admin/projects/ml/ComputeLibrary |
## Documentation
[Documentation (v25.02.1)](https://artificial-intelligence.sites.arm.com/computelibrary/v25.02.1/index.xhtml)
> Note: The documentation includes the reference API, changelogs, build guide, contribution guide, errata, etc.
## Pre-built binaries
All the binaries can be downloaded from [here](https://github.com/ARM-software/ComputeLibrary/releases) or from the tables below.
| Platform       | Operating System | Release archive (Download) |
| -------------- | ---------------- | -------------------------- |
| Raspberry Pi 4 | Linux® 32-bit    | [linux-armv7a-cpu-bin](https://github.com/ARM-software/ComputeLibrary/releases/download/v25.02.1/arm_compute-v25.02.1-linux-armv7a-cpu-bin.tar.gz) |
| Raspberry Pi 4 | Linux® 64-bit    | [linux-aarch64-cpu-bin](https://github.com/ARM-software/ComputeLibrary/releases/download/v25.02.1/arm_compute-v25.02.1-linux-aarch64-cpu-bin.tar.gz) |
| Odroid N2      | Linux® 64-bit    | [linux-aarch64-cpu-bin](https://github.com/ARM-software/ComputeLibrary/releases/download/v25.02.1/arm_compute-v25.02.1-linux-aarch64-cpu-bin.tar.gz) [linux-aarch64-cpu-gpu-bin](https://github.com/ARM-software/ComputeLibrary/releases/download/v25.02.1/arm_compute-v25.02.1-linux-aarch64-cpu-gpu-bin.tar.gz) |
| HiKey960       | Linux® 64-bit    | [linux-aarch64-cpu-bin](https://github.com/ARM-software/ComputeLibrary/releases/download/v25.02.1/arm_compute-v25.02.1-linux-aarch64-cpu-bin.tar.gz) [linux-aarch64-cpu-gpu-bin](https://github.com/ARM-software/ComputeLibrary/releases/download/v25.02.1/arm_compute-v25.02.1-linux-aarch64-cpu-gpu-bin.tar.gz) |

| Architecture | Operating System | Release archive (Download) |
| ------------ | ---------------- | -------------------------- |
| armv7        | Linux®           | [linux-armv7a-cpu-bin](https://github.com/ARM-software/ComputeLibrary/releases/download/v25.02.1/arm_compute-v25.02.1-linux-armv7a-cpu-bin.tar.gz) [linux-armv7a-cpu-gpu-bin](https://github.com/ARM-software/ComputeLibrary/releases/download/v25.02.1/arm_compute-v25.02.1-linux-armv7a-cpu-gpu-bin.tar.gz) |
| arm64-v8a    | Android™         | [android-aarch64-cpu-bin](https://github.com/ARM-software/ComputeLibrary/releases/download/v25.02.1/arm_compute-v25.02.1-android-aarch64-cpu-bin.tar.gz) [android-aarch64-cpu-gpu-bin](https://github.com/ARM-software/ComputeLibrary/releases/download/v25.02.1/arm_compute-v25.02.1-android-aarch64-cpu-gpu-bin.tar.gz) |
| arm64-v8a    | Linux®           | [linux-aarch64-cpu-bin](https://github.com/ARM-software/ComputeLibrary/releases/download/v25.02.1/arm_compute-v25.02.1-linux-aarch64-cpu-bin.tar.gz) [linux-aarch64-cpu-gpu-bin](https://github.com/ARM-software/ComputeLibrary/releases/download/v25.02.1/arm_compute-v25.02.1-linux-aarch64-cpu-gpu-bin.tar.gz) |
For more pre-built binaries, see the [v25.02.1 release page](https://github.com/ARM-software/ComputeLibrary/releases/tag/v25.02.1).
Pre-built binaries are generated with the following flags, which relate to security and good coding practices:
> -Wall, -Wextra, -Wformat=2, -Winit-self, -Wstrict-overflow=2, -Wswitch-default, -Woverloaded-virtual, -Wformat-security, -Wctor-dtor-privacy, -Wsign-promo, -Weffc++, -pedantic, -fstack-protector-strong
## Supported Architectures/Technologies
- Arm® CPUs:
    - Arm® Cortex®-A processor family using Arm® Neon™ technology
    - Arm® Neoverse™ processor family
    - Arm® Cortex®-R processor family with Armv8-R AArch64 architecture using Arm® Neon™ technology
    - Arm® Cortex®-X1 processor using Arm® Neon™ technology
- Arm® Mali™ GPUs:
    - Arm® Mali™-G processor family
    - Arm® Mali™-T processor family
- x86
## Supported Systems
- Android™
- Bare Metal
- Linux®
- OpenBSD®
- macOS®
- Tizen™
## Resources
- [Tutorial: Running AlexNet on Raspberry Pi with Compute Library](https://community.arm.com/processors/b/blog/posts/running-alexnet-on-raspberry-pi-with-compute-library)
- [Gian Marco's talk on Performance Analysis for Optimizing Embedded Deep Learning Inference Software](https://www.embedded-vision.com/platinum-members/arm/embedded-vision-training/videos/pages/may-2019-embedded-vision-summit)
- [Gian Marco's talk on optimizing CNNs with Winograd algorithms at the EVS](https://www.embedded-vision.com/platinum-members/arm/embedded-vision-training/videos/pages/may-2018-embedded-vision-summit-iodice)
- [Gian Marco's talk on using SGEMM and FFTs to Accelerate Deep Learning](https://www.embedded-vision.com/platinum-members/arm/embedded-vision-training/videos/pages/may-2016-embedded-vision-summit-iodice)
## Experimental builds
**⚠ Important** Bazel and CMake builds are experimental, CPU-only builds. Please see the [documentation](https://artificial-intelligence.sites.arm.com/computelibrary/v25.02.1/how_to_build.xhtml) for more details.
## How to contribute
Contributions to the Compute Library are more than welcome. If you are interested in contributing, please have a look at our [how to contribute guidelines](https://artificial-intelligence.sites.arm.com/computelibrary/v25.02.1/contribution_guidelines.xhtml).
### Developer Certificate of Origin (DCO)
Before the Compute Library accepts your contribution, you need to certify its origin and give us your permission. To manage this process we use the Developer Certificate of Origin (DCO) V1.1 (https://developercertificate.org/).
To indicate that you agree to the terms of the DCO, you "sign off" your contribution by adding a line with your name and e-mail address to every git commit message:
```Signed-off-by: John Doe <john.doe@example.org>```
You must use your real name; pseudonyms and anonymous contributions are not accepted.
### Public mailing list
For technical discussion, the ComputeLibrary project has a public mailing list: acl-dev@lists.linaro.org
The list is open to anyone inside or outside of Arm to self-subscribe. To subscribe, please visit the following website:
https://lists.linaro.org/mailman3/lists/acl-dev.lists.linaro.org/
## License and Contributions
The software is provided under the MIT license. Contributions to this project are accepted under the same license.
### Other Projects
This project contains code from other projects as listed below. The original license text is included in those source files.
* The OpenCL header library is licensed under Apache License, Version 2.0, which is a permissive license compatible with MIT license.
* The half library is licensed under MIT license.
* The libnpy library is licensed under MIT license.
* The stb image library is either licensed under MIT license or is in Public Domain. It is used by this project under the terms of MIT license.
* The KleidiAI library is licensed under Apache License, Version 2.0.
* The GoogleTest library is used by KleidiAI and is licensed under BSD-3-Clause license.
* The Benchmark library is used by KleidiAI and is licensed under Apache License, Version 2.0.
## Trademarks and Copyrights
Android is a trademark of Google LLC.
Arm, Cortex, Mali and Neon are registered trademarks or trademarks of Arm Limited (or its subsidiaries) in the US and/or elsewhere.
Bazel is a trademark of Google LLC., registered in the U.S. and other
countries.
CMake is a trademark of Kitware, Inc., registered in the U.S. and other
countries.
Linux® is the registered trademark of Linus Torvalds in the U.S. and other countries.
Mac and macOS are trademarks of Apple Inc., registered in the U.S. and other
countries.
Tizen is a registered trademark of The Linux Foundation.
Windows® is a trademark of the Microsoft group of companies.