{"id":18417585,"url":"https://github.com/xiaomi/mobile-ai-bench","last_synced_at":"2025-04-05T20:08:04.339Z","repository":{"id":145786508,"uuid":"141548224","full_name":"XiaoMi/mobile-ai-bench","owner":"XiaoMi","description":"Benchmarking Neural Network Inference on Mobile Devices","archived":false,"fork":false,"pushed_at":"2023-04-10T13:50:29.000Z","size":546,"stargazers_count":369,"open_issues_count":8,"forks_count":57,"subscribers_count":29,"default_branch":"master","last_synced_at":"2025-03-29T19:04:50.725Z","etag":null,"topics":["benchmarking","deep-learning"],"latest_commit_sha":null,"homepage":"","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/XiaoMi.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2018-07-19T08:26:51.000Z","updated_at":"2025-03-26T00:03:48.000Z","dependencies_parsed_at":null,"dependency_job_id":"b43c3e46-a787-41f0-bcc8-e614d838a056","html_url":"https://github.com/XiaoMi/mobile-ai-bench","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/XiaoMi%2Fmobile-ai-bench","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/XiaoMi%2Fmobile-ai-bench/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/XiaoMi%2Fmobile-ai-bench/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/XiaoMi%2Fmobile-ai-bench/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/XiaoMi","download_url":"https://codeload.github.com/XiaoMi/mobile-ai-bench/tar.gz/refs/heads/
master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247393570,"owners_count":20931812,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["benchmarking","deep-learning"],"created_at":"2024-11-06T04:10:07.456Z","updated_at":"2025-04-05T20:08:04.321Z","avatar_url":"https://github.com/XiaoMi.png","language":"C++","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cdiv align=\"center\"\u003e\n\u003cimg src=\"logo.png\" width=\"400\" alt=\"Mobile AI Bench\" /\u003e\n\u003c/div\u003e\n\n[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](LICENSE)\n[![pipeline status](https://gitlab.com/llhe/mobile-ai-bench/badges/master/pipeline.svg)](https://gitlab.com/llhe/mobile-ai-bench/pipelines)\n\n[FAQ](#FAQ) |\n[中文](README_zh.md)\n\nIn recent years, on-device deep learning applications have become increasingly\npopular on mobile phones and IoT devices. Deploying deep learning models in mobile\napplications or on IoT devices remains a challenging task for developers.\n\nThey need to choose a cost-effective hardware solution (i.e. chips and boards),\nthen a suitable inference framework, optionally apply quantization or compression\ntechniques with the precision-performance trade-off in mind, and finally\nrun the model on one or more heterogeneous computing devices. Making an\nappropriate decision among these choices is tedious and time-consuming.\n\n**Mobile AI Benchmark** (i.e. 
**MobileAIBench**) is an end-to-end benchmark tool\nthat covers different chips and inference frameworks. Its results\ninclude both speed and model accuracy, giving developers insight into these choices.\n\n## Daily Benchmark Results\nPlease check the *benchmark* step on the [daily CI pipeline page](https://gitlab.com/llhe/mobile-ai-bench/pipelines). Due to the lack of test devices, the CI results may not cover all hardware and frameworks.\n\n## FAQ\n**Q: Why are benchmark results not stable on my device?**\n\n**A**: To save power, some SoCs use aggressive\npower control scheduling, which makes performance\nquite unstable (especially on the CPU). Benchmark results depend heavily on\nthe state of the device, e.g., running processes, temperature, and power control policy.\nIt is recommended to disable the power control policy (as shown in `tools/power.sh`) if possible (e.g., on a rooted phone).\nOtherwise, keep your device idle and cool, and benchmark one model on one framework at a time.\n\n**Q: Why do some devices run faster (or slower) than expected in the CI benchmark result?**\n\n**A**: Some devices are rooted and have specialized performance tuning applied, while\nothers are not rooted, so the tuning could not be applied (see the code for more details).\n\n**Q: Why is ncnn initialization time much less than others?**\n\n**A**: The ncnn benchmark uses fake model parameters and skips loading weights from the filesystem.\n\n**Q: Does the benchmark use all available cores of a device?**\n\n**A**: Most modern Android phones use the [ARM big.LITTLE](https://en.wikipedia.org/wiki/ARM_big.LITTLE) architecture, which can lead to significant variance between different runs of the benchmark. To reduce this variance, the MACE/NCNN/TFLITE benchmarks are pinned to the available big cores with the `taskset` command.\nSNPE, however, has no well-defined API for binding to big cores or setting the thread count.\nThread count can be set by adding `--num_threads` to the 
`tools/benchmark.sh` command.\n\n\n## Environment requirement\n\nMobileAIBench supports several deep learning frameworks (called `executor` in this project, i.e., [MACE](https://github.com/XiaoMi/mace), [SNPE](https://developer.qualcomm.com/software/qualcomm-neural-processing-sdk), [ncnn](https://github.com/Tencent/ncnn), [TensorFlow Lite](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/lite) and [HIAI](https://developer.huawei.com/consumer/en/devservice/doc/2020314)) currently, which may require the following dependencies:\n\n| Software  | Installation command  | Tested version  |\n| :-------: | :-------------------: | :-------------: |\n| Python  |   | 2.7  |\n| ADB  | apt-get install android-tools-adb  | Required by Android run, \u003e= 1.0.32  |\n| Android NDK  | [NDK installation guide](https://developer.android.com/ndk/guides/setup#install) | Required by Android build, r15c |\n| Bazel  | [bazel installation guide](https://docs.bazel.build/versions/master/install.html)  | 0.13.0  |\n| CMake  | apt-get install cmake  | \u003e= 3.11.3  |\n| FileLock  | pip install -I filelock==3.0.0  | Required by Android run  |\n| PyYaml  | pip install -I pyyaml==3.12  | 3.12.0  |\n| sh  | pip install -I sh==1.12.14  | 1.12.14  |\n| SNPE (optional) | [download](https://developer.qualcomm.com/software/qualcomm-neural-processing-sdk) and uncompress  | 1.18.0  |\n\n**Note 1:** [SNPE](https://developer.qualcomm.com/software/qualcomm-neural-processing-sdk)\nhas strict license that disallows redistribution, so the default link in the\nBazel `WORKSPACE` file is only accessible by the CI server. To benchmark SNPE\nin your local system (i.e. 
set `--executors` to `all` or to `SNPE` explicitly),\nyou need to download the SDK [here](https://developer.qualcomm.com/software/qualcomm-neural-processing-sdk),\nuncompress it, [copy libgnustl_shared.so](https://developer.qualcomm.com/docs/snpe/setup.html),\nand modify `WORKSPACE` as follows:\n```python\n#new_http_archive(\n#    name = \"snpe\",\n#    build_file = \"third_party/snpe/snpe.BUILD\",\n#    sha256 = \"8f2b92b236aa7492e4acd217a96259b0ddc1a656cbc3201c7d1c843e1f957e77\",\n#    strip_prefix = \"snpe-1.22.2.233\",\n#    urls = [\n#        \"https://cnbj1-fds.api.xiaomi.net/aibench/third_party/snpe-1.22.2_with_libgnustl_shared.so.zip\",\n#    ],\n#)\n\nnew_local_repository(\n    name = \"snpe\",\n    build_file = \"third_party/snpe/snpe.BUILD\",\n    path = \"/path/to/snpe\",\n)\n```\n\n**Note 2:** [HIAI](https://developer.huawei.com/consumer/en/devservice/doc/2020301)\nhas a strict license that disallows redistribution, so the default link in the\nBazel `WORKSPACE` file is only accessible by the CI server. To benchmark HIAI\non your local system (i.e. 
set `--executors` with `all` or `HIAI` explicitly),\nyou need to login and download the SDK [here](https://developer.huawei.com/consumer/en/devservice/doc/2020301),\nuncompress it and get the `HiAI_DDK_100.200.010.011.zip` file, uncompress it\n and modify `WORKSPACE` as the following:\n```python\n#new_http_archive(\n#    name = \"hiai\",\n#    build_file = \"third_party/hiai/hiai.BUILD\",\n#    sha256 = \"8da8305617573bc495df8f4509fcb1655ffb073d790d9c0b6ca32ba4a4e41055\",\n#    strip_prefix = \"HiAI_DDK_100.200.010.011\",\n#    type = \"zip\",\n#    urls = [\n#        \"http://cnbj1.fds.api.xiaomi.com/aibench/third_party/HiAI_DDK_100.200.010.011_LITE.zip\",\n#    ],\n#)\n\nnew_local_repository(\n    name = \"hiai\",\n    build_file = \"third_party/hiai/hiai.BUILD\",\n    path = \"/path/to/hiai\",\n)\n```\n\n## Architecture\n```\n+-----------------+         +------------------+      +---------------+\n|   Benchmark     |         |   BaseExecutor   | \u003c--- | MaceExecutor  |\n+-----------------+         +------------------+      +---------------+\n| - executor      |-------\u003e | - executor       |\n| - model_name    |         | - device_type    |      +---------------+\n| - quantize      |         |                  | \u003c--- | SnpeExecutor  |\n| - input_names   |         +------------------+      +---------------+\n| - input_shapes  |         | + Init()         |\n| - output_names  |         | + Prepare()      |      +---------------+\n| - output_shapes |         | + Run()          | \u003c--- | NcnnExecutor  |\n| - run_interval  |         | + Finish()       |      +---------------+\n| - num_threads   |         |                  |\n+-----------------+         |                  |      +---------------+\n| - Run()         |         |                  | \u003c--- | TfLiteExecutor|\n+-----------------+         |                  |      +---------------+\n        ^     ^             |                  |\n        |     |             |                  |      
+---------------+\n        |     |             |                  | \u003c--- | HiaiExecutor  |\n        |     |             +------------------+      +---------------+\n        |     |\n        |     |             +--------------------+\n        |     |             |PerformanceBenchmark|\n        |     --------------+--------------------+\n        |                   | - Run()            |\n        |                   +--------------------+\n        |\n        |                   +---------------+      +---------------------+                           \n+--------------------+ ---\u003e |PreProcessor   | \u003c--- |ImageNetPreProcessor |\n| PrecisionBenchmark |      +---------------+      +---------------------+\n+--------------------+\n| - pre_processor    |      +---------------+      +---------------------+\n| - post_processor   | ---\u003e |PostProcessor  | \u003c--- |ImageNetPostProcessor|\n| - metric_evaluator |      +---------------+      +---------------------+\n+--------------------+\n| - Run()            |      +---------------+\n+--------------------+ ---\u003e |MetricEvaluator|\n                            +---------------+\n```\n\n## How To Use\n\n### Benchmark Performance of all models on all executors\n\n```bash\nbash tools/benchmark.sh --benchmark_option=Performance \\\n                        --target_abis=armeabi-v7a,arm64-v8a,aarch64,armhf\n```\n\nThe whole benchmark may take a while, and continuous benchmarking may heat\nthe device very quickly, so you may want to narrow the run with the following\narguments. Only MACE supports the precision benchmark right now.\n\n| option         | type | default     | explanation |\n| :-----------:  | :--: | :----------:| ------------|\n| --benchmark_option | str | Performance | Benchmark options, Performance/Precision. |\n| --output_dir   | str  | output      | Benchmark output directory. |\n| --executors    | str  | all         | Executors(MACE/SNPE/NCNN/TFLITE/HIAI), comma separated list or all. 
|\n| --device_types | str  | all         | DeviceTypes(CPU/GPU/DSP/NPU), comma separated list or all. |\n| --target_abis  | str  | armeabi-v7a | Target ABIs(armeabi-v7a,arm64-v8a,aarch64,armhf), comma separated list. |\n| --model_names  | str  | all         | Model names(InceptionV3,MobileNetV1...), comma separated list or all. |\n| --run_interval | int  | 10          | Run interval between benchmarks, seconds. |\n| --num_threads  | int  | 4           | The number of threads. |\n| --input_dir    | str  | \"\"          | Input data directory for precision benchmark. |\n\n### Configure ssh devices\nFor embedded ARM-Linux devices whose abi is aarch64 or armhf, ssh connection is supported.\nConfigure ssh devices in `generic-mobile-devices/devices_for_ai_bench.yml`, for example:\n```yaml\ndevices:\n  nanopi:\n    target_abis: [aarch64, armhf]\n    target_socs: RK3333\n    models: Nanopi M4\n    address: 10.231.46.118\n    username: pi\n```\n\n### Adding a model to run on existing executor\n\n* Add the new model name in `aibench/proto/base.proto` if not in there.\n\n* Configure the model info in `aibench/proto/model.meta`.\n\n* Configure the benchmark info in `aibench/proto/benchmark.meta`.\n\n* Run benchmark\n\n    Performance benchmark.\n\n    ```bash\n    bash tools/benchmark.sh --benchmark_option=Performance \\\n                            --executors=MACE --device_types=CPU --model_names=MobileNetV1 \\\n                            --target_abis=armeabi-v7a,arm64-v8a,aarch64,armhf\n    ```\n\n    Precision benchmark. 
Only supports ImageNet images as inputs for benchmarking MACE precision.\n\n    ```bash\n    bash tools/benchmark.sh --benchmark_option=Precision --input_dir=/path/to/inputs \\\n                            --executors=MACE --device_types=CPU --model_names=MobileNetV1 \\\n                            --target_abis=armeabi-v7a,arm64-v8a,aarch64,armhf\n    ```\n* Check benchmark result\n\n    ```bash\n    python report/csv_to_html.py\n    ```\n\n  Open the corresponding link in a browser to see the report.\n\n\n### Adding a new AI executor\n\n* Define `executor` and implement the interfaces:\n\n    ```c++\n    class YourExecutor : public BaseExecutor {\n     public:\n      YourExecutor() :\n          BaseExecutor(executor_type, device_type, model_file, weight_file) {}\n      \n      // Init method should invoke the initializing process for your executor \n      // (e.g.  Mace needs to compile OpenCL kernel once per target). It will be\n      // called only once when creating executor engine.\n      virtual Status Init(int num_threads);\n\n      // Load model and prepare to run. It will be called only once before \n      // benchmarking the model.\n      virtual Status Prepare();\n      \n      // Run the model. It will be called more than once.\n      virtual Status Run(const std::map\u003cstd::string, BaseTensor\u003e \u0026inputs,\n                         std::map\u003cstd::string, BaseTensor\u003e *outputs);\n      \n      // Unload model and free the memory after benchmarking. 
It will be called\n      // only once.\n      virtual void Finish();\n    };\n    ```\n\n* Include your executor header in `aibench/benchmark/benchmark_main.cc`:\n\n    ```c++\n    #ifdef AIBENCH_ENABLE_YOUR_EXECUTOR\n    #include \"aibench/executors/your_executor/your_executor.h\"\n    #endif\n    ```\n    \n* Add dependencies to `third_party/your_executor`, `aibench/benchmark/BUILD` and `WORKSPACE`.\n    Add the macro `AIBENCH_ENABLE_YOUR_EXECUTOR` to the `model_benchmark` target in `aibench/benchmark/BUILD`.\n\n* Benchmark a model on existing executor\n\n    Refer to [Adding a model to run on existing executor](#adding-a-model-to-run-on-existing-executor).\n\n## License\n[Apache License 2.0](LICENSE).\n\n## Notice\nFor [third party](third_party) dependencies, please refer to their licenses.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fxiaomi%2Fmobile-ai-bench","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fxiaomi%2Fmobile-ai-bench","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fxiaomi%2Fmobile-ai-bench/lists"}