{"id":20115397,"url":"https://github.com/amperecomputingai/ampere_model_library","last_synced_at":"2025-05-06T13:32:43.544Z","repository":{"id":37447633,"uuid":"351079358","full_name":"AmpereComputingAI/ampere_model_library","owner":"AmpereComputingAI","description":"Ampere Model Library - AML","archived":false,"fork":false,"pushed_at":"2023-12-18T19:51:57.000Z","size":1591,"stargazers_count":11,"open_issues_count":16,"forks_count":3,"subscribers_count":7,"default_branch":"main","last_synced_at":"2023-12-19T16:50:23.940Z","etag":null,"topics":["aarch64","ampere","arm64","armv8-a","artificial-intelligence","computer-vision","inference","machine-learning","mlperf-inference","model-zoo","natural-language-processing","onnxruntime","pytorch","tensorflow"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/AmpereComputingAI.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2021-03-24T12:54:53.000Z","updated_at":"2024-01-04T16:33:39.933Z","dependencies_parsed_at":"2023-02-16T23:30:52.379Z","dependency_job_id":"46910c5e-6d05-4f3d-a5d5-357119af3c80","html_url":"https://github.com/AmpereComputingAI/ampere_model_library","commit_stats":null,"previous_names":[],"tags_count":0,"template":null,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AmpereComputingAI%2Fampere_model_library","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AmpereComputingAI%2Fampere_model_library/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Ampe
reComputingAI%2Fampere_model_library/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AmpereComputingAI%2Fampere_model_library/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/AmpereComputingAI","download_url":"https://codeload.github.com/AmpereComputingAI/ampere_model_library/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":252693714,"owners_count":21789747,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["aarch64","ampere","arm64","armv8-a","artificial-intelligence","computer-vision","inference","machine-learning","mlperf-inference","model-zoo","natural-language-processing","onnxruntime","pytorch","tensorflow"],"created_at":"2024-11-13T18:35:04.907Z","updated_at":"2025-05-06T13:32:43.538Z","avatar_url":"https://github.com/AmpereComputingAI.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"![Ampere AI](https://ampereaimodelzoo.s3.eu-central-1.amazonaws.com/ampere_logo_®_primary_stacked_rgb.png \"Ampere AI\")\n# Ampere Model Library\n![CI tests](https://github.com/AmpereComputingAI/ampere_model_library/actions/workflows/test.yml/badge.svg)\n![PyTorch pull count](https://img.shields.io/docker/pulls/amperecomputingai/pytorch?logo=pytorch\u0026label=PyTorch\u0026labelColor=%23ffc9bb\u0026color=%23ffa590\u0026link=https%3A%2F%2Fhub.docker.com%2Fr%2Famperecomputingai%2Fpytorch)\n![TF pull 
count](https://img.shields.io/docker/pulls/amperecomputingai/tensorflow?logo=tensorflow\u0026label=TensorFlow\u0026labelColor=%23e6cc00\u0026color=%23e69b00\u0026link=https%3A%2F%2Fhub.docker.com%2Fr%2Famperecomputingai%2Ftensorflow)\n![ORT pull count](https://img.shields.io/docker/pulls/amperecomputingai/onnxruntime?logo=onnx\u0026logoColor=black\u0026label=ONNXRT\u0026labelColor=%23e5e5e5\u0026color=%23cccccc\u0026link=https%3A%2F%2Fhub.docker.com%2Fr%2Famperecomputingai%2Fonnxruntime)\n![llama.cpp pull count](https://img.shields.io/docker/pulls/amperecomputingai/llama.cpp?logo=meta\u0026logoColor=black\u0026label=llama.cpp\u0026labelColor=violet\u0026color=purple)\n\nAML's goal is to make benchmarking of various AI architectures on Ampere CPUs a pleasurable experience :)\n\nThis means we want the library to be quick to set up and to get you the numbers you are interested in. On top of that, we want the code to be readable and well structured so that it's easy to inspect exactly what is being measured. If you feel like we are not exactly there, please let us know right away by raising an [issue](https://github.com/AmpereComputingAI/ampere_model_library/issues/new/choose)! 
Thank you :)\n## AML setup\n![Ampere AI solutions](https://uawartifacts.blob.core.windows.net/upload-files/ai_infographic_cloud_47da3198d8.jpg \"Ampere AI solutions\")\nVisit [our Docker Hub](https://hub.docker.com/u/amperecomputingai) for our selection of frameworks.\n\n\n```bash\nsudo apt update \u0026\u0026 sudo apt install -y docker.io\nsudo docker run --privileged=true -it amperecomputingai/pytorch:latest\n# we also offer onnxruntime and tensorflow\n```\nYou should see terminal output similar to this:\n\n![Ampere docker welcome prompt](https://ampereaimodelzoo.s3.eu-central-1.amazonaws.com/Screenshot+2024-02-16+at+20.16.37.png \"Ampere docker welcome prompt\")\n\nNow, inside the Docker container, run:\n\n```bash\ngit clone --recursive https://github.com/AmpereComputingAI/ampere_model_library.git\ncd ampere_model_library\nbash setup_deb.sh\nsource set_env_variables.sh\n```\n\nYou are good to go! 👌\n\n\n## Examples\n\n### The go-to solution is benchmark.py script\nThe benchmark script lets you quickly evaluate the performance of your Ampere system using the following models:\n- ResNet-50 v1.5\n- Whisper medium EN\n- DLRM\n- BERT large\n- YOLO v8s\n\nIt's incredibly user-friendly and designed to assist you in getting the best out of your system.\n\n**After completing setup with Ampere Optimized PyTorch (see [AML setup](#aml-setup)), it's as easy as:**\n```bash\npython3 benchmark.py --no-interactive  # remove --no-interactive if you want a quick estimation of performance\n```\n\n![Evaluation results](https://ampereaimodelzoo.s3.eu-central-1.amazonaws.com/Screenshot+2024-03-01+at+19.53.08.png \"Evaluation results\")\n\n### Running particular AI architectures\n\nArchitectures are categorized based on the task they were originally envisioned for. 
Therefore, you will find ResNet and VGG under computer_vision and BERT under natural_language_processing.\nThe usual workflow is to first set up AML (see [AML setup](#aml-setup)), source the environment variables by running ```source set_env_variables.sh```, and then run run.py or a similarly named Python file in the directory of the architecture you want to benchmark. Some models require additional setup steps first; these should be described in the README.md file of the respective directory.\n\n### ResNet-50 v1.5\n![ResNet-50 architecture](https://miro.medium.com/v2/resize:fit:720/format:webp/0*tH9evuOFqk8F41FG.png \"ResNet-50 architecture\")\n\nNote that the example uses PyTorch - we recommend using Ampere Optimized PyTorch for best results (see [AML setup](#aml-setup)).\n```bash\nsource set_env_variables.sh\nIGNORE_DATASET_LIMITS=1 AIO_IMPLICIT_FP16_TRANSFORM_FILTER=\".*\" AIO_NUM_THREADS=32 python3 computer_vision/classification/resnet_50_v15/run.py -m resnet50 -p fp32 -b 16 -f pytorch\n```\nThe command above will run the model utilizing 32 threads, with batch size of 16. 
Implicit conversion to FP16 datatype will be applied - you can default to fp32 precision by not setting the **AIO_IMPLICIT_FP16_TRANSFORM_FILTER** variable.\n\n**PSA: you can adjust the level of AIO debug messages by setting AIO_DEBUG_MODE to values in range from 0 to 4 (where 0 is the most peaceful)**\n\n### Whisper tiny EN\n![Whisper architecture](https://raw.githubusercontent.com/openai/whisper/main/approach.png \"Whisper architecture\")\n\nNote that the example uses PyTorch - we recommend using Ampere Optimized PyTorch for best results (see [AML setup](#aml-setup)).\n```bash\nsource set_env_variables.sh\nAIO_IMPLICIT_FP16_TRANSFORM_FILTER=\".*\" AIO_NUM_THREADS=32 python3 speech_recognition/whisper/run.py -m tiny.en\n```\nThe command above will run the model utilizing 32 threads, implicit conversion to FP16 datatype will be applied - you can default to fp32 precision by not setting the **AIO_IMPLICIT_FP16_TRANSFORM_FILTER** variable.\n\n### LLaMA2 7B\n![Transformer vs LLaMA](https://miro.medium.com/v2/resize:fit:1400/1*g9cykAlrYrNkG-rVTIKQ2Q.png \"https://www.youtube.com/shorts/A6LOVMymJhs\")\n\nNote that the example uses PyTorch - we recommend using Ampere Optimized PyTorch for best results (see [AML setup](#aml-setup)).\n\n**Before running this example you need to be granted access by Meta to LLaMA2 model. 
Go here: [Meta](https://ai.meta.com/resources/models-and-libraries/llama-downloads) and here: [HF](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf) to learn more.**\n```bash\nsource set_env_variables.sh\nwget https://github.com/tloen/alpaca-lora/raw/main/alpaca_data.json\nAIO_IMPLICIT_FP16_TRANSFORM_FILTER=\".*\" AIO_NUM_THREADS=32 python3 natural_language_processing/text_generation/llama2/run.py -m meta-llama/Llama-2-7b-chat-hf --dataset_path=alpaca_data.json\n```\nThe command above will run the model utilizing 32 threads, implicit conversion to FP16 datatype will be applied - you can default to fp32 precision by not setting the **AIO_IMPLICIT_FP16_TRANSFORM_FILTER** variable.\n\n### YOLO v8 large\n![YOLO object detection](https://miro.medium.com/v2/resize:fit:1358/1*r_3a2KsqTznF4Pt-MnF00Q.jpeg \"YOLO object detection\")\n\nNote that the example uses PyTorch - we recommend using Ampere Optimized PyTorch for best results (see [AML setup](#aml-setup)).\n```bash\nsource set_env_variables.sh\nwget https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8l.pt\nAIO_IMPLICIT_FP16_TRANSFORM_FILTER=\".*\" AIO_NUM_THREADS=32 python3 computer_vision/object_detection/yolo_v8/run.py -m yolov8l.pt -p fp32 -f pytorch\n```\nThe command above will run the model utilizing 32 threads, implicit conversion to FP16 datatype will be applied - you can default to fp32 precision by not setting the **AIO_IMPLICIT_FP16_TRANSFORM_FILTER** variable.\n\n### BERT large\n![BERT embeddings](https://miro.medium.com/v2/resize:fit:1400/0*m_kXt3uqZH9e7H4w.png \"BERT embeddings\")\n\nNote that the example uses PyTorch - we recommend using Ampere Optimized PyTorch for best results (see [AML setup](#aml-setup)).\n```bash\nsource set_env_variables.sh\nwget -O bert_large_mlperf.pt https://zenodo.org/records/3733896/files/model.pytorch?download=1\nAIO_IMPLICIT_FP16_TRANSFORM_FILTER=\".*\" AIO_NUM_THREADS=32 python3 
natural_language_processing/extractive_question_answering/bert_large/run_mlperf.py -m bert_large_mlperf.pt -p fp32 -f pytorch\n```\nThe command above will run the model utilizing 32 threads, implicit conversion to FP16 datatype will be applied - you can default to fp32 precision by not setting the **AIO_IMPLICIT_FP16_TRANSFORM_FILTER** variable.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Famperecomputingai%2Fampere_model_library","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Famperecomputingai%2Fampere_model_library","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Famperecomputingai%2Fampere_model_library/lists"}