{"id":13393589,"url":"https://github.com/wangzhaode/mnn-llm","last_synced_at":"2025-05-14T13:03:04.380Z","repository":{"id":148761226,"uuid":"615272885","full_name":"wangzhaode/mnn-llm","owner":"wangzhaode","description":"llm deploy project based mnn. This project has merged into MNN.","archived":false,"fork":false,"pushed_at":"2025-01-20T12:22:23.000Z","size":12136,"stargazers_count":1574,"open_issues_count":0,"forks_count":172,"subscribers_count":28,"default_branch":"master","last_synced_at":"2025-04-19T13:15:23.847Z","etag":null,"topics":["baichuan2-7b","chatglm-6b","chatglm2-6b","codegeex2-6b","cpp","cuda","mnn","opencl","qwen-7b"],"latest_commit_sha":null,"homepage":"","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/wangzhaode.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-03-17T10:26:09.000Z","updated_at":"2025-04-17T08:32:00.000Z","dependencies_parsed_at":"2023-12-04T13:43:54.118Z","dependency_job_id":"48872c07-af4f-457e-a868-4dbb9662441d","html_url":"https://github.com/wangzhaode/mnn-llm","commit_stats":null,"previous_names":["wangzhaode/chatglm-mnn"],"tags_count":28,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wangzhaode%2Fmnn-llm","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wangzhaode%2Fmnn-llm/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wangzhaode%2Fmnn-llm/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wangzhaode%2F
mnn-llm/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/wangzhaode","download_url":"https://codeload.github.com/wangzhaode/mnn-llm/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254149739,"owners_count":22022847,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["baichuan2-7b","chatglm-6b","chatglm2-6b","codegeex2-6b","cpp","cuda","mnn","opencl","qwen-7b"],"created_at":"2024-07-30T17:00:56.502Z","updated_at":"2025-05-14T13:03:04.304Z","avatar_url":"https://github.com/wangzhaode.png","language":"C++","funding_links":[],"categories":["Chinese models","C++"],"sub_categories":["glm 6b"],"readme":"![mnn-llm](resource/logo.png)\n\n# mnn-llm\n[![License](https://img.shields.io/github/license/wangzhaode/mnn-llm)](LICENSE.txt)\n[![Download](https://img.shields.io/github/downloads/wangzhaode/mnn-llm/total)](https://github.com/wangzhaode/mnn-llm/releases)\n[![Documentation Status](https://readthedocs.org/projects/mnn-llm/badge/?version=latest)](https://mnn-llm.readthedocs.io/en/latest/?badge=latest)\n\n\n[English](./README_en.md)\n\n**The code of this project has been merged into [MNN](https://github.com/alibaba/MNN/tree/master/transformers/llm).**\n\n## Example Projects\n\n- [cli](./demo/cli_demo.cpp): Build from the command line; for Android builds, see [android_build.sh](./script/android_build.sh)\n- [web](./demo/web_demo.cpp): Build from the command line; the [web resources](./web) must be specified at runtime\n- [android](./android/): Open and build with Android Studio;\n- [ios](./ios/README.md): Open and build with Xcode; 🚀🚀🚀**This example code was 100% generated by ChatGPT**🚀🚀🚀\n- [python](./python/README.md): Python wrapper for mnn-llm, `mnnllm`;\n- [other](./demo): 
Newly added text embedding;\n\n## Model Export and Download\n\nTo export LLM models to `onnx` and `mnn` formats, use [llm-export](https://github.com/wangzhaode/llm-export)\n\n[Model download](./docs/download.md)\n\n\n## Build\n\nCI build status:\n\n[![Build Status][pass-linux]][ci-linux]\n[![Build Status][pass-macos]][ci-macos]\n[![Build Status][pass-windows]][ci-windows]\n[![Build Status][pass-android]][ci-android]\n[![Build Status][pass-ios]][ci-ios]\n[![Build Status][pass-python]][ci-python]\n\n[pass-linux]: https://github.com/wangzhaode/mnn-llm/actions/workflows/linux.yml/badge.svg\n[pass-macos]: https://github.com/wangzhaode/mnn-llm/actions/workflows/macos.yml/badge.svg\n[pass-windows]: https://github.com/wangzhaode/mnn-llm/actions/workflows/windows.yml/badge.svg\n[pass-android]: https://github.com/wangzhaode/mnn-llm/actions/workflows/android.yml/badge.svg\n[pass-ios]: https://github.com/wangzhaode/mnn-llm/actions/workflows/ios.yml/badge.svg\n[pass-python]: https://github.com/wangzhaode/mnn-llm/actions/workflows/python.yml/badge.svg\n[ci-linux]: https://github.com/wangzhaode/mnn-llm/actions/workflows/linux.yml\n[ci-macos]: https://github.com/wangzhaode/mnn-llm/actions/workflows/macos.yml\n[ci-windows]: https://github.com/wangzhaode/mnn-llm/actions/workflows/windows.yml\n[ci-android]: https://github.com/wangzhaode/mnn-llm/actions/workflows/android.yml\n[ci-ios]: https://github.com/wangzhaode/mnn-llm/actions/workflows/ios.yml\n[ci-python]: https://github.com/wangzhaode/mnn-llm/actions/workflows/python.yml\n\n### Local Build\n```\n# clone\ngit clone --recurse-submodules https://github.com/wangzhaode/mnn-llm.git\ncd mnn-llm\n\n# linux\n./script/build.sh\n\n# macos\n./script/build.sh\n\n# windows msvc\n./script/build.ps1\n\n# python wheel\n./script/py_build.sh\n\n# android\n./script/android_build.sh\n\n# android apk\n./script/android_app_build.sh\n\n# ios\n./script/ios_build.sh\n```\n\nSome build macros:\n- `BUILD_FOR_ANDROID`: build for Android devices;\n- `LLM_SUPPORT_VISION`: whether to enable vision processing support;\n- `DUMP_PROFILE_INFO`: 
dump performance data to the command line after each conversation;\n\nThe `CPU` backend is used by default. To use another backend or capability, add the corresponding `MNN` build macro when compiling MNN:\n- cuda: `-DMNN_CUDA=ON`\n- opencl: `-DMNN_OPENCL=ON`\n- metal: `-DMNN_METAL=ON`\n\n### Run\n\n```bash\n# linux/macos\n./cli_demo ./Qwen2-1.5B-Instruct-MNN/config.json # cli demo\n./web_demo ./Qwen2-1.5B-Instruct-MNN/config.json ../web # web ui demo\n\n# windows\n.\\Debug\\cli_demo.exe ./Qwen2-1.5B-Instruct-MNN/config.json\n.\\Debug\\web_demo.exe ./Qwen2-1.5B-Instruct-MNN/config.json ../web\n\n# android\nadb push android_build/MNN/OFF/arm64-v8a/libMNN.so /data/local/tmp\nadb push android_build/MNN/express/OFF/arm64-v8a/libMNN_Express.so /data/local/tmp\nadb push android_build/libllm.so android_build/cli_demo /data/local/tmp\nadb push Qwen2-1.5B-Instruct-MNN /data/local/tmp\nadb shell \"cd /data/local/tmp \u0026\u0026 export LD_LIBRARY_PATH=. \u0026\u0026 ./cli_demo ./Qwen2-1.5B-Instruct-MNN/config.json\"\n```\n\n\n## Reference\n\u003cdetails\u003e\n  \u003csummary\u003ereference\u003c/summary\u003e\n\n- [cpp-httplib](https://github.com/yhirose/cpp-httplib)\n- [chatgpt-web](https://github.com/xqdoo00o/chatgpt-web)\n- [ChatViewDemo](https://github.com/BrettFX/ChatViewDemo)\n- [nlohmann/json](https://github.com/nlohmann/json)\n- [Qwen-1.8B-Chat](https://modelscope.cn/models/qwen/Qwen-1_8B-Chat/summary)\n- [Qwen-7B-Chat](https://modelscope.cn/models/qwen/Qwen-7B-Chat/summary)\n- [Qwen-VL-Chat](https://modelscope.cn/models/qwen/Qwen-VL-Chat/summary)\n- [Qwen1.5-0.5B-Chat](https://modelscope.cn/models/qwen/Qwen1.5-0.5B-Chat/summary)\n- [Qwen1.5-1.8B-Chat](https://modelscope.cn/models/qwen/Qwen1.5-1.8B-Chat/summary)\n- [Qwen1.5-4B-Chat](https://modelscope.cn/models/qwen/Qwen1.5-4B-Chat/summary)\n- [Qwen1.5-7B-Chat](https://modelscope.cn/models/qwen/Qwen1.5-7B-Chat/summary)\n- [Qwen2-0.5B-Instruct](https://modelscope.cn/models/qwen/Qwen2-0.5B-Instruct/summary)\n- [Qwen2-1.5B-Instruct](https://modelscope.cn/models/qwen/Qwen2-1.5B-Instruct/summary)\n- 
[Qwen2-7B-Instruct](https://modelscope.cn/models/qwen/Qwen2-7B-Instruct/summary)\n- [Qwen2-VL-2B-Instruct](https://modelscope.cn/models/qwen/Qwen2-VL-2B-Instruct/summary)\n- [Qwen2-VL-7B-Instruct](https://modelscope.cn/models/qwen/Qwen2-VL-7B-Instruct/summary)\n- [Qwen2.5-0.5B-Instruct](https://modelscope.cn/models/qwen/Qwen2.5-0.5B-Instruct/summary)\n- [Qwen2.5-1.5B-Instruct](https://modelscope.cn/models/qwen/Qwen2.5-1.5B-Instruct/summary)\n- [Qwen2.5-3B-Instruct](https://modelscope.cn/models/qwen/Qwen2.5-3B-Instruct/summary)\n- [Qwen2.5-7B-Instruct](https://modelscope.cn/models/qwen/Qwen2.5-7B-Instruct/summary)\n- [Qwen2.5-Coder-1.5B-Instruct](https://modelscope.cn/models/qwen/Qwen2.5-Coder-1.5B-Instruct/summary)\n- [Qwen2.5-Coder-7B-Instruct](https://modelscope.cn/models/qwen/Qwen2.5-Coder-7B-Instruct/summary)\n- [Qwen2.5-Math-1.5B-Instruct](https://modelscope.cn/models/qwen/Qwen2.5-Math-1.5B-Instruct/summary)\n- [Qwen2.5-Math-7B-Instruct](https://modelscope.cn/models/qwen/Qwen2.5-Math-7B-Instruct/summary)\n- [chatglm-6b](https://modelscope.cn/models/ZhipuAI/chatglm-6b/summary)\n- [chatglm2-6b](https://modelscope.cn/models/ZhipuAI/chatglm2-6b/summary)\n- [codegeex2-6b](https://modelscope.cn/models/ZhipuAI/codegeex2-6b/summary)\n- [chatglm3-6b](https://modelscope.cn/models/ZhipuAI/chatglm3-6b/summary)\n- [glm4-9b-chat](https://modelscope.cn/models/ZhipuAI/glm-4-9b-chat/summary)\n- [Llama-2-7b-chat-ms](https://modelscope.cn/models/modelscope/Llama-2-7b-chat-ms/summary)\n- [Llama-3-8B-Instruct](https://modelscope.cn/models/modelscope/Meta-Llama-3-8B-Instruct/summary)\n- [Llama-3.2-1B-Instruct](https://modelscope.cn/models/LLM-Research/Llama-3.2-1B-Instruct/summary)\n- [Llama-3.2-3B-Instruct](https://modelscope.cn/models/LLM-Research/Llama-3.2-3B-Instruct/summary)\n- [Baichuan2-7B-Chat](https://modelscope.cn/models/baichuan-inc/baichuan-7B/summary)\n- [internlm-chat-7b](https://modelscope.cn/models/Shanghai_AI_Laboratory/internlm-chat-7b/summary)\n- 
[Yi-6B-Chat](https://modelscope.cn/models/01ai/Yi-6B-Chat/summary)\n- [deepseek-llm-7b-chat](https://modelscope.cn/models/deepseek-ai/deepseek-llm-7b-chat/summary)\n- [TinyLlama-1.1B-Chat-v0.6](https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v0.6)\n- [phi-2](https://modelscope.cn/models/AI-ModelScope/phi-2/summary)\n- [bge-large-zh](https://modelscope.cn/models/AI-ModelScope/bge-large-zh/summary)\n- [gte_sentence-embedding_multilingual-base](https://modelscope.cn/models/iic/gte_sentence-embedding_multilingual-base/summary)\n\u003c/details\u003e","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwangzhaode%2Fmnn-llm","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fwangzhaode%2Fmnn-llm","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwangzhaode%2Fmnn-llm/lists"}