{"id":30292452,"url":"https://github.com/pfcclab/ernie4.5-developer-resource","last_synced_at":"2026-02-04T11:10:30.709Z","repository":{"id":302788875,"uuid":"1013628802","full_name":"PFCCLab/ERNIE4.5-Developer-Resource","owner":"PFCCLab","description":null,"archived":false,"fork":false,"pushed_at":"2025-07-23T03:48:44.000Z","size":14,"stargazers_count":22,"open_issues_count":1,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-07-23T05:29:35.237Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/PFCCLab.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-07-04T07:56:31.000Z","updated_at":"2025-07-23T03:48:47.000Z","dependencies_parsed_at":"2025-07-23T05:29:37.366Z","dependency_job_id":"4130881a-76bb-41b2-b1e1-cf1634ce0f62","html_url":"https://github.com/PFCCLab/ERNIE4.5-Developer-Resource","commit_stats":null,"previous_names":["mattheliu/ernie4.5-developer-resource","pfcclab/ernie4.5-developer-resource"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/PFCCLab/ERNIE4.5-Developer-Resource","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PFCCLab%2FERNIE4.5-Developer-Resource","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PFCCLab%2FERNIE4.5-Developer-Resource/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PFCCLab%2FERNIE4.5-Developer-Resource/releases","manifests_url":"https://repos.ecosyste.m
s/api/v1/hosts/GitHub/repositories/PFCCLab%2FERNIE4.5-Developer-Resource/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/PFCCLab","download_url":"https://codeload.github.com/PFCCLab/ERNIE4.5-Developer-Resource/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PFCCLab%2FERNIE4.5-Developer-Resource/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":270791280,"owners_count":24645781,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-16T02:00:11.002Z","response_time":91,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-08-17T00:34:05.638Z","updated_at":"2026-02-04T11:10:30.702Z","avatar_url":"https://github.com/PFCCLab.png","language":null,"funding_links":[],"categories":[],"sub_categories":[],"readme":"# 🚀 ERNIE 4.5: The Developer's Resource Guide 🤖\n\nWelcome to the developer resource guide for ERNIE 4.5, a powerful family of open-source models from Baidu. 
This guide provides all the essential information, links, and code examples to help you get started with deploying ERNIE 4.5 models.\n\n## 🔗  Quick Links\n\n| Resource          | URL                                                              |\n| ----------------- | ---------------------------------------------------------------- |\n| **📝 Blog** | [https://yiyan.baidu.com/blog](https://yiyan.baidu.com/blog)       |\n| **📄 Technical Report** | [https://yiyan.baidu.com/blog/publication](https://yiyan.baidu.com/blog/publication/) |\n| **🤗 Hugging Face** | [https://huggingface.co/baidu](https://huggingface.co/baidu)       |\n| **🔧 ERNIEKit** | [https://github.com/PaddlePaddle/ERNIE](https://github.com/PaddlePaddle/ERNIE) |\n| **⚡ FastDeploy** | [https://github.com/PaddlePaddle/FastDeploy](https://github.com/PaddlePaddle/FastDeploy) |\n| **💡 Baidu AI Studio** | [https://aistudio.baidu.com/](https://aistudio.baidu.com/)         |\n| **🔅 ModelScope** | [https://www.modelscope.cn/studios/PaddlePaddle](https://www.modelscope.cn/studios/PaddlePaddle) |\n\n## 📦 Open Source Models\n\nERNIE 4.5 is available under the **Apache 2.0 License**. The open-source release includes 10 models across 3 series, along with code for pre-training, fine-tuning, and inference deployment.\n\n| Series        | Activated Parameters | Model Name Suffix | Description                                                                                             |\n| ------------- | -------------------- | ----------------- | ------------------------------------------------------------------------------------------------------- |\n| **0.3B Series** | \\~300 Million         | `-0.3B`           | Lightweight models suitable for local and on-device deployment.                                         |\n| **A3B Series** | \\~3 Billion           | `-A3B`            | Efficient models offering a balance of performance and resource usage.                                  
|\n| **A47B Series** | \\~47 Billion          | `-A47B`           | State-of-the-art models for maximum performance on complex tasks.                                       |\n\n**🏷️  Naming Conventions:**\n\n  * **-Base**: The foundational pre-trained model.\n  * *(no suffix)*: The instruction-tuned chat model.\n  * **-VL**: The Vision-Language multimodal model.\n  * **Hybrid Thinking**: The VL model features a \"thinking mode\" (controlled by a parameter) that enhances reasoning, alongside a standard non-thinking mode for fast perception.\n\n-----\n\n## 👩‍💻  Getting Started: Running ERNIE 4.5 Locally\n\nYou can run the lightweight ERNIE 4.5 models on your local machine. Below are examples using `llama.cpp` for general CPU inference, MNN for optimized on-device deployment, and MLX for Apple silicon.\n\n### 🍎 Example 1: Running with `llama.cpp` (for ERNIE-4.5-0.3B)\n\nThe `llama.cpp` project supports the ERNIE 4.5 0.3B models, allowing you to run them efficiently on a CPU.\n\n**Step 1️⃣: Clone and Build `llama.cpp`**\nFirst, get the latest version of `llama.cpp`, which includes support for ERNIE 4.5.\n\n```bash\n# Clone the repository\ngit clone https://github.com/ggerganov/llama.cpp.git\ncd llama.cpp\n\n# Build the project\nmkdir build\ncd build\ncmake ..\nmake\n```\n\n**Step 2️⃣: Download the ERNIE 4.5 GGUF Model**\nDownload the `.gguf` file from Hugging Face.\n```bash\n# Install huggingface_hub\npip install -U huggingface_hub\nhuggingface-cli download --resume-download unsloth/ERNIE-4.5-0.3B-PT-GGUF --local-dir path/to/dir\n```\n```bash\n# If the download times out, use a mirror endpoint\nexport HF_ENDPOINT=https://hf-mirror.com\n```\n**Step 3️⃣: Run Inference**\nUse the `llama-cli` executable built by `llama.cpp` to run the model.\n\n```bash\n# Run the model in interactive mode\ncd llama.cpp/build/bin\n./llama-cli -m /path/to/dir/ERNIE-4.5-0.3B-PT.gguf --jinja -p \"Hello, who are you?\" -n 128\n```\n\n  * `-m`: Specifies the path to your GGUF model file.\n  * `--jinja`: Applies the model's Jinja chat template.\n  * `-p`: Provides an initial prompt.\n  * `-n`: Sets the number of tokens to generate.\n\n### 
🍏 Example 2: Running with MNN (for ERNIE-4.5-0.3B-PT-MNN)\nReference project: https://huggingface.co/taobao-mnn/ERNIE-4.5-0.3B-PT-MNN (you are welcome to visit the original author's page).\n\nMNN is a highly efficient deep learning inference engine, well suited to edge and mobile devices. A 4-bit quantized version of ERNIE-4.5-0.3B-PT is available specifically for MNN.\n\n**Step 1️⃣: Download the MNN Model**\nYou can download the model from Hugging Face or ModelScope.\n\n```bash\n# Install Hugging Face Hub\npip install -U huggingface_hub\n```\n```bash\n# Download the model files\n# Option 1: download with the Hugging Face CLI\nhuggingface-cli download --resume-download taobao-mnn/ERNIE-4.5-0.3B-PT-MNN --local-dir path/to/dir\n```\n```bash\n# If the download times out, use a mirror endpoint\nexport HF_ENDPOINT=https://hf-mirror.com\n```\n```python\n# Option 2: download with the Python SDK\nfrom huggingface_hub import snapshot_download\nmodel_dir = snapshot_download('taobao-mnn/ERNIE-4.5-0.3B-PT-MNN')\n```\n```bash\n# Option 3: clone from ModelScope\ngit clone https://www.modelscope.cn/MNN/ERNIE-4.5-0.3B-PT-MNN\n```\n\n**Step 2️⃣: Clone and Compile MNN**\nYou need to compile the MNN engine from source with the correct flags to enable LLM support.\n\n```bash\n# Clone the MNN repository\ngit clone https://github.com/alibaba/MNN.git\ncd MNN\n\n# Create build directory and compile\nmkdir build \u0026\u0026 cd build\ncmake .. -DMNN_LOW_MEMORY=true -DMNN_CPU_WEIGHT_DEQUANT_GEMM=true -DMNN_BUILD_LLM=true -DMNN_SUPPORT_TRANSFORMER_FUSE=true\nmake -j\n```\n\n**Step 3️⃣: Run the Demo**\nUse the `llm_demo` application to run the model. 
\n```bash\n# Run the MNN demo\n./llm_demo /path/to/ERNIE-4.5-0.3B-PT-MNN/config.json prompt.txt\n```\n\n### 🍊 Example 3: Running with MLX (for ERNIE-4.5-0.3B-PT-bf16)\nReference project: https://huggingface.co/mlx-community/ERNIE-4.5-0.3B-PT-bf16 (you are welcome to visit the original author's page).\n\nMLX LM is a Python package for generating text and fine-tuning large language models on Apple silicon with MLX.\n\nThe `mlx-community/ERNIE-4.5-0.3B-PT-bf16` model was converted to MLX format from `baidu/ERNIE-4.5-0.3B-PT` using mlx-lm version 0.25.2.\n\n**Step 1️⃣: Download the MLX Model**\n```bash\n# Install Hugging Face Hub\npip install -U huggingface_hub\n```\n```bash\n# Download the model files with the Hugging Face CLI\nhuggingface-cli download --resume-download mlx-community/ERNIE-4.5-0.3B-PT-bf16 --local-dir path/to/dir\n```\n```bash\n# If the download times out, use a mirror endpoint\nexport HF_ENDPOINT=https://hf-mirror.com\n```\n**Step 2️⃣: Use with MLX**\n```python\nfrom mlx_lm import load, generate\n\n# Load the converted model and its tokenizer\nmodel, tokenizer = load(\"mlx-community/ERNIE-4.5-0.3B-PT-bf16\")\n\nprompt = \"hello\"\n\n# Wrap the prompt in the chat template when the tokenizer defines one\nif tokenizer.chat_template is not None:\n    messages = [{\"role\": \"user\", \"content\": prompt}]\n    prompt = tokenizer.apply_chat_template(\n        messages, add_generation_prompt=True\n    )\n\n# Generate a response; verbose=True prints the output as it is produced\nresponse = generate(model, tokenizer, prompt=prompt, verbose=True)\n```\n\n-----\n## 🌍 Developer Ecosystem and Tools\n\n### 🛠️ Official Toolkits (PaddlePaddle Based)\n\n  * **[ERNIEKit](https://github.com/PaddlePaddle/ERNIE)**: An industrial-grade toolkit for the full development lifecycle of ERNIE models. It supports high-performance pre-training, SFT, DPO, LoRA, and quantization (QAT/PTQ).\n  * **[FastDeploy](https://github.com/PaddlePaddle/FastDeploy)**: A production-ready inference and deployment toolkit. 
It features advanced acceleration (speculative decoding, MTP), comprehensive quantization support, and compatibility with numerous hardware backends (NVIDIA, Kunlunxin, Ascend, etc.).\n\n## **🤝  Friends of OSS Projects (Third-Party Integrations)**\n\nERNIE 4.5 is being actively integrated into the wider open-source ecosystem. Here is the current status of support in popular projects:\n\n| Project            | Status       |\n| ------------------ | ------------ |\n| **transformers** | ✅ **Merged 🎉 !** Ernie 0.3B and MoE models are now integrated! Directly usable. ⚙️ ([Repo](https://github.com/huggingface/transformers))([PR #39228](https://github.com/huggingface/transformers/pull/39228)) \u003cbr\u003e ✅ **Merged 🎉 !** [Ernie 4.5 VL models #39585](https://github.com/huggingface/transformers/pull/39585) |\n| **vLLM** | ✅ **Merged 🎉 !** Native support for ERNIE 4.5 text models is now available in the main branch. ([PR #20220](https://github.com/vllm-project/vllm/pull/20220)) \u003cbr\u003e ✅ **Merged 🎉 !** Added ERNIE 4.5 VL Model Support ([PR #22514](https://github.com/vllm-project/vllm/pull/22514)) \u003cbr\u003e ✅ **Merged 🎉 !**: Enable EPLB on ernie4.5-moe ([PR #22100](https://github.com/vllm-project/vllm/pull/22100)) |\n| **sglang** | ✅ **Merged 🎉 !** ERNIE 4.5 is now supported in sglang, enabling streamlined usage in structured generation and multi-agent orchestration scenarios. ([PR #7657](https://github.com/sgl-project/sglang/pull/7657)) |\n| **llama.cpp/ollama** | ✅ **Merged 🎉 !** 0.3B models and Ernie4.5 MoE are already supported in `llama.cpp` — efficient local CPU inference available. ([PR #14408](https://github.com/ggerganov/llama.cpp/pull/14408))([PR #14746](https://github.com/ggml-org/llama.cpp/pull/14746)) |\n| **ms-swift** | ✅ **Merged 🎉 !** Support for ERNIE 4.5 has been integrated, enabling streamlined fine-tuning and inference within the ModelScope ecosystem. 
([PR #4757](https://github.com/modelscope/ms-swift/pull/4757)) \u003cbr\u003e ✅ **Merged 🎉 !** ERNIE VL support ([PR #6545](https://github.com/modelscope/ms-swift/pull/6545)) |\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpfcclab%2Fernie4.5-developer-resource","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpfcclab%2Fernie4.5-developer-resource","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpfcclab%2Fernie4.5-developer-resource/lists"}