{"id":18524947,"url":"https://github.com/paddlepaddle/ernie","last_synced_at":"2025-05-13T19:16:02.891Z","repository":{"id":37759464,"uuid":"173544440","full_name":"PaddlePaddle/ERNIE","owner":"PaddlePaddle","description":"Official implementations for various pre-training models of ERNIE-family, covering topics of Language Understanding \u0026 Generation, Multimodal Understanding \u0026 Generation, and beyond.","archived":false,"fork":false,"pushed_at":"2024-08-31T00:09:47.000Z","size":130836,"stargazers_count":6374,"open_issues_count":14,"forks_count":1287,"subscribers_count":194,"default_branch":"ernie-kit-open-v1.0","last_synced_at":"2025-04-29T17:47:58.026Z","etag":null,"topics":["bert","ernie","language-understanding","natural-language-processing","nlp"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/PaddlePaddle.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2019-03-03T07:31:29.000Z","updated_at":"2025-04-28T12:39:31.000Z","dependencies_parsed_at":"2023-02-16T07:00:48.012Z","dependency_job_id":"56bcdf20-b9c8-4b51-8a3a-5c13636b1b53","html_url":"https://github.com/PaddlePaddle/ERNIE","commit_stats":{"total_commits":316,"total_committers":47,"mean_commits":6.723404255319149,"dds":0.8227848101265822,"last_synced_commit":"70bad885f44ff36de378aa82309e3600998b498d"},"previous_names":["paddlepaddle/lark"],"tags_count":8,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PaddlePaddle%2FERNIE","tags_
url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PaddlePaddle%2FERNIE/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PaddlePaddle%2FERNIE/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PaddlePaddle%2FERNIE/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/PaddlePaddle","download_url":"https://codeload.github.com/PaddlePaddle/ERNIE/tar.gz/refs/heads/ernie-kit-open-v1.0","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":251584547,"owners_count":21613059,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bert","ernie","language-understanding","natural-language-processing","nlp"],"created_at":"2024-11-06T17:43:53.979Z","updated_at":"2025-04-29T21:25:31.733Z","avatar_url":"https://github.com/PaddlePaddle.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\n\n# ![ERNIE_milestone_20210519_zh](./.metas/ERNIE.png)\n\nWenxin ERNIE is an industrial-grade, knowledge-enhanced family of large models released by Baidu, covering NLP large models and cross-modal large models. In March 2019, Baidu open-sourced Wenxin ERNIE 1.0, the first open-source pre-trained model in China. Since then, it has achieved a series of technical breakthroughs in language and cross-modal understanding and generation, and has open-sourced and released a series of models to support large-model research and industrial application. Note: the code for older ERNIE versions has been migrated to the repro branch; you are welcome to develop with our newly upgraded ERNIE kit, which combines dynamic and static graphs. You are also welcome to try [EasyDL](https://ai.baidu.com/easydl/pro) and [BML](https://ai.baidu.com/bml/app/overview) for richer functionality.\n[Learn more](https://wenxin.baidu.com/)\n\n# Open-Source Roadmap\n- 2022.8.18:\n  - Image-text cross-modal pre-trained model `ERNIE-ViL 2.0 (base)` [officially open-sourced](https://github.com/PaddlePaddle/ERNIE/tree/ernie-kit-open-v1.0/Research/ERNIE-ViL2)\n- 2022.5.20:\n  - Newly open-sourced ERNIE 3.0 series pre-trained models:\n    - 110M-parameter general-purpose model ERNIE 3.0 Base\n    - 
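280M-parameter heavyweight general-purpose model ERNIE 3.0 XBase\n    - 74M-parameter lightweight general-purpose model ERNIE 3.0 Medium\n  - Added the speech-language cross-modal model ERNIE-SAT [officially open-sourced](https://github.com/PaddlePaddle/ERNIE/tree/repro/ernie-sat)\n  - Added the ERNIE-Gen (Chinese) pre-trained model, which supports mainstream generation tasks, mainly summarization, question generation, dialogue, and question answering\n  - Wenxin ERNIE development kit combining dynamic and static graphs: built on PaddlePaddle's dynamic graph capability, it supports dynamic-graph training of Wenxin ERNIE models. You only need to change one configuration parameter before training starts to switch between dynamic and static training.\n  - Standardized encapsulation of the NLP development workflow, including text preprocessing, pre-trained models, network construction, model evaluation, and deployment.\n  - Support for common NLP tasks: text classification, text matching, sequence labeling, information extraction, text generation, data distillation, and more.\n  - Data preprocessing tools for data cleaning, data augmentation, word segmentation, format conversion, case conversion, and more.\n- 2021.12.3:\n  - Multilingual pre-trained model `ERNIE-M` [officially open-sourced](https://github.com/PaddlePaddle/ERNIE/tree/repro/ernie-m)\n- 2021.5.20:\n  - Four newly open-sourced ERNIE pre-trained models:\n    - Multi-granularity language knowledge model `ERNIE-Gram` [officially open-sourced](https://github.com/PaddlePaddle/ERNIE/blob/develop/ernie-gram)\n    - Pre-trained model for bidirectional modeling of very long text `ERNIE-Doc` [officially open-sourced](https://github.com/PaddlePaddle/ERNIE/tree/repro/ernie-doc)\n    - Tutorial for the cross-modal pre-trained model incorporating scene graph knowledge `ERNIE-ViL` [officially open-sourced](https://github.com/PaddlePaddle/ERNIE/tree/repro/ernie-vil)\n    - Unified language-and-vision pre-trained model `ERNIE-UNIMO` [officially open-sourced](https://github.com/PaddlePaddle/ERNIE/tree/repro/ernie-unimo)\n- 2020.9.24:\n  - `ERNIE-ViL` released! ([view](https://github.com/PaddlePaddle/ERNIE/tree/repro/ernie-vil))\n    - A knowledge-enhanced pre-training framework for vision-language tasks, the first to introduce structured knowledge into vision-language pre-training.\n      - Uses knowledge from scene graphs to construct object, attribute, and relationship prediction tasks that capture fine-grained semantic alignment across modalities.\n    - Achieved the best results on five vision-language downstream tasks and ranked first on the [Visual Commonsense Reasoning leaderboard](https://visualcommonsense.com/).\n- 2020.5.20:\n  - `ERNIE-GEN` officially open-sourced! ([view](https://github.com/PaddlePaddle/ERNIE/tree/repro/ernie-gen))\n    - The strongest pre-trained model for text generation is now officially open-sourced; the work has been accepted at `IJCAI-2020`.\n      - Extends ERNIE pre-training to text generation for the first time, achieving the best results on several representative tasks.\n      - You can now download all models reported in the paper (including [base/large/large-430G](https://github.com/PaddlePaddle/ERNIE/tree/repro/ernie-gen/README.zh.md#预训练模型)).\n    - The first to add a span-by-span generation task in the pre-training stage, so the model can generate a semantically complete span at each step.\n    - Proposes an infilling generation mechanism and a noise-aware mechanism to alleviate exposure bias.\n    - A carefully designed Multi-Flow Attention implementation framework.\n- 2020.4.30 Released [ERNIESage](https://github.com/PaddlePaddle/PGL/tree/master/examples/erniesage), a new graph neural network model that uses ERNIE as the aggregator. 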
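Implemented with [PGL](https://github.com/PaddlePaddle/PGL).\n- 2020.3.27 [Won first place in five SemEval 2020 subtasks](https://www.jiqizhixin.com/articles/2020-03-27-8).\n- 2019.12.26 [Ranked first on the GLUE leaderboard](https://www.technologyreview.com/2019/12/26/131372/ai-baidu-ernie-google-bert-natural-language-glue/).\n- 2019.11.6 Released [ERNIE Tiny](https://www.jiqizhixin.com/articles/2019-11-06-9).\n- 2019.7.7 Released [ERNIE 2.0](https://www.jiqizhixin.com/articles/2019-07-31-10).\n- 2019.3.16 Released [ERNIE 1.0](https://www.jiqizhixin.com/articles/2019-03-16-3).\n\n# Environment Setup\n\n1. Install the environment dependencies: [Environment Setup](./README_ENV.md)\n2. Install the ERNIE kit\n\n```plain\ngit clone https://github.com/PaddlePaddle/ERNIE.git\n```\n\n# Quick Start: Training with the Wenxin ERNIE Large Model\n\n- Using ERNIE 3.0 as the pre-trained model involves the following steps:\n  - Download the model\n  - Prepare the data\n  - Configure the training JSON file\n  - Start training\n  - Configure the prediction JSON file\n  - Start prediction\n- We use a text classification task as an example to get started quickly with the ERNIE large model\n\n## Download the Model\n\n- Use the ERNIE 3.0 pre-trained model for the text classification task\n- Download and configure the ERNIE 3.0 pre-trained model\n\n```plain\n# Download the ernie_3.0 model\n# Enter the models_hub directory\ncd ./applications/models_hub\n# Run the download script\nsh download_ernie_3.0_base_ch.sh\n```\n\n## Prepare the Data\n\n- The data directory of each Wenxin task ships with sample data that can be used directly, so you can get familiar with Wenxin quickly.\n- Data for the text classification task\n\n```shell\n# Enter the text classification task directory\ncd ./applications/tasks/text_classification/\n# List the datasets bundled with the text classification task\nls ./data\n```\n\n- Note: the sample data only demonstrates the format; replace it with real data when actually training a model.\n\n## Configure the Training JSON File\n\n- The preset JSON files are in the ./examples/ directory. The configuration file for training with the ERNIE 3.0 pre-trained model is ./examples/cls_ernie_fc_ch.json, which configures the data, the model, the training method, and related logic.\n\n```shell\n# View the configuration file for training the text classification task with the ERNIE 3.0 pre-trained model\ncat ./examples/cls_ernie_fc_ch.json\n```\n\n## Start Training\n\n- Once the dataset is in place and cls_ernie_fc_ch.json is configured, we can run the model training command.\n- The single-GPU command is `python run_trainer.py`, as shown below; it trains the ERNIE-based Chinese text classification model locally on the training set.\n\n```shell\n# ERNIE Chinese text classification model\n# Trains the preset network from a JSON file; it uses the configuration file ./examples/cls_ernie_fc_ch.json\npython run_trainer.py --param_path ./examples/cls_ernie_fc_ch.json\n```\n\n- The multi-GPU command is:\n\n```plain\nfleetrun --gpus=x,y run_trainer.py --param_path ./examples/cls_ernie_fc_ch.json\n```\n\n- Training logs are automatically saved to **./log/test.log**.\n- Model files produced during and after training are saved by default under **./output/**, where the **save_inference_model/** folder stores the model files used for prediction and the **save_checkpoint/** folder stores the model files used for warm start.\n\n## Configure the Prediction JSON File\n\n- 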
The preset JSON files are in the ./examples/ directory. The configuration file for prediction with a model trained from the ERNIE 3.0 pre-trained model is ./examples/cls_ernie_fc_ch_infer.json\n- Mainly modify the prediction model input path, the prediction data input path, and the prediction result output path in ./examples/cls_ernie_fc_ch_infer.json. The corresponding changes are:\n\n```\n{\n\"dataset_reader\":{\"train_reader\":{\"config\":{\"data_path\":\"./data/predict_data\"}}},\n\"inference\":{\"inference_model_path\":\"./output/cls_ernie_fc_ch/save_inference_model/inference_step_251\",\n                        \"output_path\": \"./output/predict_result.txt\"}\n}\n```\n\n## Start Prediction\n\n- Run run_infer.py with the corresponding parameter configuration file, as shown below:\n\n```plain\npython run_infer.py --param_path ./examples/cls_ernie_fc_ch_infer.json\n```\n\n- Output produced during prediction is automatically saved to ./output/predict_result.txt.\n\n# Pre-trained Models\n\n- For an introduction to how the pre-trained models work, see [Model Introduction](./applications/models_hub)\n- Downloading pre-trained models: enter the ./applications/models_hub directory. Download example:\n\n```plain\n# Enter the pre-trained model download directory\ncd ./applications/models_hub\n# Download the ERNIE 3.0 base model\nsh download_ernie_3.0_base_ch.sh\n```\n\n- For more open-source models, see [Research](./Research/)\n\n# Dataset Downloads\n\n[CLUE dataset](https://www.cluebenchmarks.com/)\n\n[DuIE2.0 dataset](https://www.luge.ai/#/luge/dataDetail?id=5)\n\n[MSRA_NER dataset](https://ernie-github.cdn.bcebos.com/data-msra_ner.tar.gz)\n\n# Model Evaluation\n\n## Evaluation Datasets\n\n- Classification and matching tasks use the [CLUE dataset](https://www.cluebenchmarks.com/).\n\n## CLUE Evaluation Results\n\n| Config   | Model                 | CLUEWSC2020 | IFLYTEK | TNEWS | AFQMC | CMNLI | CSL   | OCNLI | Average |\n| -------- | --------------------- | ----------- | ------- | ----- | ----- | ----- | ----- | ----- | ------ |\n| 24L1024H | RoBERTa-wwm-ext-large | 90.79       | 62.02   | 59.33 | 76.00 | 83.88 | 83.67 | 78.81 | 76.36  |\n| 20L1024H | ERNIE 3.0-XBase       | 91.12       | 62.22   | 60.34 | 76.95 | 84.98 | 84.27 | 82.07 | 77.42  |\n| 12L768H  | RoBERTa-wwm-ext-base  | 88.55       | 61.22   | 58.08 | 74.75 | 81.66 | 81.63 | 77.25 | 74.73  |\n| 12L768H  | ERNIE 3.0-Base        | 88.18       | 60.72   | 58.73 | 76.53 | 83.65 | 83.30 | 80.31 | 75.63  |\n| 6L768H   | RBT6, Chinese         | 75.00       | 59.68   | 56.62 | 73.15 | 79.26 | 80.04 | 73.15 | 70.99  |\n| 
6L768H   | ERNIE 3.0-Medium      | 79.93       | 60.14   | 57.16 | 74.56 | 80.87 | 81.23 | 77.02 | 72.99  |\n\n\n\n## **Evaluation Details**\n\n1. All tasks above use Grid Search for hyperparameter tuning. For classification tasks, the model is evaluated on the validation set every 100 training steps, and the best validation result is the figure reported in the table.\n2. Grid Search ranges for classification tasks: batch_size: 16, 32, 64; learning rate: 1e-5, 2e-5, 3e-5, 5e-5. Because the CLUEWSC2020 dataset is small, results on it are sensitive to batch_size, so batch_size = 8 was added to the search for CLUEWSC2020. Because CLUEWSC2020 and IFLYTEK are sensitive to the dropout probability, dropout_prob = 0.0 was added to the search for these two datasets.\n\n## Fixed Hyperparameters for Downstream Tasks\n\n**Classification and matching tasks:**\n\n| TASK              | AFQMC | TNEWS | IFLYTEK | CMNLI | OCNLI | CLUEWSC2020 | CSL  |\n| ----------------- | ----- | ----- | ------- | ----- | ----- | ----------- | ---- |\n| epoch             | 3     | 3     | 3       | 2     | 5     | 50          | 5    |\n| max_seq_length    | 128   | 128   | 128     | 128   | 128   | 128         | 256  |\n| warmup_proportion | 0.1   | 0.1   | 0.1     | 0.1   | 0.1   | 0.1         | 0.1  |\n\n\n\n## Best Grid Search Hyperparameters for ERNIE Models\n\n\n\n| Model            | AFQMC           | TNEWS           | IFLYTEK         | CMNLI                           | OCNLI           | CLUEWSC2020                   | CSL             |\n| ---------------- | --------------- | --------------- | --------------- | ------------------------------- | --------------- | ----------------------------- | --------------- |\n| ERNIE 3.0-Medium | bsz_32_lr_2e-05 | bsz_16_lr_3e-05 | bsz_16_lr_5e-05 | bsz_16_lr_1e-05/bsz_64_lr_2e-05 | bsz_64_lr_2e-05 | bsz_8_lr_2e-05                | bsz_32_lr_1e-05 |\n| ERNIE 3.0-Base   | bsz_16_lr_2e-05 | bsz_64_lr_3e-05 | bsz_16_lr_5e-05 | bsz_16_lr_2e-05                 | bsz_16_lr_2e-05 | bsz_8_lr_2e-05(drop_out_0.1)  | bsz_16_lr_3e-05 |\n| ERNIE 3.0-XBase  | bsz_16_lr_1e-05 | bsz_16_lr_2e-05 | bsz_16_lr_3e-05 | bsz_16_lr_1e-05                 | bsz_32_lr_2e-05 | bsz_8_lr_2e-05                | bsz_64_lr_1e-05 |\n\n\n\n# 
Application Scenarios\n\nText classification ([text classification](./applications/tasks/text_classification))\n\nText matching ([text matching](./applications/tasks/text_matching))\n\nSequence labeling ([sequence labeling](./applications/tasks/sequence_labeling))\n\nInformation extraction ([information extraction](./applications/tasks/information_extraction_many_to_many))\n\nText generation ([text generation](./applications/tasks/text_generation))\n\nImage-text matching ([image-text matching](./Research/ERNIE-ViL2))\n\nData distillation ([data distillation](./applications/tasks/data_distillation))\n\nTools ([tools](./applications/tools))\n\n# Citations\n\n### ERNIE 1.0\n\n```\n@article{sun2019ernie,\n  title={Ernie: Enhanced representation through knowledge integration},\n  author={Sun, Yu and Wang, Shuohuan and Li, Yukun and Feng, Shikun and Chen, Xuyi and Zhang, Han and Tian, Xin and Zhu, Danxiang and Tian, Hao and Wu, Hua},\n  journal={arXiv preprint arXiv:1904.09223},\n  year={2019}\n}\n```\n\n### ERNIE 2.0\n\n```\n@inproceedings{sun2020ernie,\n  title={Ernie 2.0: A continual pre-training framework for language understanding},\n  author={Sun, Yu and Wang, Shuohuan and Li, Yukun and Feng, Shikun and Tian, Hao and Wu, Hua and Wang, Haifeng},\n  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},\n  volume={34},\n  number={05},\n  pages={8968--8975},\n  year={2020}\n}\n```\n\n### ERNIE-GEN\n\n```\n@article{xiao2020ernie,\n  title={Ernie-gen: An enhanced multi-flow pre-training and fine-tuning framework for natural language generation},\n  author={Xiao, Dongling and Zhang, Han and Li, Yukun and Sun, Yu and Tian, Hao and Wu, Hua and Wang, Haifeng},\n  journal={arXiv preprint arXiv:2001.11314},\n  year={2020}\n}\n```\n\n### ERNIE-ViL\n\n```\n@article{yu2020ernie,\n  title={Ernie-vil: Knowledge enhanced vision-language representations through scene graph},\n  author={Yu, Fei and Tang, Jiji and Yin, Weichong and Sun, Yu and Tian, Hao and Wu, Hua and Wang, Haifeng},\n  journal={arXiv preprint arXiv:2006.16934},\n  year={2020}\n}\n```\n\n### ERNIE-Gram\n\n```\n@article{xiao2020ernie,\n  title={ERNIE-Gram: Pre-Training with Explicitly N-Gram Masked Language 
Modeling for Natural Language Understanding},\n  author={Xiao, Dongling and Li, Yu-Kun and Zhang, Han and Sun, Yu and Tian, Hao and Wu, Hua and Wang, Haifeng},\n  journal={arXiv preprint arXiv:2010.12148},\n  year={2020}\n}\n```\n\n### ERNIE-Doc\n\n```\n@article{ding2020ernie,\n  title={ERNIE-Doc: A retrospective long-document modeling transformer},\n  author={Ding, Siyu and Shang, Junyuan and Wang, Shuohuan and Sun, Yu and Tian, Hao and Wu, Hua and Wang, Haifeng},\n  journal={arXiv preprint arXiv:2012.15688},\n  year={2020}\n}\n```\n\n### ERNIE-UNIMO\n\n```\n@article{li2020unimo,\n  title={Unimo: Towards unified-modal understanding and generation via cross-modal contrastive learning},\n  author={Li, Wei and Gao, Can and Niu, Guocheng and Xiao, Xinyan and Liu, Hao and Liu, Jiachen and Wu, Hua and Wang, Haifeng},\n  journal={arXiv preprint arXiv:2012.15409},\n  year={2020}\n}\n```\n\n### ERNIE-M\n\n```\n@article{ouyang2020ernie,\n  title={Ernie-m: Enhanced multilingual representation by aligning cross-lingual semantics with monolingual corpora},\n  author={Ouyang, Xuan and Wang, Shuohuan and Pang, Chao and Sun, Yu and Tian, Hao and Wu, Hua and Wang, Haifeng},\n  journal={arXiv preprint arXiv:2012.15674},\n  year={2020}\n}\n```\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpaddlepaddle%2Fernie","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpaddlepaddle%2Fernie","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpaddlepaddle%2Fernie/lists"}