{"id":13615272,"url":"https://github.com/bojone/albert_zh","last_synced_at":"2025-04-10T14:08:21.424Z","repository":{"id":108163768,"uuid":"236639037","full_name":"bojone/albert_zh","owner":"bojone","description":"Convert https://github.com/brightmart/albert_zh to Google format","archived":false,"fork":false,"pushed_at":"2020-09-28T10:22:57.000Z","size":18,"stargazers_count":62,"open_issues_count":0,"forks_count":8,"subscribers_count":5,"default_branch":"master","last_synced_at":"2025-03-24T12:48:04.303Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/bojone.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-01-28T02:05:31.000Z","updated_at":"2024-01-04T16:41:40.000Z","dependencies_parsed_at":null,"dependency_job_id":"1320c726-6a20-45b3-a3c6-9781c9f05da8","html_url":"https://github.com/bojone/albert_zh","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bojone%2Falbert_zh","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bojone%2Falbert_zh/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bojone%2Falbert_zh/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bojone%2Falbert_zh/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/bojone","download_url":"https://codeload.github.com/bojone/albert_zh/tar.gz/refs/heads/master","host":{"nam
e":"GitHub","url":"https://github.com","kind":"github","repositories_count":248232186,"owners_count":21069477,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-01T20:01:11.278Z","updated_at":"2025-04-10T14:08:21.405Z","avatar_url":"https://github.com/bojone.png","language":"Python","readme":"# Chinese ALBERT in Google's Official Format\n\nConverts the brightmart ALBERT weights to the Google format.\n\n## Background\n\nbrightmart's project \u003ca href=\"https://github.com/brightmart/albert_zh\"\u003ealbert_zh\u003c/a\u003e trained a series of ALBERT models ranging from tiny to xlarge, which greatly promoted the adoption of ALBERT in Chinese NLP.\n\nHowever, the brightmart ALBERT was open-sourced before the Google ALBERT, so the early brightmart weights are not fully consistent with the Google ones; in other words, the two cannot be swapped for each other directly. Once the Google version was open-sourced, much work naturally adopted it as the standard, but simply discarding the previously trained weights would be a waste, and retraining everything from scratch would cost too much. This project therefore provides a conversion.\n\n## Notes\n\nNote that when we say the brightmart ALBERT is inconsistent with the Google version, we do not mean only a difference in variable naming: the model architectures themselves differ (the two handle the Embedding layer differently), so a verbatim conversion is not possible. However, if the low-rank factorization of the Embedding layer is given up, a converted version can be produced.\n\nTherefore, the models converted by this project have no low-rank factorization in the Embedding layer, but they keep the cross-layer parameter sharing of the transformer blocks.\n\n## Weights\n\nThe converted weights can be loaded directly with \u003ca href=\"https://github.com/bojone/bert4keras\"\u003ebert4keras\u003c/a\u003e, or with Google's official \u003ca href=\"https://github.com/google-research/ALBERT\"\u003eALBERT scripts\u003c/a\u003e.\n\n|                     Model                      |           Download            |\n|:----------------------------------------------:|:-----------------------------:|\n|       albert_tiny_google_zh_489k.zip           |\u003ca href=\"https://pan.baidu.com/s/1UsJRo4E8DRshwpF8rA3i9A\"\u003eBaidu Netdisk\u003c/a\u003e (extraction code: 4m4b)|\n| albert_base_google_zh_additional_36k_steps.zip |\u003ca href=\"https://pan.baidu.com/s/1QSglsiOy6cLOcSBbuHaAUQ\"\u003eBaidu Netdisk\u003c/a\u003e (extraction code: tc54)|\n|          albert_large_google_zh.zip            |\u003ca href=\"https://pan.baidu.com/s/1YOrNYjK4oilwPLI_5e-vCw\"\u003eBaidu Netdisk\u003c/a\u003e (extraction code: dq2h)|\n|        albert_xlarge_google_zh_183k.zip        |\u003ca href=\"https://pan.baidu.com/s/1PabxtKfRc74AfBSZvzlu4w\"\u003eBaidu Netdisk\u003c/a\u003e (extraction code: hhxz)|\n\n(Note: the zip files are named almost the same as the original brightmart releases, with only the word `google` added, so readers can use the file name to find the description of the original weights.)\n\n## Contact\n\n- QQ group: 67729435\n- For the WeChat group, add the bot's WeChat ID spaces_ac_cn\n- https://kexue.fm\n","funding_links":[],"categories":["Pretrained Language Model"],"sub_categories":["Repository"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbojone%2Falbert_zh","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbojone%2Falbert_zh","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbojone%2Falbert_zh/lists"}