{"id":19119440,"url":"https://github.com/cluebenchmark/distilbert","last_synced_at":"2026-03-01T11:03:26.289Z","repository":{"id":110258561,"uuid":"224346536","full_name":"CLUEbenchmark/DistilBert","owner":"CLUEbenchmark","description":"DistilBERT for Chinese 海量中文预训练蒸馏bert模型","archived":false,"fork":false,"pushed_at":"2019-12-05T01:38:27.000Z","size":16,"stargazers_count":91,"open_issues_count":4,"forks_count":6,"subscribers_count":13,"default_branch":"master","last_synced_at":"2025-02-22T12:29:53.848Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/CLUEbenchmark.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2019-11-27T04:47:55.000Z","updated_at":"2025-02-12T06:13:27.000Z","dependencies_parsed_at":"2023-03-10T15:45:16.194Z","dependency_job_id":null,"html_url":"https://github.com/CLUEbenchmark/DistilBert","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/CLUEbenchmark/DistilBert","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CLUEbenchmark%2FDistilBert","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CLUEbenchmark%2FDistilBert/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CLUEbenchmark%2FDistilBert/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CLUEbenchmark%2FDistilBert/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/CLUEbenchmark","download_url":"https://codeload.github.com/CLUEbenchmark/DistilBert/tar.gz/ref
s/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CLUEbenchmark%2FDistilBert/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29967932,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-01T10:55:55.490Z","status":"ssl_error","status_checked_at":"2026-03-01T10:55:55.175Z","response_time":124,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-09T05:09:41.332Z","updated_at":"2026-03-01T11:03:26.273Z","avatar_url":"https://github.com/CLUEbenchmark.png","language":null,"funding_links":[],"categories":[],"sub_categories":[],"readme":"1. DistilBert for Chinese: a distilled BERT model pretrained on a large-scale Chinese corpus\n\nTarget release date: December 16.\n\nPlanned contents:\n\n1.1 A downloadable, pretrained Chinese DistilBert model that can be used directly or trained further on your own corpus;\n\n1.2 Fine-tuning examples and code for downstream tasks, covering three ChineseGLUE (CLUE) tasks;\n\n1.3 A small-model benchmark: performance comparison with albert_tiny and ernie_tiny.\n\n\n2. Introduction to DistilBERT\n\n2.1 Three approaches to slimming down BERT\n\nDistillation: transfer the knowledge of the BERT model into a smaller model, then use the smaller model;\nQuantization: represent a high-precision model at lower precision so that the model is smaller;\nPruning: discard the parts of the model that contribute little so that the model is smaller.\n\n2.2 The original principle of distillation\n\nDistillation is generally credited to Hinton's paper Distilling the Knowledge in a Neural Network and was later widely adopted. The method proposed in the paper is simple: train the student model so that its predicted distribution fits the predicted distribution of the teacher model (which can be an ensemble).\n\n2.3 
A good current implementation\n\nThe implementation that most faithfully applies the classic method above to BERT is DistilBERT, released by HuggingFace some time ago, which distills the 12-layer BERT-base into a 6-layer BERT model. Beyond that method it also uses a few other techniques, such as initializing the student model from the teacher model's parameters; see HuggingFace's blog post and paper for details.\n\n3. Other\n\nContact chineseGLUE@163.com to join us.\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcluebenchmark%2Fdistilbert","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcluebenchmark%2Fdistilbert","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcluebenchmark%2Fdistilbert/lists"}