{"id":13754374,"url":"https://github.com/lonePatient/BERT-NER-Pytorch","last_synced_at":"2025-05-09T22:32:11.116Z","repository":{"id":39340200,"uuid":"170256148","full_name":"lonePatient/BERT-NER-Pytorch","owner":"lonePatient","description":"Chinese NER(Named Entity Recognition) using BERT(Softmax, CRF, Span)","archived":false,"fork":false,"pushed_at":"2023-03-11T03:14:55.000Z","size":498,"stargazers_count":2170,"open_issues_count":69,"forks_count":430,"subscribers_count":13,"default_branch":"master","last_synced_at":"2025-04-07T22:06:51.908Z","etag":null,"topics":["adversarial-training","albert","bert","chinese","crf","focal-loss","labelsmoothing","ner","nlp","pytorch","softmax","span"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/lonePatient.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2019-02-12T05:12:07.000Z","updated_at":"2025-04-07T09:38:53.000Z","dependencies_parsed_at":"2022-08-09T14:48:16.572Z","dependency_job_id":"ae3b858c-ae05-4abd-a0d4-001adff7dc62","html_url":"https://github.com/lonePatient/BERT-NER-Pytorch","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lonePatient%2FBERT-NER-Pytorch","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lonePatient%2FBERT-NER-Pytorch/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lonePatient%2FBERT-NER-Pytorch/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lonePatient%2FBERT-NER-
Pytorch/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/lonePatient","download_url":"https://codeload.github.com/lonePatient/BERT-NER-Pytorch/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253335969,"owners_count":21892763,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["adversarial-training","albert","bert","chinese","crf","focal-loss","labelsmoothing","ner","nlp","pytorch","softmax","span"],"created_at":"2024-08-03T09:01:57.291Z","updated_at":"2025-05-09T22:32:06.107Z","avatar_url":"https://github.com/lonePatient.png","language":"Python","readme":"## Chinese NER using BERT\n\nBERT for Chinese NER. \n\n**update**: Other reference implementations, including Biaffine and GlobalPointer: [examples](https://github.com/lonePatient/TorchBlocks/tree/master/examples)\n\n### dataset list\n\n1. cner: datasets/cner\n2. CLUENER: https://github.com/CLUEbenchmark/CLUENER\n\n### model list\n\n1. BERT+Softmax\n2. BERT+CRF\n3. BERT+Span\n\n### requirement\n\n1. 1.1.0 \u003c= PyTorch \u003c 1.5.0\n2. cuda=9.0\n3. python3.6+\n\n### input format\n\nThe BIOS tag scheme is preferred. Each line holds one character and its label, separated by a tab; sentences are separated by a blank line.\n\n```text\n美\tB-LOC\n国\tI-LOC\n的\tO\n华\tB-PER\n莱\tI-PER\n士\tI-PER\n\n我\tO\n跟\tO\n他\tO\n```\n\n### run the code\n\n1. Modify the configuration information in `run_ner_xxx.py` or `run_ner_xxx.sh`.\n2. 
`sh scripts/run_ner_xxx.sh`\n\n**note**: file structure of the model\n\n```text\n├── prev_trained_model\n|  └── bert_base\n|  |  └── pytorch_model.bin\n|  |  └── config.json\n|  |  └── vocab.txt\n|  |  └── ......\n```\n\n### CLUENER result\n\nThe overall performance of BERT on **dev**:\n\n|              | Accuracy (entity)  | Recall (entity)    | F1 score (entity)  |\n| ------------ | ------------------ | ------------------ | ------------------ |\n| BERT+Softmax | 0.7897     | 0.8031     | 0.7963    |\n| BERT+CRF     | 0.7977 | 0.8177 | 0.8076 |\n| BERT+Span    | 0.8132 | 0.8092 | 0.8112 |\n| BERT+Span+adv    | 0.8267 | 0.8073 | **0.8169** |\n| BERT-small(6 layers)+Span+kd    | 0.8241 | 0.7839 | 0.8051 |\n| BERT+Span+focal_loss    | 0.8121 | 0.8008 | 0.8064 |\n| BERT+Span+label_smoothing   | 0.8235 | 0.7946 | 0.8088 |\n\n### ALBERT for CLUENER\n\nThe overall performance of ALBERT on **dev**:\n\n| model  | version       | Accuracy(entity) | Recall(entity) | F1(entity) | Train time/epoch |\n| ------ | ------------- | ---------------- | -------------- | ---------- | ---------------- |\n| albert | base_google   | 0.8014           | 0.6908         | 0.7420     | 0.75x            |\n| albert | large_google  | 0.8024           | 0.7520         | 0.7763     | 2.1x             |\n| albert | xlarge_google | 0.8286           | 0.7773         | 0.8021     | 6.7x             |\n| bert   | google        | 0.8118           | 0.8031         | **0.8074**     | -----            |\n| albert | base_bright   | 0.8068           | 0.7529         | 0.7789     | 0.75x            |\n| albert | large_bright  | 0.8152           | 0.7480         | 0.7802     | 2.2x             |\n| albert | xlarge_bright | 0.8222           | 0.7692         | 0.7948     | 7.3x             |\n\n### Cner result\n\nThe overall performance of BERT on **dev(test)**:\n\n|              | Accuracy (entity)  | Recall (entity)    | F1 score (entity)  |\n| ------------ | ------------------ | ------------------ | 
------------------ |\n| BERT+Softmax | 0.9586(0.9566)     | 0.9644(0.9613)     | 0.9615(0.9590)     |\n| BERT+CRF     | 0.9562(0.9539)     | 0.9671(**0.9644**) | 0.9616(0.9591)     |\n| BERT+Span    | 0.9604(**0.9620**) | 0.9617(0.9632)     | 0.9611(**0.9626**) |\n| BERT+Span+focal_loss    | 0.9516(0.9569) | 0.9644(0.9681)     | 0.9580(0.9625) |\n| BERT+Span+label_smoothing   | 0.9566(0.9568) | 0.9624(0.9656)     | 0.9595(0.9612) |\n","funding_links":[],"categories":["实体识别NER、意图识别、槽位填充"],"sub_categories":["其他_文本生成、文本对话"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FlonePatient%2FBERT-NER-Pytorch","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FlonePatient%2FBERT-NER-Pytorch","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FlonePatient%2FBERT-NER-Pytorch/lists"}