{"id":13595159,"url":"https://github.com/LeeSureman/Flat-Lattice-Transformer","last_synced_at":"2025-04-09T10:33:01.941Z","repository":{"id":37624942,"uuid":"259826252","full_name":"LeeSureman/Flat-Lattice-Transformer","owner":"LeeSureman","description":"code for ACL 2020 paper: FLAT: Chinese NER Using Flat-Lattice Transformer","archived":false,"fork":false,"pushed_at":"2022-05-10T05:18:21.000Z","size":104,"stargazers_count":1001,"open_issues_count":92,"forks_count":175,"subscribers_count":13,"default_branch":"master","last_synced_at":"2024-11-06T17:46:09.616Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/LeeSureman.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2020-04-29T04:44:42.000Z","updated_at":"2024-11-04T12:23:38.000Z","dependencies_parsed_at":"2022-07-12T02:17:06.868Z","dependency_job_id":null,"html_url":"https://github.com/LeeSureman/Flat-Lattice-Transformer","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LeeSureman%2FFlat-Lattice-Transformer","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LeeSureman%2FFlat-Lattice-Transformer/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LeeSureman%2FFlat-Lattice-Transformer/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LeeSureman%2FFlat-Lattice-Transformer/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/LeeSureman","download_url":"https://codelo
ad.github.com/LeeSureman/Flat-Lattice-Transformer/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248020593,"owners_count":21034459,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-01T16:01:45.052Z","updated_at":"2025-04-09T10:32:56.933Z","avatar_url":"https://github.com/LeeSureman.png","language":"Python","funding_links":[],"categories":["Python","实体识别NER、意图识别、槽位填充"],"sub_categories":["其他_文本生成、文本对话"],"readme":"[English](#Requirement)\n[中文](#运行环境)\n\n# Flat-Lattice-Transformer\ncode for ACL 2020 paper: FLAT: Chinese NER Using Flat-Lattice Transformer. \n\nModels and results can be found at our ACL 2020 paper [FLAT: Chinese NER Using Flat-Lattice Transformer](https://arxiv.org/pdf/2004.11795.pdf).\n\n\n\n\n# Requirement:\n\n```\nPython: 3.7.3\nPyTorch: 1.2.0\nFastNLP: 0.5.0\nNumpy: 1.16.4\n```\nyou can go [here](https://fastnlp.readthedocs.io/zh/latest/) to know more about FastNLP.\n\n\n\nHow to run the code?\n====\n1. 
Download the character, bigram, and word embeddings.\n\n      Character and bigram embeddings (gigaword_chn.all.a2b.{'uni' or 'bi'}.ite50.vec): [Google Drive](https://drive.google.com/file/d/1_Zlf0OAZKVdydk7loUpkzD2KPEotUE8u/view?usp=sharing) or [Baidu Pan](https://pan.baidu.com/s/1pLO6T9D)\n\n      Word (lattice) embeddings:\n      \n      yj (ctb.50d.vec): [Google Drive](https://drive.google.com/file/d/1K_lG3FlXTgOOf8aQ4brR9g3R40qi1Chv/view?usp=sharing) or [Baidu Pan](https://pan.baidu.com/s/1pLO6T9D)\n      \n      ls (sgns.merge.word.bz2): [Baidu Pan](https://pan.baidu.com/s/1luy-GlTdqqvJ3j-A4FcIOw)\n\n2. Modify `paths.py` to point to the pretrained embeddings and the datasets.\n3. Run the following commands:\n```\n# add --clip_msra if you need to train FLAT on the MSRA NER dataset\npython preprocess.py\n# V0: without BERT; use V1 for the version with BERT\ncd V0\n# dataset_name: ontonotes, msra, weibo or resume\npython flat_main.py --dataset \u003cdataset_name\u003e\n```\n\nTo record experiment results, you can use fitlog:\n```\npip install fitlog\nfitlog init V0\ncd V0\nfitlog log logs\n```\nThen set `use_fitlog = True` in `flat_main.py`.\n\nSee the [Fitlog documentation](https://fitlog.readthedocs.io/zh/latest/) for more details.\n\n\nCite: \n========\n[bibtex](https://www.aclweb.org/anthology/2020.acl-main.611.bib)\n\n---\n\n\n\n\n\n\n# 运行环境:\n\n```\nPython: 3.7.3\nPyTorch: 1.2.0\nFastNLP: 0.5.0\nNumpy: 1.16.4\n```\nSee the [FastNLP documentation](https://fastnlp.readthedocs.io/zh/latest/) for more details.\n\n\n\nHow to run the code?\n====\n1. 
Download the pretrained embeddings.\n\n      Character and bigram embeddings (gigaword_chn.all.a2b.{'uni' or 'bi'}.ite50.vec): [Google Drive](https://drive.google.com/file/d/1_Zlf0OAZKVdydk7loUpkzD2KPEotUE8u/view?usp=sharing) or [Baidu Pan](https://pan.baidu.com/s/1pLO6T9D)\n\n      Word embeddings (ctb.50d.vec) (yj): [Google Drive](https://drive.google.com/file/d/1K_lG3FlXTgOOf8aQ4brR9g3R40qi1Chv/view?usp=sharing) or [Baidu Pan](https://pan.baidu.com/s/1pLO6T9D)\n      \n      Word embeddings (sgns.merge.bigram.bz2) (ls): [Baidu Pan](https://pan.baidu.com/s/1luy-GlTdqqvJ3j-A4FcIOw)\n\n2. Modify `paths.py` to point to the pretrained embeddings and your datasets.\n3. Run the following commands:\n```\n# add --clip_msra if you need to train FLAT on the MSRA NER dataset\npython preprocess.py\n# V0: without BERT; use V1 for the version with BERT\ncd V0\n# dataset_name: ontonotes, msra, weibo or resume\npython flat_main.py --dataset \u003cdataset_name\u003e\n```\n\nTo conveniently record and inspect experiment results, you can use fitlog:\n```\npip install fitlog\nfitlog init V0\ncd V0\nfitlog log logs\n```\nThen set `use_fitlog` to `True` in `flat_main.py`.\n\nSee the [Fitlog documentation](https://fitlog.readthedocs.io/zh/latest/) for more details.\n\n\nCite: \n========\n[bibtex](https://www.aclweb.org/anthology/2020.acl-main.611.bib)\n\n\nUpdate notes:\n========\nTwo versions were committed on May 7 (5.7). V2 uses `tensor.unique()` to remove duplicate combinations among the relative positions (denoted Flat_unique). V3 replaces the relative position encoding in FLAT with scalars (denoted Flat_scalar). For details, see [FLAT瘦身日记 (FLAT Slimming Diary, in Chinese)](https://zhuanlan.zhihu.com/p/509248057).\n\nThe GPU memory usage of these two methods is shown in the table below (batch_size=10).\n\n|seq_len|50|100|150|200|250|300|\n|:-------|----:|-----:|-----:|-----:|-----:|-----:|\n|Flat|1096MB|1668MB|2734MB|4118MB|5938MB|8374MB|\n|Flat_unique|964MB|1204MB|1610MB|2166MB|2922MB|3940MB|\n|Flat_scalar|878MB|916MB|1028MB|1062MB|1148MB|1322MB|\n|Bert+Flat|1605MB|2237MB|3333MB|4725MB|6571MB|9039MB|\n|Bert+Flat_unique|1495MB|1685MB|2129MB|2697MB|3453MB|4585MB|\n|Bert+Flat_scalar|1409MB|1481MB|1565MB|1617MB|1705MB|2051MB|\n\n\n\n\n\n\n 
","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FLeeSureman%2FFlat-Lattice-Transformer","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FLeeSureman%2FFlat-Lattice-Transformer","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FLeeSureman%2FFlat-Lattice-Transformer/lists"}