{"id":22218839,"url":"https://github.com/cooelf/sembert","last_synced_at":"2025-04-09T13:07:55.473Z","repository":{"id":68595004,"uuid":"221818672","full_name":"cooelf/SemBERT","owner":"cooelf","description":"Semantics-aware BERT for Language Understanding (AAAI 2020)","archived":false,"fork":false,"pushed_at":"2022-12-21T01:27:45.000Z","size":476,"stargazers_count":287,"open_issues_count":4,"forks_count":55,"subscribers_count":11,"default_branch":"master","last_synced_at":"2025-04-02T11:07:23.969Z","etag":null,"topics":["aaai2020","bert","bert-model","glue","nlu","sembert","srl"],"latest_commit_sha":null,"homepage":"https://arxiv.org/abs/1909.02209","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/cooelf.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2019-11-15T01:28:07.000Z","updated_at":"2025-01-17T09:40:26.000Z","dependencies_parsed_at":"2023-03-01T21:15:47.540Z","dependency_job_id":null,"html_url":"https://github.com/cooelf/SemBERT","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cooelf%2FSemBERT","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cooelf%2FSemBERT/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cooelf%2FSemBERT/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cooelf%2FSemBERT/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/cooelf","download_url":"https://codeload.github.com/cooelf/SemBERT/tar.gz/refs/heads/master","host":{"name
":"GitHub","url":"https://github.com","kind":"github","repositories_count":248045232,"owners_count":21038553,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["aaai2020","bert","bert-model","glue","nlu","sembert","srl"],"created_at":"2024-12-02T22:29:24.913Z","updated_at":"2025-04-09T13:07:55.441Z","avatar_url":"https://github.com/cooelf.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# SemBERT: Semantics-aware BERT for Language Understanding\n\n**(2020/10/07) Update: Tips for possible issues**\n\n1) *SRL prediction mismatches the provided samples*\n\nThe POS tags are slightly different using different spaCy versions.  SemBERT used spacy==2.0.18 to obtain the verbs.\n\nRefer to [allenai/allennlp#3418](https://github.com/allenai/allennlp/issues/3418),  [cooelf/SemBERT#12](https://github.com/cooelf/SemBERT/issues/12) (CHN).\n\n2) *SRL is not a registered name for Model.*\n\nPlease try pip install --pre allennlp-models\n\n3) Issues about AllenNLP\n\nIf you encounter issues about the class or variables in AllenNLP, please try to use a lower version, e.g., 0.8.1. 
\n\nOur experiment environment for reference:\n\nPython 3.6+ PyTorch (1.0.0) AllenNLP (0.8.1)\n\n=========================================\n\nCode for the paper **[Semantics-aware BERT for Language Understanding](https://www.researchgate.net/publication/339301633_Semantics-aware_BERT_for_Language_Understanding)** in AAAI 2020\n\n### **Overview**\n\n![](SemBERT.png)\n\n## Requirements\n\n(Our experiment environment for reference)\n\nPython 3.6+\nPyTorch (1.0.0)\nAllenNLP (0.8.1)\n\n## Datasets\nGLUE data can be downloaded from [GLUE data](https://gluebenchmark.com/tasks) by running [this script](https://gist.github.com/W4ngatang/60c2bdb54d156a41194446737ce03e2e) and unpacking it into the directory \u003cu\u003eglue_data\u003c/u\u003e.\nWe provide an example data sample in \u003cu\u003eglue_data/MNLI\u003c/u\u003e to show how SemBERT works.\n\n## Instructions\nThis repo provides an example implementation of SemBERT for NLU tasks.\nWe used the pre-trained uncased BERT models, so do not forget to pass the parameter `--do_lower_case`.\n\nThe example scripts are as follows:\n\n**Train a model**\n\nNote: please replace the sample data with labeled data (use our labeled data or annotate your data following the instructions below).\n\n```shell\nCUDA_VISIBLE_DEVICES=0 \\\npython run_classifier.py \\\n--data_dir glue_data/SNLI/ \\\n--task_name snli \\\n--train_batch_size 32 \\\n--max_seq_length 128 \\\n--bert_model bert-wwm-uncased \\\n--learning_rate 2e-5 \\\n--num_train_epochs 2 \\\n--do_train \\\n--do_eval \\\n--do_lower_case \\\n--max_num_aspect 3 \\\n--output_dir glue/snli_model_dir\n```\n\n**Evaluation**\n\nBoth `run_classifier.py` and `run_snli_predict.py` can be used for evaluation; the latter is simplified for ease of use.\n\nThe major difference is that `run_classifier.py` takes labeled data as input, while `run_snli_predict.py` integrates real-time semantic role labeling, so it takes the original raw data.\n\n**Evaluation using labeled 
data**\n\n```shell\nCUDA_VISIBLE_DEVICES=0 \\\npython run_classifier.py \\\n--data_dir glue_data/SNLI/ \\\n--task_name snli \\\n--eval_batch_size 128 \\\n--max_seq_length 128 \\\n--bert_model bert-wwm-uncased \\\n--do_eval \\\n--do_lower_case \\\n--max_num_aspect 3 \\\n--output_dir glue/snli_model_dir\n```\n\n**Evaluation using raw data (with real-time semantic role labeling)** \n\nOur trained SNLI model (reaching 91.9% test accuracy) can be accessed here:\n\nhttps://drive.google.com/drive/folders/1Yn-WCw1RaMxbDDNZRnoJCIGxMSAOu20_?usp=sharing\n\nTo use our trained SNLI model, please place the [SNLI model](https://drive.google.com/open?id=1Yn-WCw1RaMxbDDNZRnoJCIGxMSAOu20_) and the [SRL model](https://s3-us-west-2.amazonaws.com/allennlp/models/srl-model-2018.05.25.tar.gz) in **snli_model_dir** and **srl_model_dir**, respectively.\n\nAs shown in our example SNLI model, the **snli_model_dir** folder should contain three files:\n\n*vocab.txt* and *bert_config.json*, taken from the BERT model folder used to train your model;\n\n*pytorch_model.bin*, the trained SNLI model.\n\n```shell\nCUDA_VISIBLE_DEVICES=0 \\\npython run_snli_predict.py \\\n--data_dir /share03/zhangzs/glue_data/SNLI \\\n--task_name snli \\\n--eval_batch_size 128 \\\n--max_seq_length 128 \\\n--max_num_aspect 3 \\\n--do_eval \\\n--do_lower_case \\\n--bert_model snli_model_dir \\\n--output_dir snli_model_dir \\\n--tagger_path srl_model_dir\n```\n\nFor prediction, pass the `--do_predict` flag to either `run_classifier.py` or `run_snli_predict.py`. The output prediction file can be used directly for GLUE online submission and evaluation.\n\n### Data annotation (Semantic role labeling)\n\nWe provide two semantic labeling methods:\n\n* **online**: each word sequence is passed to the labeling module to obtain the tags, which can be used for online prediction. This can be time-consuming for a large corpus. 
See *tag_model/tagging.py*.\n\n  To use the online method, please specify the `--tagger_path` parameter in the run.py file.\n\n* **offline**: the current approach, which pre-processes the datasets and saves them for later loading during training and evaluation. See *tag_model/tagger_offline.py*.\n\n  Our labeled data can be downloaded here for a quick start.\n\n  Google Drive: [https://drive.google.com/file/d/1B-_IRWRvR67eLdvT6bM0b2OiyvySkO-x/view?usp=sharing](https://drive.google.com/file/d/1B-_IRWRvR67eLdvT6bM0b2OiyvySkO-x/view?usp=sharing)\n\n  Baidu Cloud:  \n\n  Link \u003chttps://pan.baidu.com/s/1EduMJAfEXet_9yCfVob9qA\u003e\n  Password：sl7l\n\n  \n\nNote that this repo is based on the offline version, so the column indices in the data processor differ slightly from those for the original data. In this repo they are:\n\ntext_a = line[-3]\ntext_b = line[-2]\nlabel = line[-1]\n\nIf you use the original data \u003cu\u003einstead of\u003c/u\u003e the data preprocessed by tag_model/tagger_offline.py, please modify the indices according to the dataset structure.\n\n### SRL model\n\nThis implementation uses the [ELMo-based SRL model](https://s3-us-west-2.amazonaws.com/allennlp/models/srl-model-2018.05.25.tar.gz) from [AllenNLP](https://github.com/allenai/allennlp). \n\nA newer [BERT-based model](https://s3-us-west-2.amazonaws.com/allennlp/models/bert-base-srl-2019.06.17.tar.gz) is also available and is a nice alternative. 
\n\n### Reference\n\nPlease kindly cite this paper in your publications if it helps your research:\n\n```\n@inproceedings{zhang2020SemBERT,\n\ttitle={Semantics-aware {BERT} for language understanding},\n\tauthor={Zhang, Zhuosheng and Wu, Yuwei and Zhao, Hai and Li, Zuchao and Zhang, Shuailiang and Zhou, Xi and Zhou, Xiang},\n  \tbooktitle={the Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-2020)},\n\tyear={2020}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcooelf%2Fsembert","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcooelf%2Fsembert","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcooelf%2Fsembert/lists"}