{"id":13595140,"url":"https://github.com/bohanli/BERT-flow","last_synced_at":"2025-04-09T10:33:02.474Z","repository":{"id":41428507,"uuid":"301422716","full_name":"bohanli/BERT-flow","owner":"bohanli","description":"TensorFlow implementation of On the Sentence Embeddings from Pre-trained Language Models (EMNLP 2020)","archived":false,"fork":false,"pushed_at":"2021-05-19T17:45:52.000Z","size":281,"stargazers_count":529,"open_issues_count":12,"forks_count":68,"subscribers_count":4,"default_branch":"main","last_synced_at":"2024-11-06T17:45:59.355Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/bohanli.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2020-10-05T13:44:32.000Z","updated_at":"2024-10-10T18:40:50.000Z","dependencies_parsed_at":"2022-09-21T08:37:28.092Z","dependency_job_id":null,"html_url":"https://github.com/bohanli/BERT-flow","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bohanli%2FBERT-flow","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bohanli%2FBERT-flow/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bohanli%2FBERT-flow/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bohanli%2FBERT-flow/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/bohanli","download_url":"https://codeload.github.com/bohanli/BERT-flow/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248020593,"owners_count":21034459,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-01T16:01:44.669Z","updated_at":"2025-04-09T10:32:57.465Z","avatar_url":"https://github.com/bohanli.png","language":"Python","funding_links":[],"categories":["Python","文本匹配 文本检索 文本相似度","🧑‍💻 Repos \u003csmall\u003e(18)\u003c/small\u003e"],"sub_categories":["其他_文本生成、文本对话","\u003cimg src=\"assets/tensorflow.svg\" alt=\"TensorFlow\" height=\"20px\"\u003e \u0026nbsp;TensorFlow Repos"],"readme":"# On the Sentence Embeddings from Pre-trained Language Models\n\n\u003cp align=\"center\"\u003e\n\u003cimg src=\"img/bert-flow.png\" width=\"450\"\u003e\n\u003c/p\u003e\n\nThis is a TensorFlow implementation of the following [paper](https://arxiv.org/abs/2011.05864):\n\n```\nOn the Sentence Embeddings from Pre-trained Language Models\nBohan Li, Hao Zhou, Junxian He, Mingxuan Wang, Yiming Yang, Lei Li\nEMNLP 2020\n```\n\n\n\nModel                                        | Spearman's rho \n-------------------------------------------- | :-------------: \nBERT-large-NLI                               | 77.80    \nBERT-large-NLI-last2avg                      | 78.45   \nBERT-large-NLI-flow (target, train only)     | 80.54 \nBERT-large-NLI-flow (target, train+dev+test) | 81.18    \n  \n\nPlease contact bohanl1@cs.cmu.edu if you have any questions.\n\n\n## Requirements\n\n* Python \u003e= 3.6\n* TensorFlow \u003e= 1.14\n\n## Preparation\n\n### Pretrained BERT models\n```bash\nexport BERT_PREMODELS=\"../bert_premodels\"\nmkdir ${BERT_PREMODELS}; cd ${BERT_PREMODELS}\n\n# then download the pre-trained BERT models from https://github.com/google-research/bert\ncurl -O https://storage.googleapis.com/bert_models/2018_10_18/uncased_L-12_H-768_A-12.zip\ncurl -O https://storage.googleapis.com/bert_models/2018_10_18/uncased_L-24_H-1024_A-16.zip\n\nls ${BERT_PREMODELS}/uncased_L-12_H-768_A-12 # base\nls ${BERT_PREMODELS}/uncased_L-24_H-1024_A-16 # large\n```\n\n### GLUE\n```bash\nexport GLUE_DIR=\"../glue_data\"\npython download_glue_data.py --data_dir=${GLUE_DIR}\n\n# then download the labeled test set of STS-B\ncd ../glue_data/STS-B\ncurl -O https://raw.githubusercontent.com/kawine/usif/master/STSBenchmark/sts-test.csv\n```\n\n### SentEval\n```bash\ncd ..\ngit clone https://github.com/facebookresearch/SentEval\n```\n\n## Usage\n\n### Fine-tune BERT with NLI supervision (optional)\n```bash\nexport OUTPUT_PARENT_DIR=\"../exp\"\nexport CACHED_DIR=${OUTPUT_PARENT_DIR}/cached_data\nmkdir ${CACHED_DIR}\n\nexport RANDOM_SEED=1234\nexport CUDA_VISIBLE_DEVICES=0\nexport BERT_NAME=\"large\"\nexport TASK_NAME=\"ALLNLI\"\nunset INIT_CKPT\nbash scripts/train_siamese.sh train \\\n\"--exp_name=exp_${BERT_NAME}_${RANDOM_SEED} \\\n--num_train_epochs=1.0 \\\n--learning_rate=2e-5 \\\n--train_batch_size=16 \\\n--cached_dir=${CACHED_DIR}\"\n\n\n# evaluation\nexport RANDOM_SEED=1234\nexport CUDA_VISIBLE_DEVICES=0\nexport TASK_NAME=STS-B\nexport BERT_NAME=large\nexport OUTPUT_PARENT_DIR=\"../exp\"\nexport INIT_CKPT=${OUTPUT_PARENT_DIR}/exp_${BERT_NAME}_${RANDOM_SEED}/model.ckpt-60108\nexport CACHED_DIR=${OUTPUT_PARENT_DIR}/cached_data\nexport EXP_NAME=exp_${BERT_NAME}_${RANDOM_SEED}_eval\nbash scripts/train_siamese.sh predict \\\n\"--exp_name=${EXP_NAME} \\\n --cached_dir=${CACHED_DIR} \\\n --sentence_embedding_type=avg \\\n --flow=0 --flow_loss=0 \\\n --num_examples=0 \\\n --num_train_epochs=1e-10\"\n```\n\nNote: You may want to add `--use_xla` to speed up the BERT fine-tuning.\n\n### Unsupervised learning of flow-based generative models\n```bash\nexport CUDA_VISIBLE_DEVICES=0\nexport TASK_NAME=STS-B\nexport BERT_NAME=large\nexport OUTPUT_PARENT_DIR=\"../exp\"\nexport INIT_CKPT=${OUTPUT_PARENT_DIR}/exp_large_1234/model.ckpt-60108\nexport CACHED_DIR=${OUTPUT_PARENT_DIR}/cached_data\nbash scripts/train_siamese.sh train \\\n\"--exp_name_prefix=exp \\\n --cached_dir=${CACHED_DIR} \\\n --sentence_embedding_type=avg-last-2 \\\n --flow=1 --flow_loss=1 \\\n --num_examples=0 \\\n --num_train_epochs=1.0 \\\n --flow_learning_rate=1e-3 \\\n --use_full_for_training=1\"\n\n# evaluation\nexport CUDA_VISIBLE_DEVICES=0\nexport TASK_NAME=STS-B\nexport BERT_NAME=large\nexport OUTPUT_PARENT_DIR=\"../exp\"\nexport INIT_CKPT=${OUTPUT_PARENT_DIR}/exp_large_1234/model.ckpt-60108\nexport CACHED_DIR=${OUTPUT_PARENT_DIR}/cached_data\nexport EXP_NAME=exp_t_STS-B_ep_1.00_lr_5.00e-05_e_avg-last-2_f_11_1.00e-03_allsplits\nbash scripts/train_siamese.sh predict \\\n\"--exp_name=${EXP_NAME} \\\n --cached_dir=${CACHED_DIR} \\\n --sentence_embedding_type=avg-last-2 \\\n --flow=1 --flow_loss=1 \\\n --num_examples=0 \\\n --num_train_epochs=1.0 \\\n --flow_learning_rate=1e-3 \\\n --use_full_for_training=1\"\n```\n\n### Fit flow with only the training set of STS-B\n```bash\nexport CUDA_VISIBLE_DEVICES=0\nexport TASK_NAME=STS-B\nexport BERT_NAME=large\nexport OUTPUT_PARENT_DIR=\"../exp\"\nexport INIT_CKPT=${OUTPUT_PARENT_DIR}/exp_large_1234/model.ckpt-60108\nexport CACHED_DIR=${OUTPUT_PARENT_DIR}/cached_data\nbash scripts/train_siamese.sh train \\\n\"--exp_name_prefix=exp \\\n --cached_dir=${CACHED_DIR} \\\n --sentence_embedding_type=avg-last-2 \\\n --flow=1 --flow_loss=1 \\\n --num_examples=0 \\\n --num_train_epochs=1.0 \\\n --flow_learning_rate=1e-3 \\\n --use_full_for_training=0\"\n\n# evaluation\nexport CUDA_VISIBLE_DEVICES=0\nexport TASK_NAME=STS-B\nexport BERT_NAME=large\nexport OUTPUT_PARENT_DIR=\"../exp\"\nexport INIT_CKPT=${OUTPUT_PARENT_DIR}/exp_large_1234/model.ckpt-60108\nexport CACHED_DIR=${OUTPUT_PARENT_DIR}/cached_data\nexport EXP_NAME=exp_t_STS-B_ep_1.00_lr_5.00e-05_e_avg-last-2_f_11_1.00e-03\nbash scripts/train_siamese.sh predict \\\n\"--exp_name=${EXP_NAME} \\\n --cached_dir=${CACHED_DIR} \\\n --sentence_embedding_type=avg-last-2 \\\n --flow=1 --flow_loss=1 \\\n --num_examples=0 \\\n --num_train_epochs=1.0 \\\n --flow_learning_rate=1e-3 \\\n --use_full_for_training=1\"\n```\n\n## Download our models\nOur models are available at https://drive.google.com/file/d/1-vO47t5SPFfzZPKkkhSe4tXhn8u--KLR/view?usp=sharing\n\n## Reference\n\n```\n@inproceedings{li2020emnlp,\n    title = {On the Sentence Embeddings from Pre-trained Language Models},\n    author = {Bohan Li and Hao Zhou and Junxian He and Mingxuan Wang and Yiming Yang and Lei Li},\n    booktitle = {Conference on Empirical Methods in Natural Language Processing (EMNLP)},\n    month = {November},\n    year = {2020}\n}\n\n```\n\n## Acknowledgements\n\nA large portion of this repo is borrowed from the following projects:\n- https://github.com/google-research/bert\n- https://github.com/zihangdai/xlnet\n- https://github.com/tensorflow/tensor2tensor\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbohanli%2FBERT-flow","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbohanli%2FBERT-flow","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbohanli%2FBERT-flow/lists"}