{"id":13754413,"url":"https://github.com/hooman650/SupCL-Seq","last_synced_at":"2025-05-09T22:32:21.798Z","repository":{"id":37912324,"uuid":"404433160","full_name":"hooman650/SupCL-Seq","owner":"hooman650","description":"Supervised Contrastive Learning for Downstream Optimized Sequence Representations","archived":false,"fork":false,"pushed_at":"2021-11-09T18:38:19.000Z","size":460,"stargazers_count":27,"open_issues_count":1,"forks_count":9,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-04-22T20:50:55.157Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/hooman650.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2021-09-08T17:14:50.000Z","updated_at":"2024-11-19T06:35:32.000Z","dependencies_parsed_at":"2022-08-18T14:02:17.066Z","dependency_job_id":null,"html_url":"https://github.com/hooman650/SupCL-Seq","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hooman650%2FSupCL-Seq","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hooman650%2FSupCL-Seq/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hooman650%2FSupCL-Seq/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hooman650%2FSupCL-Seq/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/hooman650","download_url":"https://codeload.github.com/hooman650/SupCL-Seq/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com",
"kind":"github","repositories_count":253336054,"owners_count":21892781,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-03T09:01:58.747Z","updated_at":"2025-05-09T22:32:16.828Z","avatar_url":"https://github.com/hooman650.png","language":"Python","funding_links":[],"categories":["其他_NLP自然语言处理"],"sub_categories":["其他_文本生成、文本对话"],"readme":"[![PyPI license](https://img.shields.io/pypi/l/ansicolortags.svg)](https://github.com/hooman650/SupCL-Seq/blob/main/LICENSE) [![arXiv](https://img.shields.io/badge/arXiv-1234.56789-b31b1b.svg?style=plastic)](https://arxiv.org/abs/2109.07424)\n# SupCL-Seq :book:\n\n[Supervised Contrastive Learning for Downstream Optimized Sequence representations (**SupCS-Seq**)](https://arxiv.org/abs/2109.07424) accepted to be published in EMNLP 2021, extends the supervised contrastive learning from computer vision to the optimization of sequence representations in NLP. By altering the dropout mask probability in standard Transformer architectures (e.g. *BERT_base*), for every representation (anchor), we generate augmented altered views. A supervised contrastive loss is then utilized to maximize the system’s capability of pulling together similar samples (e.g. anchors and their altered views) and pushing apart the samples belonging to the other classes. 
Despite its simplicity, SupCL-Seq leads to large gains in many sequence classification tasks on the GLUE benchmark compared to a standard *BERT_base*, including a 6% absolute improvement on CoLA, 5.4% on MRPC, 4.7% on RTE and 2.6% on STS-B.\n\nThis package can be run on almost all of the transformer models in [`Huggingface`](https://huggingface.co/):hugs: that contain an encoder, including but not limited to:\n\n1. [ALBERT](https://huggingface.co/transformers/model_doc/albert.html)\n2. [BERT](https://huggingface.co/transformers/model_doc/bert.html)\n3. [BigBird](https://huggingface.co/transformers/model_doc/bigbird.html)\n4. [RoBERTa](https://huggingface.co/transformers/model_doc/roberta.html)\n5. [ERNIE](https://huggingface.co/nghuyong/ernie-2.0-large-en)\n6. And many more models!\n\n![SupCL-Seq](SupCLSeq.png)\n\n## Table of Contents  \n[GLUE Benchmark BERT SupCL-SEQ](#glue-benchmark-bert-supcl-seq)  \n\n[Installation](#installation) \n\n[Usage](#usage)\n\n[Run on GLUE](#run-on-glue)\n\n[How to Cite](#how-to-cite)\n\n[References](#references)\n\n## GLUE Benchmark BERT SupCL-SEQ\nThe table below reports the improvements over naive finetuning of the BERT model on the GLUE benchmark. We used the `[CLS]` token embedding during training and expect that `mean` pooling would further improve these results.\n\n![Glue](Glue.PNG)\n\n## Installation\n\n1. First, install TensorFlow 2.0, PyTorch, or both. Please refer to the [TensorFlow installation page](https://www.tensorflow.org/install/pip), the [PyTorch installation page](https://pytorch.org/) and/or the Flax installation page for the install command specific to your platform.\n\n2. Then install the package:\n\n```bash\n$ pip install SupCL-Seq\n```\n\n## Usage\nThe package builds on the [`trainer`](https://huggingface.co/transformers/main_classes/trainer.html) from [`Huggingface`](https://huggingface.co/):hugs:. Therefore, it is used in the same way as [`trainer`](https://huggingface.co/transformers/main_classes/trainer.html). 
The pipeline works as follows:\n\n1. First, employ supervised contrastive learning to contrastively optimize sentence embeddings using your annotated data.\n \n```python\nfrom SupCL_Seq import SupCsTrainer\n\nSupCL_trainer = SupCsTrainer.SupCsTrainer(\n            w_drop_out=[0.0,0.05,0.2],      # Number of views and their associated mask drop-out probabilities [Optional]\n            temperature= 0.05,              # Temperature for the contrastive loss function [Optional]\n            def_drop_out=0.1,               # Default drop out of the transformer, this is usually 0.1 [Optional]\n            pooling_strategy='mean',        # Strategy used to extract embeddings, either `mean` or `pooling` [Optional]\n            model = model,                  # model\n            args = CL_args,                 # Arguments from `TrainingArguments` [Optional]\n            train_dataset=train_dataset,    # Training dataset\n            tokenizer=tokenizer,            # Tokenizer\n            compute_metrics=compute_metrics # If you need a customized evaluation [Optional]\n        )\n\n```\n\n\n\n\n2. 
After contrastive training:\n\n    2.1 Add a linear classification layer to your model\n   \n    2.2 Freeze the base layer\n    \n    2.3 Finetune the linear layer on your annotated data\n\n\nFor a detailed implementation, see [`glue.ipynb`](./examples/glue.ipynb)\n\n## Run on GLUE\nTo evaluate the method on the GLUE benchmark, see [`glue.ipynb`](./examples/glue.ipynb)\n\n## How to Cite\n```bibtex\n@misc{sedghamiz2021supclseq,\n      title={SupCL-Seq: Supervised Contrastive Learning for Downstream Optimized Sequence Representations}, \n      author={Hooman Sedghamiz and Shivam Raval and Enrico Santus and Tuka Alhanai and Mohammad Ghassemi},\n      year={2021},\n      eprint={2109.07424},\n      archivePrefix={arXiv},\n      primaryClass={cs.CL}\n}\n```\n\n## References\n[1] [Supervised Contrastive Learning](https://arxiv.org/abs/2004.11362)\n\n[2] [SimCSE: Simple Contrastive Learning of Sentence Embeddings](https://arxiv.org/abs/2104.08821)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhooman650%2FSupCL-Seq","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fhooman650%2FSupCL-Seq","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhooman650%2FSupCL-Seq/lists"}