{"id":20068347,"url":"https://github.com/pliang279/sent_debias","last_synced_at":"2026-03-03T21:02:13.558Z","repository":{"id":37367068,"uuid":"236606964","full_name":"pliang279/sent_debias","owner":"pliang279","description":"[ACL 2020] Towards Debiasing Sentence Representations","archived":false,"fork":false,"pushed_at":"2022-11-21T05:02:08.000Z","size":50279,"stargazers_count":66,"open_issues_count":3,"forks_count":19,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-06-06T09:18:24.825Z","etag":null,"topics":["fairness-ai","machine-learning","natural-language-processing","representation-learning"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/pliang279.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2020-01-27T22:11:58.000Z","updated_at":"2025-06-01T01:15:22.000Z","dependencies_parsed_at":"2023-01-21T04:21:04.730Z","dependency_job_id":null,"html_url":"https://github.com/pliang279/sent_debias","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/pliang279/sent_debias","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pliang279%2Fsent_debias","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pliang279%2Fsent_debias/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pliang279%2Fsent_debias/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pliang279%2Fsent_debias/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/pliang279","download_url":"http
s://codeload.github.com/pliang279/sent_debias/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pliang279%2Fsent_debias/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":30060632,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-03T18:21:05.932Z","status":"ssl_error","status_checked_at":"2026-03-03T18:20:59.341Z","response_time":61,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["fairness-ai","machine-learning","natural-language-processing","representation-learning"],"created_at":"2024-11-13T14:06:15.448Z","updated_at":"2026-03-03T21:02:13.542Z","avatar_url":"https://github.com/pliang279.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Towards Debiasing Sentence Representations\n\n\u003e Pytorch implementation for debiasing sentence representations.\n\nThis implementation contains code for removing bias from BERT representations and evaluating bias level in BERT representations.\n\nCorrespondence to: \n  - Paul Liang (pliang@cs.cmu.edu)\n  - Irene Li (mengzeli@cs.cmu.edu)\n\n## Paper\n\n[**Towards Debiasing Sentence Representations**](https://www.aclweb.org/anthology/2020.acl-main.488/)\u003cbr\u003e\n[Paul Pu Liang](http://www.cs.cmu.edu/~pliang/), [Irene 
Li](https://www.linkedin.com/in/mengze-irene-li-114592130/), [Emily Zheng](https://www.linkedin.com/in/emily-zheng-348190128/), [Yao Chong Lim](https://scholar.google.com/citations?user=R-upoxQAAAAJ\u0026hl=en), [Ruslan Salakhutdinov](https://www.cs.cmu.edu/~rsalakhu/), and [Louis-Philippe Morency](https://www.cs.cmu.edu/~morency/)\u003cbr\u003e\nACL 2020\n\nIf you find this repository useful, please cite our paper:\n```\n@inproceedings{liang-etal-2020-towards,\n    title = \"Towards Debiasing Sentence Representations\",\n    author = \"Liang, Paul Pu  and\n      Li, Irene Mengze  and\n      Zheng, Emily  and\n      Lim, Yao Chong  and\n      Salakhutdinov, Ruslan  and\n      Morency, Louis-Philippe\",\n    booktitle = \"Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics\",\n    month = jul,\n    year = \"2020\",\n    address = \"Online\",\n    publisher = \"Association for Computational Linguistics\",\n    url = \"https://www.aclweb.org/anthology/2020.acl-main.488\",\n    doi = \"10.18653/v1/2020.acl-main.488\",\n    pages = \"5502--5515\",\n}\n```\n\n## Installation\n\nFirst check that the requirements are satisfied:\u003cbr\u003e\nPython 3.6\u003cbr\u003e\ntorch 1.2.0\u003cbr\u003e\nhuggingface transformers\u003cbr\u003e\nnumpy 1.18.1\u003cbr\u003e\nsklearn 0.20.0\u003cbr\u003e\nmatplotlib 3.1.2\u003cbr\u003e\ngensim 3.8.0\u003cbr\u003e\ntqdm 4.45.0\u003cbr\u003e\nregex 2.5.77\u003cbr\u003e\npattern3\u003cbr\u003e\n\nThe next step is to clone the repository:\n```bash\ngit clone https://github.com/pliang279/sent_debias.git\n```\n\nTo install the BERT models, go to `debias-BERT/` and run `pip install .`\n\n## Data\nDownload the [GLUE data](https://gluebenchmark.com/tasks) by running this [script](https://gist.github.com/W4ngatang/60c2bdb54d156a41194446737ce03e2e):\n```bash\npython download_glue_data.py --data_dir glue_data --tasks SST,QNLI,CoLA\n```\nUnpack it to some directory `$GLUE_DIR`.\n\n## Precomputed models 
and embeddings (optional)\n1. Models\n    * Download https://drive.google.com/file/d/1cAN49-HDHFdNP1GJZn83s2-mEGCeZWjh/view?usp=sharing to `debias-BERT/experiments`.\n    * Extract the archive:\n      ```\n      tar -xvf acl2020-results.tar.gz\n      ```\n\n2. Embeddings\n    * Download https://drive.google.com/file/d/1ubKn8SCjwnp9pYjQa9SmKWxFojX9a6Bz/view?usp=sharing to `debias-BERT/experiments`.\n    * Extract the archive:\n      ```\n      tar -xvf saved_embs.tar.gz\n      ```\n\n## Usage\n\nIf you choose to use precomputed models and embeddings, skip to step B. Otherwise, follow steps A and B in order.\n\n### A. Fine-tune BERT\n\n1. Go to `debias-BERT/experiments`.\n2. Run `export TASK_NAME=SST-2` (the task can be one of SST-2, CoLA, or QNLI).\n3. Fine-tune BERT on `$TASK_NAME`.\n    * With debiasing\n      ```\n      python run_classifier.py \\\n      --data_dir $GLUE_DIR/$TASK_NAME/ \\\n      --task_name $TASK_NAME \\\n      --output_dir path/to/results_directory \\\n      --do_train \\\n      --do_eval \\\n      --do_lower_case \\\n      --debias \\\n      --normalize \\\n      --tune_bert\n      ```\n    * Without debiasing\n      ```\n      python run_classifier.py \\\n      --data_dir $GLUE_DIR/$TASK_NAME/ \\\n      --task_name $TASK_NAME \\\n      --output_dir path/to/results_directory \\\n      --do_train \\\n      --do_eval \\\n      --do_lower_case \\\n      --normalize \\\n      --tune_bert\n      ```\n    The fine-tuned model and dev set evaluation results will be stored under the specified `output_dir`.\n\n### B. Evaluate bias in BERT representations\n\n1. Go to `debias-BERT/experiments`.\n2. Run `export TASK_NAME=SST-2` (the task can be one of SST-2, CoLA, or QNLI).\n3. 
Evaluate fine-tuned BERT on bias level.\n    * Evaluate debiased fine-tuned BERT.\n      ```\n        python eval_bias.py \\\n        --debias \\\n        --model_path path/to/model \\\n        --model $TASK_NAME \\\n        --results_dir path/to/results_directory \\\n        --output_name debiased\n      ```\n      If using precomputed models, set `model_path` to `acl2020-results/$TASK_NAME/debiased`.\n    * Evaluate biased fine-tuned BERT.\n      ```\n        python eval_bias.py \\\n        --model_path path/to/model \\\n        --model $TASK_NAME \\\n        --results_dir path/to/results_directory \\\n        --output_name biased\n      ```\n      If using precomputed models, set `model_path` to `acl2020-results/$TASK_NAME/biased`.\n\n    The evaluation results will be stored in the file `results_dir/output_name`. \n\n    Note: The argument `model_path` should be specified as the `output_dir` corresponding to the fine-tuned model you want to evaluate. Specifically, `model_path` should be a directory containing the following files: `config.json`, `pytorch_model.bin` and `vocab.txt`. \n4. Evaluate pretrained BERT on bias level.\n    * Evaluate debiased pretrained BERT.\n      ```\n      python eval_bias.py \\\n      --debias \\\n      --model pretrained \\\n      --results_dir path/to/results_directory \\\n      --output_name debiased \n      ```\n    * Evaluate biased pretrained BERT.\n      ```\n      python eval_bias.py \\\n      --model pretrained \\\n      --results_dir path/to/results_directory \\\n      --output_name biased \n      ```\n    Again, the bias evaluation results will be stored in the file `results_dir/output_name`.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpliang279%2Fsent_debias","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpliang279%2Fsent_debias","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpliang279%2Fsent_debias/lists"}