## MABEL: Attenuating Gender Bias using Textual Entailment Data

Authors: [Jacqueline He](https://jacqueline-he.github.io/), [Mengzhou Xia](https://xiamengzhou.github.io/), [Christiane Fellbaum](https://www.cs.princeton.edu/~fellbaum/), [Danqi Chen](https://www.cs.princeton.edu/~danqic/)

This repository contains the code for our EMNLP 2022 paper, ["MABEL: Attenuating Gender Bias using Textual Entailment Data"](https://arxiv.org/pdf/2210.14975.pdf).

**MABEL** (a **M**ethod for **A**ttenuating **B**ias using **E**ntailment **L**abels) is a task-agnostic intermediate pre-training technique that leverages entailment pairs from NLI data to produce representations that are both semantically capable and fair.
The approach achieves a good fairness-performance tradeoff across intrinsic and extrinsic gender bias diagnostics, with minimal damage to natural language understanding tasks.

![Training Schema](figure/teaser.png)

## Table of Contents
* [Quick Start](#quick-start)
* [Model List](#model-list)
* [Training](#training)
* [Evaluation](#evaluation)
  + [Intrinsic Metrics](#intrinsic-metrics)
  + [Extrinsic Metrics](#extrinsic-metrics)
  + [Language Understanding](#language-understanding)
* [Code Acknowledgements](#code-acknowledgements)
* [Citation](#citation)

## Quick Start

With the `transformers` package installed, you can load the off-the-shelf model as follows:

```python
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("princeton-nlp/mabel-bert-base-uncased")

model = AutoModelForMaskedLM.from_pretrained("princeton-nlp/mabel-bert-base-uncased")
```
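For instance, here is a minimal sketch of how you might extract sentence representations from this checkpoint (mean pooling over the final hidden states is an illustrative choice here, not necessarily the pooling used in the paper):

```python
import torch

# Encode a sentence and mean-pool the final hidden states into a single vector.
inputs = tokenizer("The nurse treated the patient.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs, output_hidden_states=True)

last_hidden = outputs.hidden_states[-1]           # (1, seq_len, hidden_dim)
mask = inputs["attention_mask"].unsqueeze(-1)     # (1, seq_len, 1)
embedding = (last_hidden * mask).sum(1) / mask.sum(1)
print(embedding.shape)  # torch.Size([1, 768]) for the bert-base model
```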

## Model List

| MABEL Models | ICAT ↑ |
|:-------------------------------|:------|
| [princeton-nlp/mabel-bert-base-uncased](https://huggingface.co/princeton-nlp/mabel-bert-base-uncased) | 73.98 |
| [princeton-nlp/mabel-bert-large-uncased](https://huggingface.co/princeton-nlp/mabel-bert-large-uncased) | 73.45 |
| [princeton-nlp/mabel-roberta-base](https://huggingface.co/princeton-nlp/mabel-roberta-base) | 69.68 |
| [princeton-nlp/mabel-roberta-large](https://huggingface.co/princeton-nlp/mabel-roberta-large) | 69.49 |

Note: The ICAT score is a bias metric that consolidates a model's capacity for language modeling and stereotypical association into a single numerical indicator. More information can be found in the [StereoSet](https://aclanthology.org/2021.acl-long.416.pdf) (Nadeem et al., 2021) paper.
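Concretely, the StereoSet paper defines ICAT from the language modeling score (LM) and stereotype score (SS) as `ICAT = LM * min(SS, 100 - SS) / 50`. A quick check against the table above:

```python
def icat(lm_score: float, ss_score: float) -> float:
    """Idealized CAT score (Nadeem et al., 2021)."""
    return lm_score * min(ss_score, 100.0 - ss_score) / 50.0

# Values for princeton-nlp/mabel-bert-base-uncased (see the StereoSet output below).
print(round(icat(84.5453, 56.2483), 2))  # 73.98
```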

## Training

Before training, make sure that the [counterfactually-augmented NLI data](https://drive.google.com/file/d/16KPp0rZv2DqAumccaRgvpLW4NwRCfdl6/view?usp=sharing), processed from SNLI and MNLI, is downloaded and stored under the `training` directory as `entailment_data.csv`.
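As an optional sanity check (a minimal sketch; it assumes `pandas` is available and makes no assumption about the column names), you can confirm the file loads from the expected location:

```python
import pandas as pd

# The training script expects the counterfactually-augmented NLI data here.
df = pd.read_csv("training/entailment_data.csv")
print(df.shape)          # (num_pairs, num_columns)
print(list(df.columns))  # inspect the available fields
```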

**1. Install package dependencies**

```bash
pip install -r requirements.txt
```

**2. Run training script**

```bash
cd training
chmod +x run.sh
./run.sh
```
You can adjust the hyper-parameters in `run.sh` as needed. Models are saved to `out/`. The optimal set of hyper-parameters varies with the choice of backbone encoder; full training details can be found in the paper.

## Evaluation

### Intrinsic Metrics

If you evaluate your own trained model instead of one of our provided HF checkpoints, you must first convert the checkpoint into a standard BertForMaskedLM (or RobertaForMaskedLM) model with `training.convert_to_hf` prior to intrinsic evaluation.
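For example, converting a BERT-based checkpoint (use `--base_model roberta` for a RoBERTa-based one; check `training/convert_to_hf.py` if your version spells the flag `--base-model`):

```bash
python -m training.convert_to_hf --path /path/to/your/checkpoint --base_model bert
```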

Also, please note that we use [Meade et al.'s](https://arxiv.org/abs/2110.08527) method of computation and datasets for both StereoSet and CrowS-Pairs; this is why the metrics for the pre-trained models are not directly comparable to those reported in the original benchmark papers.

**1. StereoSet ([Nadeem et al., 2021](https://aclanthology.org/2021.acl-long.416/))**

Command:

```bash
python -m benchmark.intrinsic.stereoset.predict --model_name_or_path princeton-nlp/mabel-bert-base-uncased &&
python -m benchmark.intrinsic.stereoset.eval
```

Output:
```
intrasentence
gender
Count: 2313.0
LM Score: 84.5453251710623
SS Score: 56.248299466465376
ICAT Score: 73.98003496789251
```

Collective Results:
| Models | LM ↑ | SS ◇ | ICAT ↑ |
|:-------------------------------|:------|:------|:------|
| bert-base-uncased | 84.17 | 60.28 | 66.86 |
| princeton-nlp/mabel-bert-base-uncased | 84.54 | 56.25 | 73.98 |
| bert-large-uncased | 86.54 | 63.24 | 63.62 |
| princeton-nlp/mabel-bert-large-uncased | 84.93 | 56.76 | 73.45 |
| roberta-base | 88.93 | 66.32 | 59.90 |
| princeton-nlp/mabel-roberta-base | 87.44 | 60.14 | 69.68 |
| roberta-large | 88.81 | 66.82 | 58.92 |
| princeton-nlp/mabel-roberta-large | 89.72 | 61.28 | 69.49 |

◇: The closer to 50, the better.

**2. CrowS-Pairs ([Nangia et al., 2020](https://aclanthology.org/2020.emnlp-main.154/))**

Command:

```bash
python -m benchmark.intrinsic.crows.eval --model_name_or_path princeton-nlp/mabel-bert-base-uncased
```

Output:
```
====================================================================================================
Total examples: 262
Metric score: 50.76
Stereotype score: 51.57
Anti-stereotype score: 49.51
Num. neutral: 0.0
====================================================================================================
```

Collective Results:
| Models | Metric Score ◇ |
|:-------------------------------|:------|
| bert-base-uncased | 57.25 |
| princeton-nlp/mabel-bert-base-uncased |50.76 |
| bert-large-uncased | 55.73 |
| princeton-nlp/mabel-bert-large-uncased | 51.15 |
| roberta-base | 60.15 |
| princeton-nlp/mabel-roberta-base | 49.04 |
| roberta-large | 60.15 |
| princeton-nlp/mabel-roberta-large | 54.41 |

◇: The closer to 50, the better.

### Extrinsic Metrics

**1. Occupation Classification**

See `benchmark/extrinsic/occ_cls/README.md` for full training instructions and results.

**2. Natural Language Inference**

See `benchmark/extrinsic/nli/README.md` for full training instructions and results.

**3. Coreference Resolution**

See `benchmark/extrinsic/coref/README.md` for full training instructions and results.

### Language Understanding
**1. GLUE ([Wang et al., 2018](https://aclanthology.org/W18-5446/))**

We fine-tune on GLUE through the [transformers](https://github.com/huggingface/transformers/tree/main/examples/pytorch/text-classification) library, following the default hyper-parameters.

A straightforward way is to clone and install the latest transformers repository from source:

```bash
git clone https://github.com/huggingface/transformers
cd transformers
pip install .
```

Then set up the environment dependencies:

```bash
cd ./examples/pytorch/text-classification
pip install -r requirements.txt
```

Here is a sample script for one of the GLUE tasks, MRPC:

```bash
# task options: cola, sst2, mrpc, stsb, qqp, mnli, qnli, rte
export TASK_NAME=mrpc
export OUTPUT_DIR=out/

CUDA_VISIBLE_DEVICES=0 python run_glue.py \
  --model_name_or_path princeton-nlp/mabel-bert-base-uncased \
  --task_name $TASK_NAME \
  --do_train \
  --do_eval \
  --max_seq_length 128 \
  --per_device_train_batch_size 32 \
  --learning_rate 2e-5 \
  --num_train_epochs 3 \
  --output_dir $OUTPUT_DIR
```
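To run all of the tasks listed above with the same settings, one illustrative option is a simple loop that reuses the flags from the sample script (the per-task hyper-parameters reported in the paper may differ):

```bash
for TASK_NAME in cola sst2 mrpc stsb qqp mnli qnli rte; do
  CUDA_VISIBLE_DEVICES=0 python run_glue.py \
    --model_name_or_path princeton-nlp/mabel-bert-base-uncased \
    --task_name $TASK_NAME \
    --do_train \
    --do_eval \
    --max_seq_length 128 \
    --per_device_train_batch_size 32 \
    --learning_rate 2e-5 \
    --num_train_epochs 3 \
    --output_dir out/$TASK_NAME
done
```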

**2. SentEval Transfer Tasks ([Conneau and Kiela, 2018](https://arxiv.org/abs/1803.05449))**

Preprocess:

Make sure you have cloned the [SentEval](https://github.com/facebookresearch/SentEval) repo and copied its contents into this repository's `transfer` folder. Then run `./get_transfer_data.bash` from `data/downstream` to download the evaluation data.
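One way to set this up from the repository root (a sketch, assuming the SentEval directory layout is kept intact under `transfer/`):

```bash
git clone https://github.com/facebookresearch/SentEval.git
cp -r SentEval/* transfer/
cd transfer/data/downstream
./get_transfer_data.bash
```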

Command:

```bash
python -m benchmark.transfer.eval --model_name_or_path princeton-nlp/mabel-bert-base-uncased --task_set transfer
```

Output:

```
+-------+-------+-------+-------+-------+-------+-------+-------+
|  MR   |  CR   | SUBJ  | MPQA  | SST2  | TREC  | MRPC  | Avg.  |
+-------+-------+-------+-------+-------+-------+-------+-------+
| 78.33 | 85.83 | 93.78 | 89.13 | 85.50 | 85.20 | 68.87 | 83.81 |
+-------+-------+-------+-------+-------+-------+-------+-------+
```

Collective Results:
| Models | Transfer Avg. ↑ |
|:-------------------------------|:------|
| bert-base-uncased | 83.73 |
| princeton-nlp/mabel-bert-base-uncased |83.81 |
| bert-large-uncased | 86.54 |
| princeton-nlp/mabel-bert-large-uncased | 86.09 |

## Code Acknowledgements
- Evaluation code for StereoSet and CrowS-Pairs is adapted from ["An Empirical Survey of the Effectiveness of Debiasing Techniques for Pre-trained Language Models"](https://aclanthology.org/2022.acl-long.132/) (Meade et al., 2022).
- Model implementation code is adapted from [SimCSE](https://github.com/princeton-nlp/SimCSE) (Gao et al., 2021).
- Evaluation code for the transfer tasks relies on the SentEval package [here](https://github.com/facebookresearch/SentEval), and adapts from a script prepared by [SimCSE](https://github.com/princeton-nlp/SimCSE/blob/main/evaluation.py) (Gao et al., 2021).
- Evaluation code for GLUE relies on the Huggingface implementation of the [transformers](https://arxiv.org/abs/1910.03771) (Wolf et al., 2019) package.
- Training and evaluation for e2e span-based coreference resolution follow the PyTorch implementation accompanying [Xu and Choi (2020)](https://aclanthology.org/2020.emnlp-main.686/).
- Repository is formatted with [![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black).

## Citation
```bibtex
@inproceedings{he2022mabel,
  title={{MABEL}: Attenuating Gender Bias using Textual Entailment Data},
  author={He, Jacqueline and Xia, Mengzhou and Fellbaum, Christiane and Chen, Danqi},
  booktitle={Empirical Methods in Natural Language Processing (EMNLP)},
  year={2022}
}
```