Official implementation of Wan et al's paper "Everyone's Voice Matters: Quantifying Annotation Disagreement Using Demographic Information" (AAAI 2023)
https://github.com/minnesotanlp/Quantifying-Annotation-Disagreement


# Everyone's Voice Matters: Quantifying Annotation Disagreement Using Demographic Information
This repository provides the datasets and code for preprocessing, training, and testing models that quantify annotation disagreement. It is the official Hugging Face implementation of the following paper:

> [Everyone's Voice Matters: Quantifying Annotation Disagreement Using Demographic Information](https://arxiv.org/abs/2301.05036)

> [Ruyuan Wan](https://ruyuanwan.github.io/), [Jaehyung Kim](https://sites.google.com/view/jaehyungkim), [Dongyeop Kang](https://dykang.github.io/)

> [AAAI 2023](https://aaai.org/Conferences/AAAI-23/)

Our code is mainly based on Hugging Face's `transformers` library.

## Installation
The following command installs all necessary packages:
```
pip install -r requirements.txt
```
The project was tested using Python 3.7.

## HuggingFace Integration
We uploaded both our datasets and model checkpoints to our Hugging Face [repo](https://huggingface.co/RuyuanWan). You can load the data directly with `datasets` and load the models with `simpletransformers`, which builds on `transformers`.
```python
# Load one of our datasets
from datasets import load_dataset
dataset = load_dataset("RuyuanWan/SBIC_Disagreement")
# Replace "SBIC_Disagreement" with "SChem_Disagreement", "Dilemmas_Disagreement",
# "Dynasent_Disagreement" or "Politeness_Disagreement" to load a different dataset

# Load one of our models
from simpletransformers.classification import ClassificationModel, ClassificationArgs
model_args = ClassificationArgs()
model_args.regression = True  # predict a continuous disagreement score
# Add use_cuda=False to the call below if no GPU is available
SBIC_person_demo_col_regression = ClassificationModel(
    "roberta",
    "RuyuanWan/SBIC_RoBERTa_Demographic-text_Disagreement_Predictor",
    num_labels=1,
    args=model_args,
)
# Replace the model name above with any of our other pretrained models

# Predict
# Replace the example text with any text of your own
text_example1 = ['Abortion should be legal']
predict1, raw_outputs1 = SBIC_person_demo_col_regression.predict(text_example1)
print(predict1)
```

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1V-NC0DJ5q-7ePyuXhIgVumtRcRSl8-SD?usp=sharing)

We also provide a simple [demo notebook](https://colab.research.google.com/drive/1V-NC0DJ5q-7ePyuXhIgVumtRcRSl8-SD?usp=sharing) that shows how to use the datasets and models to predict disagreement.

## Datasets
We used public datasets for subjective tasks whose original raw releases include the annotators' voting records:

- [Social Bias Corpus (Sap et al. 2020)](https://maartensap.com/social-bias-frames/index.html)
- [Social Chemistry 101 (Forbes et al. 2020)](https://github.com/mbforbes/social-chemistry-101)
- [Scruples-dilemmas (Lourie, Bras, and Choi 2021)](https://github.com/allenai/scruples)
- [Dyna-Sentiment (Potts et al. 2021)](https://github.com/cgpotts/dynasent)
- [Wikipedia Politeness (Danescu-Niculescu-Mizil et al. 2013)](https://convokit.cornell.edu/documentation/wiki_politeness.html)

You can load our processed disagreement datasets with Hugging Face's `datasets`, or download them from [dataset/](https://github.com/minnesotanlp/Quantifying-Annotation-Disagreement/tree/main/dataset).

Here are the five datasets with disagreement labels. Pass any of the following names to `load_dataset` to select a dataset:


| Dataset name in Hugging Face | Dataset information |
| --- | --- |
| "RuyuanWan/SBIC_Disagreement" | SBIC dataset with disagreement labels |
| "RuyuanWan/SChem_Disagreement" | SChem dataset with disagreement labels |
| "RuyuanWan/Dilemmas_Disagreement" | Dilemmas dataset with disagreement labels |
| "RuyuanWan/Dynasent_Disagreement" | Dynasent dataset with disagreement labels |
| "RuyuanWan/Politeness_Disagreement" | Politeness dataset with disagreement labels |
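
If you switch between datasets often, a small lookup helper can save retyping the repository names. The following is only a sketch: the short keys (`"SBIC"`, `"SChem"`, and so on) and the function names `dataset_repo_id` and `load_disagreement_dataset` are hypothetical conveniences, not part of this repository. Only the repository names themselves come from the list above.

```python
# Hypothetical short keys mapped to the Hugging Face dataset names listed above.
DISAGREEMENT_DATASETS = {
    "SBIC": "RuyuanWan/SBIC_Disagreement",
    "SChem": "RuyuanWan/SChem_Disagreement",
    "Dilemmas": "RuyuanWan/Dilemmas_Disagreement",
    "Dynasent": "RuyuanWan/Dynasent_Disagreement",
    "Politeness": "RuyuanWan/Politeness_Disagreement",
}


def dataset_repo_id(short_name: str) -> str:
    """Return the Hugging Face dataset name for a short key."""
    try:
        return DISAGREEMENT_DATASETS[short_name]
    except KeyError:
        known = ", ".join(sorted(DISAGREEMENT_DATASETS))
        raise ValueError(
            f"Unknown dataset {short_name!r}; expected one of: {known}"
        ) from None


def load_disagreement_dataset(short_name: str):
    """Download and return the dataset (requires network access)."""
    # Imported lazily so the lookup helper works without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(dataset_repo_id(short_name))
```

For example, `load_disagreement_dataset("Politeness")` would be equivalent to calling `load_dataset("RuyuanWan/Politeness_Disagreement")` directly.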

## Models
In our disagreement prediction experiments, we compared:
- Binary vs. continuous disagreement labels,
- Text-only input vs. text with annotators' demographic information,
- Text with group-level annotator demographic information vs. text with individual-level annotator demographic information.
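
To make the first comparison concrete, the sketch below derives both label types from one item's annotator votes. The definitions are assumptions made for illustration and may differ from the ones used in the paper: the continuous label is taken to be the share of annotators who depart from the majority vote, and the binary label is 1 whenever that share exceeds a threshold (0 by default). The function names are hypothetical.

```python
from collections import Counter
from typing import Hashable, Sequence


def continuous_disagreement(votes: Sequence[Hashable]) -> float:
    """Share of annotators whose vote differs from the majority label.

    Assumed definition for illustration; not necessarily the paper's.
    """
    if not votes:
        raise ValueError("votes must not be empty")
    # Size of the largest group of identical votes.
    majority_count = Counter(votes).most_common(1)[0][1]
    return 1.0 - majority_count / len(votes)


def binary_disagreement(votes: Sequence[Hashable], threshold: float = 0.0) -> int:
    """Return 1 if the continuous disagreement exceeds `threshold`, else 0."""
    return int(continuous_disagreement(votes) > threshold)


if __name__ == "__main__":
    votes = ["offensive", "offensive", "offensive", "not offensive"]
    print(continuous_disagreement(votes))  # one of four annotators dissents
    print(binary_disagreement(votes))
```

Under these assumptions, a regression model would be trained on the continuous value, while a binary classifier would be trained on the thresholded value.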

![plot](https://github.com/minnesotanlp/Quantifying-Annotation-Disagreement/blob/main/code/Quantifying_Disagreement.png)

Here are the models that we host on Hugging Face.


| Model name in Hugging Face | Model information |
| --- | --- |
| "RuyuanWan/SBIC_RoBERTa_Text_Disagreement_Binary_Classifie" | Binary disagreement classifier trained on SBIC text |
| "RuyuanWan/SBIC_RoBERTa_Text_Disagreement_Predictor" | Disagreement predictor trained on SBIC text (regression) |
| "RuyuanWan/SBIC_RoBERTa_Demographic-text_Disagreement_Predictor" | Disagreement predictor trained on SBIC text and individual annotators' demographic information in colon-templated format |
| "RuyuanWan/SChem_RoBERTa_Text_Disagreement_Binary_Classifier" | Binary disagreement classifier trained on SChem text |
| "RuyuanWan/SChem_RoBERTa_Text_Disagreement_Predictor" | Disagreement predictor trained on SChem text (regression) |
| "RuyuanWan/SChem_RoBERTa_Demographic-text_Disagreement_Predictor" | Disagreement predictor trained on SChem text and individual annotators' demographic information in colon-templated format |
| "RuyuanWan/Dilemmas_RoBERTa_Text_Disagreement_Binary_Classifier" | Binary disagreement classifier trained on Dilemmas text |
| "RuyuanWan/Dilemmas_RoBERTa_Text_Disagreement_Predictor" | Disagreement predictor trained on Dilemmas text (regression) |
| "RuyuanWan/Dynasent_RoBERTa_Text_Disagreement_Binary_Classifier" | Binary disagreement classifier trained on Dynasent text |
| "RuyuanWan/Dynasent_RoBERTa_Text_Disagreement_Predictor" | Disagreement predictor trained on Dynasent text (regression) |
| "RuyuanWan/Politeness_RoBERTa_Text_Disagreement_Binary_Classifier" | Binary disagreement classifier trained on Politeness text |
| "RuyuanWan/Politeness_RoBERTa_Text_Disagreement_Predictor" | Disagreement predictor trained on Politeness text (regression) |
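
The Demographic-text models take text combined with an annotator's demographic information in a colon-templated format. The exact template is not spelled out in this README, so the sketch below is only a guess at what such an input might look like: the field names (`age`, `gender`), the `key: value` layout, the comma delimiter, and the `". "` separator before the text are all assumptions, and the function names are hypothetical. Check the preprocessing code in this repository for the format the models were actually trained on.

```python
from typing import Mapping


def format_demographics(demographics: Mapping[str, str]) -> str:
    """Render demographic fields as 'key: value' pairs (assumed template)."""
    return ", ".join(f"{key}: {value}" for key, value in demographics.items())


def build_model_input(text: str, demographics: Mapping[str, str], sep: str = ". ") -> str:
    """Prepend the templated demographics to the text.

    The separator and field order are illustrative assumptions.
    """
    prefix = format_demographics(demographics)
    return f"{prefix}{sep}{text}" if prefix else text


if __name__ == "__main__":
    # Hypothetical annotator profile; field names are not taken from the datasets.
    annotator = {"age": "25-34", "gender": "woman"}
    print(build_model_input("Abortion should be legal", annotator))
```

A string built this way could then be passed in the list given to the model's `predict` method, provided it matches the training format.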

## Citation
If you find this work useful for your research, please cite our paper:

```
@article{wan2023everyone,
title={Everyone's Voice Matters: Quantifying Annotation Disagreement Using Demographic Information},
author={Wan, Ruyuan and Kim, Jaehyung and Kang, Dongyeop},
journal={arXiv preprint arXiv:2301.05036},
year={2023}
}
```