# Methods for Detoxification of Texts for the Russian Language (ruDetoxifier)

This repository contains models and an evaluation methodology for the detoxification of Russian texts. [The original paper](https://arxiv.org/abs/2105.09052), "Methods for Detoxification of Texts for the Russian Language", was presented at the [Dialogue-2021](http://www.dialog-21.ru/) conference.

## Inference Example

In this repository, we release the two best models, **detoxGPT** and **condBERT** (see [Methodology](https://github.com/skoltech-nlp/rudetoxifier#methodology) for more details). You can try a detoxification inference example in this [notebook](https://github.com/skoltech-nlp/rudetoxifier/blob/main/notebooks/rudetoxifier_inference.ipynb) or [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1lSXh8PHGeKTLtuhxYCwHL74qG-V-pkLK?usp=sharing).

## Interactive Demo
You can also test our models via the [web demo](https://detoxifier-nlp-zh.skoltech.ru/) or pour out your anger on our [Telegram bot](https://t.me/rudetoxifierbot).

***
## Methodology

In our research, we tested several approaches:

### Baselines
- Duplicate: simple duplication of the input;
- Delete: removal of rude and toxic words listed in a pre-defined [vocab](https://github.com/skoltech-nlp/rudetoxifier/blob/main/data/train/MAT_FINAL_with_unigram_inflections.txt) (a sketch follows this list);
- Retrieve: retrieval based on cosine similarity between word embeddings from the non-toxic part of the [RuToxic](https://github.com/skoltech-nlp/rudetoxifier/blob/main/data/train/ru_toxic_dataset.csv) dataset.
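As a rough illustration, here is a minimal sketch of the Delete baseline, assuming the vocabulary file lists one word form per line; the tokenization is a simplification of whatever the repository actually does:

```python
import re

# Load the pre-defined vocabulary of rude/toxic word forms
# (assumed format: one word form per line).
with open("data/train/MAT_FINAL_with_unigram_inflections.txt", encoding="utf-8") as f:
    toxic_vocab = {line.strip().lower() for line in f if line.strip()}

def delete_baseline(sentence: str) -> str:
    """Delete baseline: drop every token found in the toxic vocabulary."""
    tokens = re.findall(r"\w+|[^\w\s]", sentence)
    return " ".join(tok for tok in tokens if tok.lower() not in toxic_vocab)
```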

### detoxGPT
Based on [ruGPT](https://github.com/sberbank-ai/ru-gpts) models. This method requires a [parallel dataset](https://github.com/skoltech-nlp/rudetoxifier/blob/main/data/train/dataset_200.xls) for training. We tested the ***ruGPT-small***, ***ruGPT-medium***, and ***ruGPT-large*** models in several setups:
- ***zero-shot***: the model is taken as is (with no fine-tuning). The input is the toxic sentence we would like to detoxify, prepended with the prefix “Перефразируй” (rus. paraphrase) and followed by the suffix “>>>” to indicate the paraphrasing task (a prompt sketch follows this list);
- ***few-shot***: the model is taken as is. Unlike the previous scenario, we prepend a prefix consisting of a parallel dataset of toxic and neutral sentence pairs;
- ***fine-tuned***: the model is fine-tuned for the paraphrasing task on a parallel dataset.
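Below is a minimal sketch of the zero-shot prompt format using a ruGPT checkpoint from the Hugging Face Hub; the checkpoint name, generation parameters, and exact prompt punctuation are assumptions rather than the paper's exact setup:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed checkpoint; the paper evaluates ruGPT-small/medium/large.
model_name = "sberbank-ai/rugpt3small_based_on_gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

toxic_sentence = "пример токсичного предложения"  # the sentence to detoxify
# Zero-shot prompt: the prefix "Перефразируй" and the suffix ">>>" mark the paraphrasing task.
prompt = f"Перефразируй: {toxic_sentence} >>> "

inputs = tokenizer(prompt, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=40, do_sample=False)
generated = tokenizer.decode(output_ids[0], skip_special_tokens=True)
# The detoxified candidate is whatever the model produces after ">>>".
detoxified = generated.split(">>>")[-1].strip()
print(detoxified)
```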

### condBERT
Based on the [BERT](https://arxiv.org/abs/1810.04805) model. This method *does not* require a parallel dataset for training. One of the tasks on which the original BERT was pretrained -- predicting a word that was replaced with a \[MASK\] token -- suits the delete-retrieve-generate style transfer method. We tested the [RuBERT](https://huggingface.co/DeepPavlov/rubert-base-cased-conversational) and [Geotrend](https://huggingface.co/Geotrend/bert-base-ru-cased) pre-trained models in several setups (a masking sketch follows the list):
- ***zero-shot***: BERT is taken as is (with no extra fine-tuning);
- ***fine-tuned***: BERT is fine-tuned on a dataset of toxic and safe sentences to acquire a style-dependent distribution, as described above.
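The masking idea can be sketched with the Hugging Face fill-mask pipeline and the RuBERT checkpoint above; selecting which token to mask is reduced here to a vocabulary lookup, which is a simplification of the actual condBERT substitution logic:

```python
from transformers import pipeline

# Assumed: the conversational RuBERT checkpoint linked above.
fill_mask = pipeline("fill-mask", model="DeepPavlov/rubert-base-cased-conversational")

def mask_and_replace(sentence: str, toxic_vocab: set) -> str:
    """Replace each toxic token with BERT's top-scoring non-identical prediction."""
    tokens = sentence.split()
    for i, tok in enumerate(tokens):
        if tok.lower() in toxic_vocab:
            masked = " ".join(tokens[:i] + [fill_mask.tokenizer.mask_token] + tokens[i + 1:])
            # Pick the highest-scoring candidate that differs from the original token.
            for cand in fill_mask(masked):
                if cand["token_str"].strip().lower() != tok.lower():
                    tokens[i] = cand["token_str"].strip()
                    break
    return " ".join(tokens)
```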

***

## Automatic Evaluation
The evaluation consists of three types of metrics:
- **style transfer accuracy (STA)**: accuracy based on a toxic/non-toxic classifier (the resulting text is expected to be non-toxic);
- **content preservation**:
  - word overlap (WO);
  - BLEU: score based on n-grams (1-4);
  - cosine similarity (CS) between vectors of the texts' embeddings;
- **language quality**: perplexity (PPL) based on a language model.

Finally, the **aggregation metric (GM)** is the geometric mean of STA, CS, and PPL (sketched below).
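For reference, here is a minimal sketch of the content-preservation measures and the aggregation; the exact formulas, in particular how PPL enters the geometric mean, are assumptions, and `ru_metric.py` is the authoritative implementation:

```python
import numpy as np

def word_overlap(source: str, output: str) -> float:
    """Word overlap (WO): Jaccard-style overlap of unique words (exact formula assumed)."""
    src, out = set(source.lower().split()), set(output.lower().split())
    return len(src & out) / max(len(src | out), 1)

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity (CS) between two sentence-embedding vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def aggregate_gm(sta: float, cs: float, ppl: float) -> float:
    """Aggregation (GM): geometric mean of STA, CS and inverse perplexity.
    Using 1/PPL so that lower perplexity raises the score is an assumption."""
    return float(np.cbrt(max(sta, 0.0) * max(cs, 0.0) * max(1.0 / ppl, 0.0)))
```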

### Launching

You can run the [`ru_metric.py`](https://github.com/skoltech-nlp/rudetoxifier/blob/main/metrics/ru_metric.py) script for evaluation. The fine-tuned weights for the toxicity classifier can be found [here](https://drive.google.com/file/d/1WqNOyFegzUWoY7tCMtmtMt9Ct0lhRwHy/view?usp=sharing).

***

## Results

|Method |STA↑ |CS↑ |WO↑ |BLEU↑ |PPL↓ |GM↑ |
|---|:---:|:---:|:---:|:---:|:---:|:---:|
|**Baselines** | | | | | | |
|Duplicate |0.00 |1.00 |1.00 |1.00 |146.00 |0.05 ± 0.0012 |
|Delete |0.27 |0.96 |0.85 |0.81 |263.55 |0.10 ± 0.0007 |
|Retrieve |0.91 |0.85 |0.07 |0.09 |65.74 |0.22 ± 0.0010 |
|**detoxGPT-small** | | | | | | |
|zero-shot |0.93 |0.20 |0.00 |0.00 |159.11 |0.10 ± 0.0005 |
|few-shot |0.17 |0.70 |0.05 |0.06 |83.38 |0.11 ± 0.0009 |
|fine-tuned |0.51 |0.70 |0.05 |0.05 |39.48 |0.20 ± 0.0011 |
|**detoxGPT-medium** | | | | | | |
|fine-tuned |0.49 |0.77 |0.18 |0.21 |86.75 |0.16 ± 0.0009 |
|**detoxGPT-large** | | | | | | |
|fine-tuned |0.61 |0.77 |0.22 |0.21 |**36.92** |**0.23 ± 0.0010** |
|**condBERT** | | | | | | |
|DeepPavlov zero-shot |0.53 |0.80 |0.42 |0.61 |668.58 |0.08 ± 0.0006 |
|DeepPavlov fine-tuned |0.52 |0.86 |0.51 |0.53 |246.68 |0.12 ± 0.0007 |
|Geotrend zero-shot |0.62 |0.85 |0.54 |**0.64** |237.46 |0.13 ± 0.0009 |
|Geotrend fine-tuned |**0.66** |**0.86** |**0.54** |0.64 |209.95 |0.14 ± 0.0009 |

***

## Data

The `data` folder contains all training datasets used, the test data, and a naive example of a style transfer result:
- `data/train`: the RuToxic dataset, a list of Russian rude words, and 200 parallel sentence pairs used for ruGPT fine-tuning (a loading sketch follows this list);
- `data/test`: 10,000 samples used for evaluating the approaches;
- `data/results`: an example of the style transfer output format, illustrated with naive duplication.
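A minimal sketch for loading the RuToxic training data with pandas; the column names are assumptions, so check the CSV header before relying on them:

```python
import pandas as pd

# Path follows the repository layout; the column names below are assumptions.
df = pd.read_csv("data/train/ru_toxic_dataset.csv")
print(df.shape)
print(df.head())

# Example: split into toxic and non-toxic parts, assuming a binary "toxic" label column.
toxic_part = df[df["toxic"] == 1]
neutral_part = df[df["toxic"] == 0]
```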

***

## Citation

If you find this repository helpful, feel free to cite our publication:

```
@article{DBLP:journals/corr/abs-2105-09052,
  author    = {Daryna Dementieva and
               Daniil Moskovskiy and
               Varvara Logacheva and
               David Dale and
               Olga Kozlova and
               Nikita Semenov and
               Alexander Panchenko},
  title     = {Methods for Detoxification of Texts for the Russian Language},
  journal   = {CoRR},
  volume    = {abs/2105.09052},
  year      = {2021},
  url       = {https://arxiv.org/abs/2105.09052},
  archivePrefix = {arXiv},
  eprint    = {2105.09052},
  timestamp = {Mon, 31 May 2021 16:16:57 +0200},
  biburl    = {https://dblp.org/rec/journals/corr/abs-2105-09052.bib},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}
```

***

## Contacts

For any questions, please contact Daryna Dementieva via [email](mailto:[email protected]).