https://github.com/textflint/textflint

Unified Multilingual Robustness Evaluation Toolkit for Natural Language Processing

adversarial-samples attack data-augmentation model-robustness robustness-analysis subpopulation text-augmentation text-transformations transformation

Host: GitHub
URL: https://github.com/textflint/textflint
Owner: textflint
License: gpl-3.0
Created: 2021-03-06T11:15:52.000Z (about 4 years ago)
Default Branch: master
Last Pushed: 2022-09-27T17:09:16.000Z (over 2 years ago)
Last Synced: 2025-03-05T15:51:59.833Z (3 months ago)
Topics: adversarial-samples, attack, data-augmentation, model-robustness, robustness-analysis, subpopulation, text-augmentation, text-transformations, transformation
Language: Python
Homepage:
Size: 11.6 MB
Stars: 643
Watchers: 18
Forks: 95
Open Issues: 6
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE

Awesome Lists containing this project

awesome-huggingface - TextFlint - A unified multilingual robustness evaluation toolkit for NLP. (🏹️ Adversarial Attack)

README

Unified Multilingual Robustness Evaluation Toolkit
for Natural Language Processing

TextFlint is a multilingual robustness evaluation platform for natural language processing that unifies text **transformation**, **sub-population**, **adversarial attack**, and their combinations to provide a comprehensive robustness analysis. So far, TextFlint supports 13 NLP tasks.

> If you're looking for robustness evaluation results of SOTA models, you might want the [TextFlint IO](https://www.textflint.io/textflint) page.

## Features

- **Full coverage of transformation types**, including 20 general transformations, 8 subpopulations and 60 task-specific transformations, as well as thousands of their combinations.
- **Subpopulation**, which identifies the specific part of a dataset on which the target model performs poorly.
- **Adversarial attack**, which aims to find a perturbation of an input text that fools the given model.
- **Complete analytical report**, which explains where your model's shortcomings lie, such as problems with lexical or syntactic rules.
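
To make the adversarial attack idea concrete, here is a minimal sketch of a greedy word-substitution attack against a toy keyword classifier. Everything in it (`toy_model`, `greedy_attack`, the `SYNONYMS` table) is hypothetical and written for illustration; it is not the TextFlint or TextAttack API, and real attack recipes use far stronger search and constraints.

```python
# Toy illustration only: none of these names come from TextFlint or TextAttack.

NEGATIVE_WORDS = {"bad", "boring", "awful"}

# Hypothetical substitution table standing in for a synonym resource.
SYNONYMS = {"bad": ["poor"], "boring": ["dull"], "awful": ["dreadful"]}


def toy_model(text: str) -> str:
    """Keyword classifier: 'neg' if any known negative word appears, else 'pos'."""
    return "neg" if NEGATIVE_WORDS & set(text.lower().split()) else "pos"


def greedy_attack(text: str, true_label: str):
    """Substitute words left to right, keeping each swap, until the prediction flips.

    Returns the perturbed text on success, or None if no perturbation fools the model.
    """
    words = text.split()
    for i, word in enumerate(words):
        for candidate in SYNONYMS.get(word.lower(), []):
            words[i] = candidate
            perturbed = " ".join(words)
            if toy_model(perturbed) != true_label:
                return perturbed
    return None


original = "the plot was bad and boring"
adversarial = greedy_attack(original, "neg")
print("original:   ", original, "->", toy_model(original))
print("adversarial:", adversarial, "->", toy_model(adversarial))
```

The perturbed sentence keeps its negative meaning for a human reader, yet the toy model no longer recognizes it. That gap is what an adversarial attack is meant to expose.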

## Online Demo

You can test most of the transformations directly on our [online demo](https://www.textflint.io/tutorials).

## Table of Contents

- [Setup](#setup)
- [Usage](#usage)
- [Architecture](#architecture)
- [Learn More](#learn-more)
- [Contributing](#contributing)
- [Citation](#citation)

## Setup

Requires **Python >= 3.7**. Installing with `pip` is recommended.

```shell
pip install textflint
```

Once TextFlint is installed, you can run it from the command line (`textflint ...`) or integrate it into another NLP project.

## Usage

### Workflow

The general workflow of TextFlint is displayed above. Evaluation of a target model can be divided into three steps:

1. Input preparation: the original test dataset, which is loaded by `Dataset`, should first be formatted as a series of `JSON` objects. You can use the built-in `Dataset` by following these [instructions](docs/user/components/4_Sample_Dataset.ipynb). The TextFlint configuration is specified by `Config`, and the target model is loaded as a `FlintModel`.
2. Adversarial sample generation: multi-perspective transformations (i.e., [80+ Transformation](docs/user/components/transformation.md), [Subpopulation](docs/user/components/subpopulation.md) and [AttackRecipe](https://github.com/QData/TextAttack)) are performed on the `Dataset` to generate transformed samples. In addition, to ensure the semantic and grammatical correctness of transformed samples, [Validator](docs/user/components/validator.md) calculates a confidence score for each sample and filters out unacceptable ones.
3. Report generation: `Analyzer` collects evaluation results and `ReportGenerator` automatically generates a comprehensive report of model robustness.
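
As a sketch of the input-preparation step, the snippet below writes a small sentiment dataset as one JSON object per line and reads it back. The field names `x` (text) and `y` (label), the label strings, the one-object-per-line layout, and the helper functions are assumptions made for illustration; check the `Dataset` instructions linked above for the fields each task actually requires.

```python
import json
import os
import tempfile

# Assumed sample layout: "x" holds the text and "y" the label.
samples = [
    {"x": "Titanic is my favorite movie.", "y": "pos"},
    {"x": "I don't like the actor Tim Hill.", "y": "neg"},
]


def write_json_lines(path, records):
    """Write one JSON object per line (hypothetical helper, not part of TextFlint)."""
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")


def read_json_lines(path):
    """Read the file back, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


# Write to a temporary directory so the example leaves no files behind in the project.
data_path = os.path.join(tempfile.mkdtemp(), "input.json")
write_json_lines(data_path, samples)
loaded = read_json_lines(data_path)
print(len(loaded), "samples written to", data_path)
```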

For example, on the Sentiment Analysis (SA) task, this is a statistical chart of the performance of `XLNET` with different types of `Transformation`/`Subpopulation`/`AttackRecipe` on the `IMDB` dataset.

We provide tutorials that run the whole TextFlint pipeline on various tasks, including:

* [Machine Reading Comprehension](docs/user/tutorials/9_MRC.ipynb)
* [Part-of-speech Tagging](docs/user/tutorials/7_BERT%20for%20POS%20tagging.ipynb)
* [Named Entity Recognition](docs/user/tutorials/11_NER.ipynb)
* [Chinese Word Segmentation](docs/user/tutorials/10_CWS.ipynb)

### Quick Start

Using TextFlint to verify the robustness of a specific model is as simple as running the following command:

```shell
$ textflint --dataset input_file --config config.json
```

where *input\_file* is the input file in CSV or JSON format and *config.json* is a configuration file with generation and target model options. Transformed datasets are saved to the output directory specified in your *config.json*.
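
For orientation, the following sketch writes a small configuration file with Python. The keys `task`, `out_dir`, and `trans_methods`, and the values `SA` and `Keyboard`, are assumptions that this README does not confirm; verify them against the `Config` documentation before relying on them.

```python
import json
import os
import tempfile

# All keys and values below are assumptions for illustration, not a verified schema.
config = {
    "task": "SA",                   # assumed task code for sentiment analysis
    "out_dir": "./textflint_out",   # assumed: where transformed datasets are saved
    "trans_methods": ["Keyboard"],  # assumed name of a general transformation
}

# Write to a temporary directory; in practice the file would sit next to your data.
config_path = os.path.join(tempfile.mkdtemp(), "config.json")
with open(config_path, "w", encoding="utf-8") as f:
    json.dump(config, f, indent=2)

with open(config_path, encoding="utf-8") as f:
    print(f.read())
```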

Because sample generation is decoupled from model verification, **TextFlint** can be used inside another NLP project with just a few lines of code.

```python
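from textflint import Engine

# Paths to the input dataset (CSV or JSON) and the TextFlint configuration file.
data_path = 'input.json'
config = 'config.json'

# Generate transformed samples according to the configuration.
engine = Engine()
engine.run(data_path, config)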
```

For more input and output instructions of TextFlint, please refer to the [IO format document](docs/user/components/IOFormat.md).

## Architecture

***Input layer:*** receives textual datasets and models as input, represented as `Dataset` and `FlintModel` respectively.

- **`Dataset`**: a container that provides efficient and handy operation interfaces for `Sample`. `Dataset` supports loading, verifying, and saving data in JSON or CSV format for various NLP tasks.
- **`FlintModel`**: a target model used in an adversarial attack.

***Generation layer:*** consists of four main parts:

- **`Subpopulation`**: generates a subset of a `Dataset`.
- **`Transformation`**: transforms each sample of a `Dataset` if it can be transformed.
- **`AttackRecipe`**: attacks the `FlintModel` and generates a `Dataset` of adversarial examples.
- **`Validator`**: verifies the quality of samples generated by `Transformation` and `AttackRecipe`.

> TextFlint provides an interface for integrating the easy-to-use adversarial attack recipes implemented in `textattack`. Users can refer to [textattack](https://github.com/QData/TextAttack) for more information about the supported `AttackRecipe`.

***Report layer:*** analyzes model testing results and provides a robustness report for users.
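
The three layers can be pictured with a toy pipeline in plain Python. This is a conceptual sketch only: the function names mirror the component names above, but the signatures, the sample layout, the letter-swap transformation, and the 0.7 similarity threshold are all assumptions made for illustration and do not reflect the TextFlint API.

```python
from difflib import SequenceMatcher

# Toy stand-ins for the three layers; none of this is the TextFlint API.

# Input layer: a dataset of samples and a target model.
dataset = [
    {"x": "a good film", "y": "pos"},
    {"x": "a bad film", "y": "neg"},
    {"x": "a good story with a slow and bad ending", "y": "neg"},
]


def toy_model(text):
    """Keyword model: predicts 'neg' only when the exact word 'bad' appears."""
    return "neg" if "bad" in text.split() else "pos"


# Generation layer.
def subpopulation(samples, max_words=4):
    """Select the subset of short samples."""
    return [s for s in samples if len(s["x"].split()) <= max_words]


def transformation(sample):
    """Swap the first two letters of every word longer than two characters.

    Returns None when the sample cannot be transformed (nothing changes).
    """
    words = [w[1] + w[0] + w[2:] if len(w) > 2 else w for w in sample["x"].split()]
    text = " ".join(words)
    return None if text == sample["x"] else {"x": text, "y": sample["y"]}


def validator(original, transformed):
    """Score a transformed sample by character-level similarity to its source."""
    return SequenceMatcher(None, original["x"], transformed["x"]).ratio()


# Report layer: compare accuracy before and after transformation.
def accuracy(samples):
    return sum(toy_model(s["x"]) == s["y"] for s in samples) / len(samples)


subset = subpopulation(dataset)
pairs = [(s, transformation(s)) for s in subset]
kept = [t for s, t in pairs if t is not None and validator(s, t) >= 0.7]

report = {"original": accuracy(subset), "transformed": accuracy(kept)}
print(report)
```

The drop in accuracy between the two entries of `report` is the kind of signal a robustness report surfaces: the toy model is correct on clean text but breaks under a small typo-like perturbation.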

## Learn More

| Section | Description |
| ------------------------------------------------------------ | ------------------------------------------------------------ |
| [Documentation](https://textflint.readthedocs.io/) | Full API documentation and tutorials |
| [Tutorial](https://github.com/textflint/textflint/tree/master/docs/user) | Tutorials on TextFlint components and the pipeline |
| [Website](https://www.textflint.io/textflint) | Provides evaluation results of SOTA models and transformed data download |
| [Online Demo](https://www.textflint.io/tutorials) | Interactive demo to try single text transformations |
| [Paper](https://aclanthology.org/2021.acl-demo.41.pdf) | Our system paper, accepted at ACL 2021 |

## Contributing

We welcome community contributions to TextFlint in the form of bug fixes 🛠️ and new features 💡! If you want to contribute, please first read [our contribution guidelines](CONTRIBUTING.md).

## Citation

If you are using TextFlint for your work, please kindly cite our [ACL2021 TextFlint demo paper](https://aclanthology.org/2021.acl-demo.41.pdf):

```latex
@inproceedings{wang-etal-2021-textflint,
title = {TextFlint: Unified Multilingual Robustness Evaluation Toolkit for Natural Language Processing},
author = {Wang, Xiao and Liu, Qin and Gui, Tao and Zhang, Qi and others},
booktitle = {Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: System Demonstrations},
month = {aug},
year = {2021},
address = {Online},
publisher = {Association for Computational Linguistics},
url = {https://aclanthology.org/2021.acl-demo.41},
doi = {10.18653/v1/2021.acl-demo.41},
pages = {347--355}
}
```
