https://github.com/monologg/JointBERT

PyTorch implementation of JointBERT: "BERT for Joint Intent Classification and Slot Filling"
Host: GitHub
URL: https://github.com/monologg/JointBERT
Owner: monologg
License: apache-2.0
Created: 2019-11-14T15:48:12.000Z (over 5 years ago)
Default Branch: master
Last Pushed: 2024-01-11T07:51:17.000Z (over 1 year ago)
Last Synced: 2024-11-06T17:46:25.485Z (8 months ago)
Topics: bert, intent-classification, joint-bert, pytorch, slot-filling, slu, transformers
Language: Python
Homepage:
Size: 464 KB
Stars: 665
Watchers: 13
Forks: 186
Open Issues: 16
Metadata Files:
- Readme: README.md
- License: LICENSE

# JointBERT

(Unofficial) PyTorch implementation of `JointBERT`: [BERT for Joint Intent Classification and Slot Filling](https://arxiv.org/abs/1902.10909)

## Model Architecture

- Predict `intent` and `slot` at the same time from **one BERT model** (=Joint model)

- total_loss = intent_loss + coef \* slot_loss (change coef with the `--slot_loss_coef` option)

- **To use a CRF layer, pass the `--use_crf` option**
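
The loss combination above can be sketched in plain Python. This is only an illustration, not the repository's code: the `cross_entropy` and `joint_loss` helpers and the toy logits are assumptions made for this example, and the sketch ignores details such as padding tokens and the CRF variant.

```python
import math

def cross_entropy(logits, target):
    """Negative log-likelihood of `target` under softmax(logits)."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[target]

def joint_loss(intent_logits, intent_label, slot_logits, slot_labels, slot_loss_coef):
    """total_loss = intent_loss + coef * slot_loss (slot loss averaged over tokens)."""
    intent_loss = cross_entropy(intent_logits, intent_label)
    token_losses = [cross_entropy(l, y) for l, y in zip(slot_logits, slot_labels)]
    slot_loss = sum(token_losses) / len(token_losses)
    return intent_loss + slot_loss_coef * slot_loss

# Toy example (made-up numbers): 3 intents, 2 tokens with 4 slot labels each.
loss = joint_loss(
    intent_logits=[2.0, 0.1, -1.0], intent_label=0,
    slot_logits=[[1.5, 0.2, 0.1, -0.3], [0.0, 2.2, 0.1, 0.4]], slot_labels=[0, 1],
    slot_loss_coef=1.0,
)
print(round(loss, 4))
```

A larger `slot_loss_coef` weights slot filling more heavily relative to intent classification; a value of 0 would train on the intent loss alone.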

## Dependencies

- python>=3.6

- torch==1.6.0

- transformers==3.0.2

- seqeval==0.0.12

- pytorch-crf==0.7.2

## Dataset

| Dataset | Train  | Dev | Test | Intent Labels | Slot Labels |
| ------- | ------ | --- | ---- | ------------- | ----------- |
| ATIS    | 4,478  | 500 | 893  | 21            | 120         |
| Snips   | 13,084 | 700 | 700  | 7             | 72          |

- The number of labels is based on the _train_ dataset.

- Add `UNK` for labels (for intent and slot labels that appear only in the _dev_ and _test_ datasets)

- Add `PAD` for slot labels
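
The label handling described above can be sketched as follows. This is a minimal sketch under stated assumptions: the helper names, the ordering of the special labels, and the toy label names are hypothetical and are not taken from the repository.

```python
PAD, UNK = "PAD", "UNK"

def build_label_vocab(train_labels, add_pad=False):
    """Map each label seen in the train set to an id, with special labels first."""
    specials = ([PAD] if add_pad else []) + [UNK]
    seen = sorted(set(train_labels))
    vocab = specials + [label for label in seen if label not in specials]
    return {label: idx for idx, label in enumerate(vocab)}

def encode(labels, vocab):
    """Convert labels to ids; labels unseen during training fall back to UNK."""
    return [vocab.get(label, vocab[UNK]) for label in labels]

# Toy labels (hypothetical): slot labels get PAD and UNK, intent labels only UNK.
slot_vocab = build_label_vocab(["O", "B-fromloc", "I-fromloc", "B-toloc"], add_pad=True)
intent_vocab = build_label_vocab(["atis_flight", "atis_airfare"])

# "B-meal" never appeared in the toy train labels, so it maps to UNK (id 1).
print(encode(["O", "B-toloc", "B-meal"], slot_vocab))  # [5, 3, 1]
```

Because the vocabulary is built from the train split only, a label that first appears in dev or test still gets a valid id instead of raising a lookup error.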

## Training & Evaluation

```bash
$ python3 main.py --task {task_name} \
                  --model_type {model_type} \
                  --model_dir {model_dir_name} \
                  --do_train --do_eval \
                  --use_crf

# For ATIS
$ python3 main.py --task atis \
                  --model_type bert \
                  --model_dir atis_model \
                  --do_train --do_eval

# For Snips
$ python3 main.py --task snips \
                  --model_type bert \
                  --model_dir snips_model \
                  --do_train --do_eval
```
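
To run several configurations in a row, the commands above can be assembled programmatically. The sketch below only builds and prints the command lines, using the flags shown in this README; the `build_command` helper and the `_crf` suffix on the model directory names are hypothetical.

```python
import shlex

def build_command(task, model_type, model_dir, use_crf=False, slot_loss_coef=None):
    """Assemble a main.py invocation as an argument list."""
    cmd = ["python3", "main.py",
           "--task", task,
           "--model_type", model_type,
           "--model_dir", model_dir,
           "--do_train", "--do_eval"]
    if use_crf:
        cmd.append("--use_crf")
    if slot_loss_coef is not None:
        cmd += ["--slot_loss_coef", str(slot_loss_coef)]
    return cmd

# Print one command per dataset, with and without the CRF layer.
for task in ("atis", "snips"):
    for use_crf in (False, True):
        model_dir = f"{task}_model" + ("_crf" if use_crf else "")
        cmd = build_command(task, "bert", model_dir, use_crf=use_crf)
        print(" ".join(shlex.quote(part) for part in cmd))
```

Each argument list can be passed to `subprocess.run(cmd, check=True)` from the repository root to actually launch training.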

## Prediction

```bash
$ python3 predict.py --input_file {INPUT_FILE_PATH} --output_file {OUTPUT_FILE_PATH} --model_dir {SAVED_CKPT_PATH}
```

## Results

- Run for 5 to 10 epochs (the best result is recorded)

- Only tested with the `uncased` model

- ALBERT xxlarge sometimes fails to converge well for slot prediction.

| Dataset   | Model            | Intent acc (%) | Slot F1 (%) | Sentence acc (%) |
| --------- | ---------------- | -------------- | ----------- | ---------------- |
| **Snips** | BERT             | **99.14**      | 96.90       | 93.00            |
|           | BERT + CRF       | 98.57          | **97.24**   | **93.57**        |
|           | DistilBERT       | 98.00          | 96.10       | 91.00            |
|           | DistilBERT + CRF | 98.57          | 96.46       | 91.85            |
|           | ALBERT           | 98.43          | 97.16       | 93.29            |
|           | ALBERT + CRF     | 99.00          | 96.55       | 92.57            |
| **ATIS**  | BERT             | 97.87          | 95.59       | 88.24            |
|           | BERT + CRF       | **97.98**      | 95.93       | 88.58            |
|           | DistilBERT       | 97.76          | 95.50       | 87.68            |
|           | DistilBERT + CRF | 97.65          | 95.89       | 88.24            |
|           | ALBERT           | 97.64          | 95.78       | 88.13            |
|           | ALBERT + CRF     | 97.42          | **96.32**   | **88.69**        |
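
To make the intent and sentence accuracy columns concrete, here is a minimal sketch of how such numbers are commonly computed. It assumes the usual definition of sentence-level semantic frame accuracy (a sentence counts as correct only when the intent and every slot label are right); the function names and toy predictions are hypothetical, and slot F1 is left to a library such as `seqeval`, which is listed in the dependencies.

```python
def intent_accuracy(pred_intents, gold_intents):
    """Fraction of sentences whose predicted intent matches the gold intent."""
    correct = sum(p == g for p, g in zip(pred_intents, gold_intents))
    return correct / len(gold_intents)

def sentence_accuracy(pred_intents, gold_intents, pred_slots, gold_slots):
    """Fraction of sentences with the correct intent AND all slot labels correct."""
    correct = sum(
        pi == gi and ps == gs
        for pi, gi, ps, gs in zip(pred_intents, gold_intents, pred_slots, gold_slots)
    )
    return correct / len(gold_intents)

# Toy predictions (made up): sentence 2 has a slot error, sentence 3 an intent error.
gold_intents = ["PlayMusic", "GetWeather", "BookRestaurant"]
pred_intents = ["PlayMusic", "GetWeather", "GetWeather"]
gold_slots = [["O", "B-artist"], ["O", "B-city"], ["O", "O"]]
pred_slots = [["O", "B-artist"], ["O", "O"], ["O", "O"]]

print(intent_accuracy(pred_intents, gold_intents))
print(sentence_accuracy(pred_intents, gold_intents, pred_slots, gold_slots))
```

Under this definition, sentence accuracy can never exceed intent accuracy, which is consistent with the table above.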

## Updates

- 2019/12/03: Add DistilBERT and RoBERTa results

- 2019/12/14: Add ALBERT (large v1) result

- 2019/12/22: Sentence prediction is now available

- 2019/12/26: Add ALBERT (xxlarge v1) result

- 2019/12/29: Add CRF option

- 2019/12/30: `sentence-level semantic frame accuracy` can now be checked

- 2020/01/23: Only show results for the uncased model

- 2020/04/03: Update with new prediction code

## References

- [Huggingface Transformers](https://github.com/huggingface/transformers)

- [pytorch-crf](https://github.com/kmkurn/pytorch-crf)
