Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/zhaocq-nlp/attention-visualization
Visualization for simple attention and Google's multi-head attention.
attention attention-visualization machine-translation multi-head-attention neural-machine-translation visualization
- Host: GitHub
- URL: https://github.com/zhaocq-nlp/attention-visualization
- Owner: zhaocq-nlp
- License: apache-2.0
- Created: 2018-03-07T11:29:46.000Z (almost 7 years ago)
- Default Branch: master
- Last Pushed: 2018-03-08T05:31:37.000Z (almost 7 years ago)
- Last Synced: 2023-10-20T00:52:59.948Z (about 1 year ago)
- Topics: attention, attention-visualization, machine-translation, multi-head-attention, neural-machine-translation, visualization
- Language: Java
- Size: 2.34 MB
- Stars: 63
- Watchers: 3
- Forks: 15
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# Attention-Visualization
Visualization for simple attention and Google's multi-head attention.

## Requirements
- python
- jdk1.8

## Usage
1. Python version (for simple attention only):
``` bash
python exec/plot_heatmap.py --input xxx.attention
```

2. Java version (for both simple attention and Google's multi-head attention):
``` bash
java -jar exec/plot_heatmap.jar
```
then select the attention file in the GUI.

## Data Format
The attention file name should end with the ".attention" extension (this matters especially for exec/plot_heatmap.jar), and the file should be in JSON format:
``` python
{ "0": {
"source": " ", # the source sentence (without and symbols)
"translation": " ", # the target sentence (without and symbols)
"attentions": [ # various attention results
{
"name": " ", # a unique name for this attention
"type": " ", # the type of this attention (simple or multihead)
"value": [...] # the attention weights, a json array
}, # end of one attention result
{...}, ...] # end of various attention results
}, # end of the first sample
"1":{
...
}, # end of the second sample
...
} # end of file
```

Note that, due to hard coding, the `name` of each attention must contain the substring "encoder_decoder_attention", "encoder_self_attention", or "decoder_self_attention", according to its actual meaning.
The `value` has shape [length_queries, length_keys] when `type`=simple and has shape [num_heads, length_queries, length_keys] when `type`=multihead.
For more details, see [attention.py](https://github.com/zhaocq-nlp/NJUNMT-tf/blob/master/njunmt/inference/attention.py).
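For illustration, a file in this format could be assembled in Python roughly as follows. This is a minimal sketch, not code from the repository: the sentences, entry names, and random attention weights are placeholders, and `numpy` is assumed only for producing normalized weight arrays.

``` python
import json
import numpy as np

source = "a toy source sentence here"   # 5 tokens -> keys
translation = "a toy target sentence"   # 4 tokens -> queries
src_len, trg_len, num_heads = 5, 4, 8

sample = {
    "source": source,
    "translation": translation,
    "attentions": [
        {   # simple attention: [length_queries, length_keys]
            "name": "encoder_decoder_attention",
            "type": "simple",
            "value": np.random.dirichlet(np.ones(src_len), size=trg_len).tolist(),
        },
        {   # multi-head self-attention: [num_heads, length_queries, length_keys]
            "name": "encoder_self_attention0",
            "type": "multihead",
            "value": np.random.dirichlet(np.ones(src_len),
                                         size=(num_heads, src_len)).tolist(),
        },
    ],
}

with open("example.attention", "w") as f:
    json.dump({"0": sample}, f)
```

The resulting `example.attention` should then be openable by either plot_heatmap tool, assuming the tools split sentences on whitespace so that the array dimensions match the token counts.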
## Demo
The `toydata/toy.attention` file was generated by an NMT model with a self-attention encoder, Bahdanau attention, and an RNN decoder, using [NJUNMT-tf](https://github.com/zhaocq-nlp/NJUNMT-tf).
1. Execute the Python version (for simple attention only):
``` bash
python exec/plot_heatmap.py --input toydata/toy.attention
```
It will plot the traditional attention heatmap.
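To see what the demo file contains without launching either tool, its structure can be printed directly. This is a minimal sketch using only the standard `json` module; the key names follow the Data Format section above.

``` python
import json

with open("toydata/toy.attention") as f:
    samples = json.load(f)

# Print each sample's sentence pair and the available attention entries.
for idx, sample in samples.items():
    print(idx, sample["source"], "->", sample["translation"])
    for att in sample["attentions"]:
        print("  ", att["name"], att["type"])
```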
2. For the Java version (for both simple attention and Google's multi-head attention), execute
``` bash
java -jar exec/plot_heatmap.jar
```
then select `toydata/toy.attention` in the GUI.
The words on the left side are the attention "queries" and the words on the right are the attention "keys". Click a word on the left side to see its heatmap.
This shows the traditional `encoder_decoder_attention` for the word "obtained". The color depth of the lines and squares indicates the degree of attention.
Next, select `encoder_self_attention0` from the menu bar and click on "获得" on the left.
This shows the multi-head attention for the word "获得". Attention weights for head0 through head7 are displayed on the right.
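For reference, the per-head heatmaps shown on the right correspond to slices along the first axis of a multihead `value` array. Below is a minimal sketch, assuming `numpy` and that the demo file contains at least one multihead entry (as the GUI walkthrough above suggests):

``` python
import json
import numpy as np

with open("toydata/toy.attention") as f:
    sample = json.load(f)["0"]

# Pick a multi-head entry, e.g. the encoder self-attention.
att = next(a for a in sample["attentions"]
           if a["type"] == "multihead" and "encoder_self_attention" in a["name"])

value = np.asarray(att["value"])   # [num_heads, length_queries, length_keys]
for h in range(value.shape[0]):    # head0, head1, ... in the demo
    head_weights = value[h]        # one [length_queries, length_keys] heatmap per head
    print(f"head{h}:", head_weights.shape)
```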