https://github.com/abhilash1910/miniattention
Hierarchical Attention Layer for Keras for document classification.
- Host: GitHub
- URL: https://github.com/abhilash1910/miniattention
- Owner: abhilash1910
- License: other
- Created: 2020-08-06T11:33:20.000Z (about 5 years ago)
- Default Branch: master
- Last Pushed: 2023-02-02T14:15:47.000Z (over 2 years ago)
- Last Synced: 2024-04-26T14:41:42.960Z (over 1 year ago)
- Topics: attention, attention-mechanism, bilstm, hierarchicalattention, keras-layers, keras-tensorflow, lstm-neural-networks, tensorflow
- Language: Jupyter Notebook
- Homepage:
- Size: 110 KB
- Stars: 7
- Watchers: 2
- Forks: 6
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE.TXT
README
# Mini-Attention
## A Keras Hierarchical Attention Layer for Document Classification in NLP :robot:
This library is an implementation of Hierarchical Attention Networks for Document Classification (Yang et al., 2016). [Link](https://www.cs.cmu.edu/~./hovy/papers/16HLT-hierarchical-attention-networks.pdf). It is compatible with Keras and TensorFlow (Keras version >= 2.0.6). Following the paper, it provides a hierarchical attention mechanism covering a word encoder (including a bidirectional recurrent unit, GRU), sentence-level attention, and document classification.
## Dependencies
- TensorFlow
- Keras (version >= 2.0.6)
## Usability
The library or the Layer is compatible with Tensorflow and Keras. Installation is carried out using the pip command as follows:
```bash
pip install MiniAttention==0.1
```

To use the layer inside a Jupyter Notebook or Python IDE (along with Keras layers):
```python
import MiniAttention.MiniAttention as MA
```

The layer takes as input a 3D tensor of shape `(sample_size, steps, features)` and produces as output a 2D tensor of shape `(sample_size, features)`.
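As an illustrative sketch of this shape contract (assuming the block behaves as documented; the shape values below are hypothetical):

```python
from tensorflow import keras
import MiniAttention.MiniAttention as MA

# Hypothetical input: sequences of 100 steps with 256 features each.
x = keras.layers.Input(shape=(100, 256))   # 3D: (sample_size, steps, features)
y = MA.MiniAttentionBlock(None, None, None, None, None,
                          None, None, None, None)(x)
print(y.shape)   # expected: (None, 256), i.e. (sample_size, features)
```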
This layer can be used after the `keras.layers.Embedding()` layer to provide global attention over the features and embedding weights. Additionally, pretrained embeddings (GloVe, Word2Vec, ELMo) can be used in the Embedding layer. The layer can also be placed before or between LSTM / Bidirectional LSTM / GRU and other recurrent layers. It supports both the Sequential and functional (Model) Keras architectures. The functional (`keras.models.Model`) version is as follows:

```python
from tensorflow import keras
from tensorflow.keras.layers import Input, Embedding, LSTM, Dense, Bidirectional
import MiniAttention.MiniAttention as MA

# max_features (vocabulary size) and inp_shape (sequence length) are assumed to be defined
inp = Input(shape=(inp_shape,))
z = Embedding(max_features, 256)(inp)
# positional arguments: W_init, b_init, u_init, W_reg, b_reg, u_reg, W_const, b_const, u_const
z = MA.MiniAttentionBlock(keras.initializers.he_uniform, None, None,
                          keras.regularizers.L2(l2=0.02), None, None, None, None, None)(z)
z = Bidirectional(LSTM(128, recurrent_activation="relu", return_sequences=True))(z)
z = Bidirectional(LSTM(64, recurrent_activation="relu", return_sequences=True))(z)
z = MA.MiniAttentionBlock(keras.initializers.he_uniform, None, None,
                          keras.regularizers.L2(l2=0.02), None, None, None, None, None)(z)
z = Dense(64, activation="relu")(z)
z = Dense(64, activation="relu")(z)
z = Dense(1, activation="sigmoid")(z)
model = keras.models.Model(inputs=inp, outputs=z)
model.compile(loss="binary_crossentropy", metrics=["accuracy"],
              optimizer=keras.optimizers.Adagrad(learning_rate=1e-3))
model.summary()
```

For a Sequential model (`keras.models.Sequential`):
```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense
import MiniAttention.MiniAttention as MA

# max_features (vocabulary size) is assumed to be defined; inputs are padded to length 100
model = Sequential()
model.add(Embedding(max_features, 128, input_shape=(100,)))
model.add(MA.MiniAttentionBlock(None, None, None, None, None, None, None, None, None))
model.add(LSTM(128))
model.add(Dense(8, activation='relu'))
model.add(Dense(4, activation='sigmoid'))
model.compile(loss='binary_crossentropy', metrics=['accuracy'], optimizer='Adagrad')
model.summary()
```

The arguments of the `MiniAttentionBlock` class are:
```
1.  W_init: Weight Initializer - compatible with keras.initializers
2.  b_init: Bias Initializer - compatible with keras.initializers
3.  u_init: Output Initializer - compatible with keras.initializers
4.  W_reg: Weight Regularizer - compatible with keras.regularizers
5.  b_reg: Bias Regularizer - compatible with keras.regularizers
6.  u_reg: Output Regularizer - compatible with keras.regularizers
7.  W_const: Weight Constraint - compatible with keras.constraints
8.  b_const: Bias Constraint - compatible with keras.constraints
9.  u_const: Output Constraint - compatible with keras.constraints
10. bias: Boolean (True/False) - whether to use a bias (optional)
```

There are three main functions inside the `MiniAttentionBlock` class. The `__init__` method initializes the weight and bias tensors used in the computation. The `attention_block` method assigns the variables (tensors) and checks the input tensor size. The `build_nomask` method computes the attention modules, using tanh as the internal activation function followed by exponential normalization. Masking has not been added to the library yet.
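For intuition, here is a minimal NumPy sketch of that attention computation, not the library's actual implementation: a tanh projection is scored against a context vector, exponentially normalized, and used to weight-sum the timesteps, as in the Yang et al. paper. All names below (`attention_sketch`, `W`, `b`, `u`) are illustrative.

```python
import numpy as np

def attention_sketch(h, W, b, u):
    """Illustrative attention over a sequence h of shape (steps, features).

    W: (features, features) weights, b: (features,) bias,
    u: (features,) context vector -- hypothetical names.
    """
    # Score each timestep: tanh projection, then a dot product with the context vector.
    uit = np.tanh(h @ W + b)            # (steps, features)
    scores = uit @ u                    # (steps,)
    # Exponential normalization (softmax) over the timesteps.
    alpha = np.exp(scores - scores.max())
    alpha /= alpha.sum()
    # Weighted sum collapses (steps, features) -> (features,).
    return (h * alpha[:, None]).sum(axis=0)

# Toy usage with random tensors.
steps, features = 5, 8
rng = np.random.default_rng(0)
h = rng.normal(size=(steps, features))
W = rng.normal(size=(features, features))
b = np.zeros(features)
u = rng.normal(size=(features,))
print(attention_sketch(h, W, b, u).shape)   # (8,)
```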
## Example
For reference on how to use the library, a Jupyter Notebook sample is included in the repository: `MiniAttention_on_IMDB.ipynb`.
The sample uses the layer together with the `keras.layers.Embedding()` layer for IMDB binary classification. It uses the default Keras embedding, following the official tutorial in the [Keras docs](https://keras.io/examples/nlp/bidirectional_lstm_imdb). Alternatively, `Tensorboard-tfds-IMDB.py` contains a TensorBoard demonstration.
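As a rough sketch of the kind of data preparation involved (the exact parameters live in the notebook; `max_features` and `maxlen` below are assumed values):

```python
from tensorflow import keras

max_features = 20000   # vocabulary size (assumed; see the notebook)
maxlen = 100           # sequence length fed to the models above (assumed)

# Load the IMDB binary-sentiment dataset bundled with Keras.
(x_train, y_train), (x_test, y_test) = keras.datasets.imdb.load_data(num_words=max_features)

# Pad/truncate reviews to a fixed length so they fit the Embedding layer.
x_train = keras.preprocessing.sequence.pad_sequences(x_train, maxlen=maxlen)
x_test = keras.preprocessing.sequence.pad_sequences(x_test, maxlen=maxlen)

# `model` would be one of the models defined above:
# model.fit(x_train, y_train, validation_data=(x_test, y_test), epochs=2, batch_size=64)
```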
## Contributing
Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.
## License
MIT