Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/batzner/indrnn
TensorFlow implementation of Independently Recurrent Neural Networks
- Host: GitHub
- URL: https://github.com/batzner/indrnn
- Owner: batzner
- License: apache-2.0
- Created: 2018-03-14T13:44:17.000Z (over 6 years ago)
- Default Branch: master
- Last Pushed: 2021-08-19T20:04:03.000Z (about 3 years ago)
- Last Synced: 2024-04-05T01:51:49.460Z (7 months ago)
- Topics: indrnn, paper-implementations, rnn, tensorflow
- Language: Python
- Homepage: https://arxiv.org/abs/1803.04831
- Size: 595 KB
- Stars: 514
- Watchers: 39
- Forks: 129
- Open Issues: 0
- Metadata Files:
  - Readme: README.md
  - License: LICENSE
Awesome Lists containing this project
README
# Independently Recurrent Neural Networks
Simple TensorFlow implementation of [Independently Recurrent Neural Network (IndRNN): Building A Longer and Deeper RNN](https://arxiv.org/pdf/1803.04831.pdf) by Shuai Li et al. The authors' original implementation in Theano and Lasagne can be found in [Sunnydreamrain/IndRNN_Theano_Lasagne](https://github.com/Sunnydreamrain/IndRNN_Theano_Lasagne).
[![Build Status](https://travis-ci.com/batzner/indrnn.svg?branch=master)](https://travis-ci.com/batzner/indrnn)
## Summary
In IndRNNs, neurons in recurrent layers are independent of each other. The basic RNN calculates the hidden state `h` with `h = act(W * input + U * state + b)`. IndRNNs instead use an element-wise vector multiplication `u * state`, meaning each neuron has a single recurrent weight connecting it to its own last hidden state.
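As a rough illustration of that difference, here is a NumPy sketch with made-up shapes (not code from this repository):

```python
import numpy as np

num_units, num_inputs = 4, 3
x = np.random.randn(num_inputs)          # current input
h_prev = np.random.randn(num_units)      # previous hidden state
W = np.random.randn(num_units, num_inputs)
b = np.zeros(num_units)

# Basic RNN: a full matrix U mixes every neuron's previous state.
U = np.random.randn(num_units, num_units)
h_rnn = np.maximum(0, W @ x + U @ h_prev + b)

# IndRNN: a weight vector u is applied element-wise, so each neuron
# only sees its own previous state.
u = np.random.randn(num_units)
h_indrnn = np.maximum(0, W @ x + u * h_prev + b)
```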
The IndRNN
- can be used efficiently with ReLU activation functions, making it easier to stack multiple recurrent layers without saturating gradients
- allows for better interpretability, as neurons in the same layer are independent from each other
- prevents vanishing and exploding gradients by regulating each neuron's recurrent weight

## Usage
Copy [ind_rnn_cell.py](https://github.com/batzner/indrnn/blob/master/ind_rnn_cell.py) into your project.
```python
from ind_rnn_cell import IndRNNCell

# Regulate each neuron's recurrent weight as recommended in the paper
recurrent_max = pow(2, 1 / TIME_STEPS)

cell = MultiRNNCell([IndRNNCell(128, recurrent_max_abs=recurrent_max),
                     IndRNNCell(128, recurrent_max_abs=recurrent_max)])
output, state = tf.nn.dynamic_rnn(cell, input_data, dtype=tf.float32)
...
```
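For context, a complete wiring might look like the following sketch; `TIME_STEPS`, `NUM_FEATURES`, and the dense readout layer are illustrative choices, not part of the repository's API:

```python
import tensorflow as tf
from ind_rnn_cell import IndRNNCell

TIME_STEPS, NUM_FEATURES = 100, 2       # illustrative sequence shape
recurrent_max = pow(2, 1 / TIME_STEPS)  # recurrent weight bound from the paper

# Batch of sequences: [batch_size, TIME_STEPS, NUM_FEATURES]
input_data = tf.placeholder(tf.float32, [None, TIME_STEPS, NUM_FEATURES])

cell = tf.nn.rnn_cell.MultiRNNCell(
    [IndRNNCell(128, recurrent_max_abs=recurrent_max),
     IndRNNCell(128, recurrent_max_abs=recurrent_max)])
output, state = tf.nn.dynamic_rnn(cell, input_data, dtype=tf.float32)

# Read out the last time step, e.g. for a scalar regression target
prediction = tf.layers.dense(output[:, -1, :], 1)
```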
## Experiments in the paper

### Addition Problem
See [examples/addition_rnn.py](https://github.com/batzner/indrnn/blob/master/examples/addition_rnn.py) for a script reproducing the "Adding Problem" from the paper. Below are the results reproduced with the `addition_rnn.py` code.
![https://github.com/batzner/indrnn/raw/master/img/addition/TAll.png](https://github.com/batzner/indrnn/raw/master/img/addition/TAll.png)
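For reference, adding-problem batches are typically generated along these lines (a NumPy sketch of the task setup, not the exact code in `addition_rnn.py`): each sequence carries a stream of random values plus a marker row, and the target is the sum of the two marked values.

```python
import numpy as np

def adding_problem_batch(batch_size, time_steps):
    """Generate one illustrative batch of the adding problem."""
    values = np.random.uniform(0, 1, (batch_size, time_steps))
    markers = np.zeros((batch_size, time_steps))
    for i in range(batch_size):
        # Mark two distinct positions whose values must be summed.
        first, second = np.random.choice(time_steps, size=2, replace=False)
        markers[i, first] = markers[i, second] = 1.0
    inputs = np.stack([values, markers], axis=-1)  # [batch, time, 2]
    targets = np.sum(values * markers, axis=1)     # [batch]
    return inputs, targets
```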
### Sequential MNIST

See [examples/sequential_mnist.py](https://github.com/batzner/indrnn/blob/master/examples/sequential_mnist.py) for a script reproducing the Sequential MNIST experiment (the pixel-by-pixel input setup is sketched below the results). I let it run for two days and stopped it after 60,000 training steps with a
- Training error rate of 0.7%
- Validation error rate of 1.1%
- **Test error rate of 1.1%**

![https://github.com/batzner/indrnn/raw/master/img/sequential_mnist/errors.png](https://github.com/batzner/indrnn/raw/master/img/sequential_mnist/errors.png)
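In this setup each 28x28 image is read pixel by pixel as a 784-step sequence. A minimal sketch of that reshaping (illustrative, not the script's exact code):

```python
import numpy as np

def to_pixel_sequences(images):
    """Flatten [batch, 28, 28] MNIST images into 784-step sequences of single pixels."""
    batch_size = images.shape[0]
    # One pixel per time step: [batch, 28, 28] -> [batch, 784, 1]
    return images.reshape(batch_size, 28 * 28, 1).astype(np.float32)
```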
## Requirements
- Python 3.4+
- TensorFlow 1.5+