https://github.com/arig23498/rnn_viz
Sequence models in Numpy
- Host: GitHub
- URL: https://github.com/arig23498/rnn_viz
- Owner: ariG23498
- License: apache-2.0
- Created: 2020-09-22T11:37:49.000Z (about 5 years ago)
- Default Branch: master
- Last Pushed: 2020-10-09T18:17:02.000Z (about 5 years ago)
- Last Synced: 2025-05-12T13:12:11.601Z (5 months ago)
- Topics: backpropagation, lstm, numpy, recurrence-formula, rnn, wandb
- Language: Jupyter Notebook
- Homepage: https://bit.ly/under_RNN
- Size: 4.6 MB
- Stars: 25
- Watchers: 2
- Forks: 3
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# How do the sequence models work?
### Reports
1. **[Under the hood of RNNs](http://bit.ly/under_RNN)**
2. **[Under the hood of LSTMs](http://bit.ly/under_LSTM)**

### Introduction
The code-base contains `Numpy` implementations of two sequence model architectures: a vanilla `Recurrent Neural Network` (RNN) and a vanilla `Long Short Term Memory` (LSTM) network. This repository is for those who want to know what happens under the hood of these architectures.
Particular care is given to the `feed-forward` and `back-propagation` passes of the architectures. The derivations are unrolled as far as possible to keep them understandable.
The goal is to tackle the problem of `character generation` with RNNs and LSTMs. While tackling the problem, we also look into the gradient flow of the architectures. Later on, an experiment is performed to demonstrate `context understanding` as well.
### Problem Statement
The input is a sequence of characters and the output is the immediate next character in the sequence. The figure below demonstrates the approach. The characters in a particular sequence are `H, E, L, L` and the next character is `O`. Note that the generated character `O` could just as easily have been a `,` or simply a `\n`. The character that is generated largely depends on the context of the sequence; a well-trained model generates characters that fit the context well.

Character level language model
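To make the setup concrete, here is a minimal sketch (not the repository's code; the helper names are made up) of how a sequence such as `H, E, L, L` can be one-hot encoded and paired with its next-character target `O`:

```python
import numpy as np

text = "HELLO"
chars = sorted(set(text))                        # vocabulary: ['E', 'H', 'L', 'O']
char_to_ix = {c: i for i, c in enumerate(chars)}

def one_hot(ch, vocab_size):
    """Return a one-hot column vector for a single character."""
    v = np.zeros((vocab_size, 1))
    v[char_to_ix[ch]] = 1.0
    return v

inputs = [one_hot(c, len(chars)) for c in "HELL"]   # the observed sequence
target = one_hot("O", len(chars))                   # the character to predict
```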
### Feedforward
We look into the recurrence formulae for both architectures.

Recurrence formula of RNN
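As a rough sketch of that recurrence (illustrative names, assuming the usual vanilla-RNN setup with a `tanh` hidden state and a softmax output layer):

```python
import numpy as np

def rnn_step(x_t, h_prev, W_xh, W_hh, W_hy, b_h, b_y):
    """One vanilla RNN step:
    h_t = tanh(W_xh x_t + W_hh h_{t-1} + b_h),  y_t = W_hy h_t + b_y."""
    h_t = np.tanh(W_xh @ x_t + W_hh @ h_prev + b_h)   # new hidden state
    y_t = W_hy @ h_t + b_y                            # unnormalised scores
    p_t = np.exp(y_t - np.max(y_t))                   # softmax over the vocabulary
    p_t /= np.sum(p_t)
    return h_t, y_t, p_t
```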

Recurrence formula of LSTM
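The LSTM step can be sketched in the same spirit (again illustrative names, assuming the standard gate equations applied to the previous hidden state stacked with the current input):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W_f, W_i, W_g, W_o, b_f, b_i, b_g, b_o):
    """One LSTM step: gates act on [h_{t-1}; x_t] to update the cell state."""
    z_t = np.vstack((h_prev, x_t))        # stacked previous hidden state and input
    f_t = sigmoid(W_f @ z_t + b_f)        # forget gate
    i_t = sigmoid(W_i @ z_t + b_i)        # input gate
    g_t = np.tanh(W_g @ z_t + b_g)        # candidate cell state
    o_t = sigmoid(W_o @ z_t + b_o)        # output gate
    c_t = f_t * c_prev + i_t * g_t        # new cell state
    h_t = o_t * np.tanh(c_t)              # new hidden state
    return h_t, c_t
```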
### Backpropagation
We look into the backpropagation equations for both architectures.

Backpropagation in RNN
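A rough sketch of backpropagation through time for the vanilla RNN step sketched above, assuming a softmax/cross-entropy loss, integer targets, and cached forward quantities `xs`, `hs`, `ps` stored in dictionaries keyed by time step (with `hs[-1]` holding the initial hidden state); none of these names come from the repository:

```python
import numpy as np

def rnn_backward(xs, hs, ps, targets, W_xh, W_hh, W_hy):
    dW_xh, dW_hh, dW_hy = np.zeros_like(W_xh), np.zeros_like(W_hh), np.zeros_like(W_hy)
    db_h, db_y = np.zeros((W_hh.shape[0], 1)), np.zeros((W_hy.shape[0], 1))
    dh_next = np.zeros_like(hs[0])                 # gradient arriving from step t+1
    for t in reversed(range(len(targets))):
        dy = ps[t].copy()
        dy[targets[t]] -= 1.0                      # d(loss)/d(scores) for softmax + cross-entropy
        dW_hy += dy @ hs[t].T
        db_y += dy
        dh = W_hy.T @ dy + dh_next                 # gradient into h_t
        dh_raw = (1.0 - hs[t] * hs[t]) * dh        # backprop through tanh
        db_h += dh_raw
        dW_xh += dh_raw @ xs[t].T
        dW_hh += dh_raw @ hs[t - 1].T
        dh_next = W_hh.T @ dh_raw                  # pass gradient to step t-1
    for grad in (dW_xh, dW_hh, dW_hy, db_h, db_y):
        np.clip(grad, -5, 5, out=grad)             # clip to mitigate exploding gradients
    return dW_xh, dW_hh, dW_hy, db_h, db_y
```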

Backpropagation in LSTM
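For the LSTM, a single-step backward pass can be sketched as follows (again illustrative, not the repository's code; it assumes the cached forward values from the `lstm_step` sketch above, the gradient `dh_t` of the loss with respect to `h_t`, and the gradient `dc_next` flowing back from step t+1 into the cell state):

```python
import numpy as np

def lstm_step_backward(dh_t, dc_next, cache, hidden_size):
    """Backward pass through one LSTM step; `cache` holds the forward values."""
    z_t, f_t, i_t, g_t, o_t, c_prev, c_t, W_f, W_i, W_g, W_o = cache
    tanh_c = np.tanh(c_t)

    do = dh_t * tanh_c                              # gradient into the output gate
    dc = dc_next + dh_t * o_t * (1.0 - tanh_c ** 2) # gradient into the cell state
    df = dc * c_prev
    di = dc * g_t
    dg = dc * i_t
    dc_prev = dc * f_t                              # flows to step t-1

    # Backprop through the gate non-linearities (sigmoid and tanh).
    df_raw = df * f_t * (1.0 - f_t)
    di_raw = di * i_t * (1.0 - i_t)
    dg_raw = dg * (1.0 - g_t ** 2)
    do_raw = do * o_t * (1.0 - o_t)

    # Weight and bias gradients for each gate.
    dW_f, db_f = df_raw @ z_t.T, df_raw
    dW_i, db_i = di_raw @ z_t.T, di_raw
    dW_g, db_g = dg_raw @ z_t.T, dg_raw
    dW_o, db_o = do_raw @ z_t.T, do_raw

    # Gradient into the stacked [h_{t-1}; x_t] vector, then split it.
    dz = W_f.T @ df_raw + W_i.T @ di_raw + W_g.T @ dg_raw + W_o.T @ do_raw
    dh_prev, dx_t = dz[:hidden_size], dz[hidden_size:]

    grads = (dW_f, dW_i, dW_g, dW_o, db_f, db_i, db_g, db_o)
    return dh_prev, dc_prev, dx_t, grads
```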