Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/guocheng18/transformer-encoder
Implementation of Transformer encoder in PyTorch
- Host: GitHub
- URL: https://github.com/guocheng18/transformer-encoder
- Owner: guocheng18
- License: MIT
- Created: 2019-07-23T02:06:12.000Z (over 5 years ago)
- Default Branch: master
- Last Pushed: 2020-08-15T03:47:47.000Z (over 4 years ago)
- Last Synced: 2024-11-15T07:30:56.816Z (2 days ago)
- Topics: easy-to-use, encoder, pytorch, transformer
- Language: Python
- Homepage:
- Size: 37.1 KB
- Stars: 60
- Watchers: 1
- Forks: 7
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# Transformer Encoder
This repository provides a PyTorch implementation of the encoder from the [Transformer](http://papers.nips.cc/paper/7181-attention-is-all-you-need/) architecture.
## Getting started
Build a transformer encoder
```python
from transformer_encoder import TransformerEncoder

encoder = TransformerEncoder(d_model=512, d_ff=2048, n_heads=8, n_layers=6, dropout=0.1)
input_seqs = ...  # (batch_size, max_seq_len, d_model) float tensor
mask = ...        # (batch_size, max_seq_len) byte tensor
out = encoder(input_seqs, mask)
```
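The mask tells the encoder which positions hold real tokens and which are padding. A minimal sketch of building one from a batch of padded token ids, assuming the conventional layout where nonzero entries mark real tokens; the `pad_id` value and `token_ids` batch are illustrative assumptions, not part of the library:

```python
import torch

pad_id = 0  # assumed padding index, for illustration only
token_ids = torch.tensor([[5, 7, 9, 0, 0],
                          [3, 2, 0, 0, 0]])  # (batch_size, max_seq_len)

# Nonzero where a real token is present, zero at padding positions;
# shape (batch_size, max_seq_len), matching the mask expected by the encoder.
mask = (token_ids != pad_id)  # cast with .byte() if a ByteTensor is required
```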
Add positional encoding to input embeddings

```python
import torch.nn as nn
from transformer_encoder.utils import PositionalEncoding

input_layer = nn.Sequential(
nn.Embedding(num_embeddings=10000, embedding_dim=512),
PositionalEncoding(d_model=512, dropout=0.1, max_len=5000)
)
```
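The resulting `input_layer` turns token ids into `d_model`-dimensional vectors that the encoder can consume. A hedged sketch of wiring it to the encoder built earlier; the `token_ids` batch and the padding id of 0 are assumptions for illustration:

```python
# Reusing `encoder` and `input_layer` from the snippets above.
token_ids = ...                      # (batch_size, max_seq_len) long tensor of token indices
embeddings = input_layer(token_ids)  # (batch_size, max_seq_len, 512)
mask = (token_ids != 0)              # assuming 0 is the padding id
out = encoder(embeddings, mask)      # (batch_size, max_seq_len, 512)
```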
Optimize the model with the warm-up strategy

```python
import torch.optim as optim
from transformer_encoder.utils import WarmupOptimizer

model = ...  # any nn.Module, e.g. the encoder built above
base_optimizer = optim.Adam(model.parameters(), lr=1e-3)
optimizer = WarmupOptimizer(base_optimizer, d_model=512, scale_factor=1, warmup_steps=100)

optimizer.zero_grad()
loss = ...  # compute your training loss here
loss.backward()
optimizer.step()
```
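The warm-up strategy is presumably the learning-rate schedule from the Transformer paper: the rate grows linearly for `warmup_steps` updates and then decays with the inverse square root of the step number. Assuming `WarmupOptimizer` follows that schedule, the effective rate at step `t` is roughly:

```python
def warmup_lr(step, d_model=512, scale_factor=1.0, warmup_steps=100):
    """Assumed schedule: linear warm-up, then inverse-sqrt decay (step counts from 1)."""
    return scale_factor * d_model ** -0.5 * min(step ** -0.5, step * warmup_steps ** -1.5)
```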
## API Reference

*transformer_encoder.TransformerEncoder(d_model, d_ff, n_heads=1, n_layers=1, dropout=0.1)*
- `d_model`: dimension of each word vector
- `d_ff`: hidden dimension of feed forward layer
- `n_heads`: number of heads in self-attention (defaults to 1)
- `n_layers`: number of stacked layers of encoder (defaults to 1)
- `dropout`: dropout rate (defaults to 0.1)

*transformer_encoder.TransformerEncoder.forward(x, mask)*
- `x (~torch.FloatTensor)`: shape *(batch_size, max_seq_len, d_model)*
- `mask (~torch.ByteTensor)`: shape *(batch_size, max_seq_len)*

*transformer_encoder.utils.PositionalEncoding(d_model, dropout=0.1, max_len=5000)*
- `d_model`: same as TransformerEncoder
- `dropout`: dropout rate (defaults to 0.1)
- `max_len`: max sequence length (defaults to 5000)

*transformer_encoder.utils.PositionalEncoding.forward(x)*
- `x (~torch.FloatTensor)`: shape *(batch_size, max_seq_len, d_model)*
*transformer_encoder.utils.WarmupOptimizer(base_optimizer, d_model, scale_factor, warmup_steps)*
- `base_optimizer (~torch.optim.Optimizer)`: e.g. an Adam optimizer
- `d_model`: equals d_model in TransformerEncoder
- `scale_factor`: scale factor of learning rate
- `warmup_steps`: number of warm-up steps

## Installation
Requires `python 3.5+`, `pytorch 1.0.0+`
```
pip install transformer_encoder
```
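After installation, the pieces above combine into a minimal end-to-end sketch. The vocabulary size, padding id, toy batch, and placeholder loss are illustrative assumptions, not prescribed by the library:

```python
import torch
import torch.nn as nn
import torch.optim as optim

from transformer_encoder import TransformerEncoder
from transformer_encoder.utils import PositionalEncoding, WarmupOptimizer

d_model = 512

# Token embeddings followed by positional encoding (vocabulary size is an assumption)
input_layer = nn.Sequential(
    nn.Embedding(num_embeddings=10000, embedding_dim=d_model),
    PositionalEncoding(d_model=d_model, dropout=0.1, max_len=5000),
)
encoder = TransformerEncoder(d_model=d_model, d_ff=2048, n_heads=8, n_layers=6, dropout=0.1)

base_optimizer = optim.Adam(
    list(input_layer.parameters()) + list(encoder.parameters()), lr=1e-3
)
optimizer = WarmupOptimizer(base_optimizer, d_model=d_model, scale_factor=1, warmup_steps=100)

# Toy batch of token ids padded with 0 (illustrative assumption)
token_ids = torch.randint(1, 10000, (2, 7))
mask = (token_ids != 0)  # cast with .byte() if a ByteTensor is required

optimizer.zero_grad()
out = encoder(input_layer(token_ids), mask)  # (2, 7, 512)
loss = out.mean()                            # placeholder loss, for illustration only
loss.backward()
optimizer.step()
```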