https://github.com/anantzoid/language-modeling-gatedcnn

Tensorflow implementation of "Language Modeling with Gated Convolutional Networks"

Host: GitHub
URL: https://github.com/anantzoid/language-modeling-gatedcnn
Owner: anantzoid
Created: 2017-01-11T12:59:12.000Z (over 9 years ago)
Default Branch: master
Last Pushed: 2017-01-16T04:32:21.000Z (over 9 years ago)
Last Synced: 2025-03-24T09:36:58.524Z (about 1 year ago)
Language: Python
Size: 559 KB
Stars: 271
Watchers: 16
Forks: 98
Open Issues: 8
Metadata Files:
- Readme: README.md

README

# Language Modeling with Gated Convolutional Networks

This is a TensorFlow implementation of the Facebook AI Research paper [Language Modeling with Gated Convolutional Networks](https://arxiv.org/abs/1612.08083). The paper replaces the recurrent layers commonly used for language modelling with stacked convolutions whose outputs are modulated by a learned gate (the Gated-CNN model).

## Architecture
![Architecture](assets/architecture.png)
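
The gating idea at the core of the architecture can be illustrated without any framework. The sketch below is not the code in this repository: it is a minimal pure-Python version of the paper's gated linear unit, `h(X) = (X*W + b) * sigmoid(X*V + c)`, applied to a scalar sequence. The function names (`causal_conv1d`, `gated_linear_unit`) and the single-channel setup are assumptions made for illustration; the real model operates on word embeddings with many channels per layer.

```python
import math


def sigmoid(x):
    """Logistic function, used as the gate activation."""
    return 1.0 / (1.0 + math.exp(-x))


def causal_conv1d(seq, kernel, bias):
    """1-D convolution over a scalar sequence that never looks ahead.

    The input is left-padded with len(kernel) - 1 zeros, so output[t]
    depends only on seq[0..t].
    """
    k = len(kernel)
    padded = [0.0] * (k - 1) + list(seq)
    return [
        sum(kernel[j] * padded[t + j] for j in range(k)) + bias
        for t in range(len(seq))
    ]


def gated_linear_unit(seq, w, b, v, c):
    """Element-wise product of a linear convolution and a sigmoid gate."""
    linear = causal_conv1d(seq, w, b)  # X*W + b
    gate = causal_conv1d(seq, v, c)    # X*V + c
    return [a * sigmoid(g) for a, g in zip(linear, gate)]


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0]
    # A zero gate kernel gives sigmoid(0) = 0.5, so every output is halved.
    print(gated_linear_unit(x, w=[0.5, 1.0], b=0.0, v=[0.0, 0.0], c=0.0))
```

The left padding is what keeps the model usable as a language model: the prediction at position `t` cannot see tokens after `t`.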

## Requirements
- Download the [Google 1 Billion Word dataset](http://www.statmt.org/lm-benchmark/1-billion-word-language-modeling-benchmark-r13output.tar.gz) and extract it into the `data` folder.

- [TensorFlow 0.12.1](https://www.tensorflow.org/)
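
If you prefer to script the dataset step, something along these lines should work. It is a sketch using only the Python standard library, not a script shipped with this repository; the helper names `download` and `extract` and the local archive name are hypothetical. The archive is large, so the download is skipped when the file already exists. Check `main.py` for the exact directory layout it expects under `data`.

```python
import tarfile
import urllib.request
from pathlib import Path

# URL taken from the Requirements list above.
DATASET_URL = (
    "http://www.statmt.org/lm-benchmark/"
    "1-billion-word-language-modeling-benchmark-r13output.tar.gz"
)


def download(url, archive_path):
    """Fetch url to archive_path unless the file is already present."""
    archive_path = Path(archive_path)
    if not archive_path.exists():
        archive_path.parent.mkdir(parents=True, exist_ok=True)
        urllib.request.urlretrieve(url, str(archive_path))
    return archive_path


def extract(archive_path, dest_dir="data"):
    """Unpack a .tar.gz archive into dest_dir and return that directory."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(str(archive_path), "r:gz") as tar:
        tar.extractall(str(dest))
    return dest


# Example (hypothetical local file name, not run automatically):
#   archive = download(DATASET_URL, "1-billion-word.tar.gz")
#   extract(archive, "data")
```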

## Usage
To train the model with the default hyperparameters and monitor the run in TensorBoard:

```
$ python main.py
$ tensorboard --logdir=logs --host=0.0.0.0
```
Check `main.py` for tunable hyperparameter flags.

## TODO
- [ ] Replace NCE loss with Adaptive Softmax.
- [ ] Remove the restriction to fixed-size sentences (length 20, for now) and extend training to sentences of varying lengths.
- [ ] Implement Weight Normalisation for faster convergence.
- [ ] Train deeper models extensively to match the results reported in the paper.
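
For reference, the weight normalisation item above refers to reparameterising each weight vector as a direction `v` and a scalar gain `g`, with `w = g * v / ||v||`. The snippet below is only a framework-free sketch of that formula, not an implementation for this repository; the function name `weight_norm` is hypothetical, and in a real model `v` and `g` would be trainable variables.

```python
import math


def weight_norm(v, g):
    """Return w = g * v / ||v|| for one weight vector v and scalar gain g."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("direction vector v must be non-zero")
    return [g * x / norm for x in v]


if __name__ == "__main__":
    # The norm of the result equals g regardless of the scale of v.
    print(weight_norm([3.0, 4.0], g=2.0))
```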
