https://github.com/carpedm20/lstm-char-cnn-tensorflow
in progress
- Host: GitHub
- URL: https://github.com/carpedm20/lstm-char-cnn-tensorflow
- Owner: carpedm20
- License: mit
- Created: 2015-12-10T16:31:03.000Z (over 9 years ago)
- Default Branch: master
- Last Pushed: 2018-07-27T11:04:42.000Z (over 6 years ago)
- Last Synced: 2025-03-28T12:05:24.173Z (17 days ago)
- Topics: cnn, lstm, nlp, tensorflow
- Language: Python
- Homepage:
- Size: 8.76 MB
- Stars: 768
- Watchers: 60
- Forks: 241
- Open Issues: 17
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-tensorflow - Character-Aware Neural Language Models - TensorFlow implementation of [Character-Aware Neural Language Models](http://arxiv.org/abs/1508.06615) (Models/Projects)
- fucking-awesome-tensorflow - Character-Aware Neural Language Models - TensorFlow implementation of [Character-Aware Neural Language Models](http://arxiv.org/abs/1508.06615) (Models/Projects)
- Awesome-TensorFlow-Chinese - Character-Aware Neural Language Models - TensorFlow implementation of [Character-Aware Neural Language Models](http://arxiv.org/abs/1508.06615) (Model Projects / WeChat group)
README
Character-Aware Neural Language Models
======================================

TensorFlow implementation of [Character-Aware Neural Language Models](http://arxiv.org/abs/1508.06615). The original code by the author can be found [here](https://github.com/yoonkim/lstm-char-cnn).

This implementation contains:
1. Word-level and Character-level Convolutional Neural Network
2. Highway Network
3. Recurrent Neural Network Language Model

*The current implementation has a performance issue. See [#3](https://github.com/carpedm20/lstm-char-cnn-tensorflow/issues/3).*
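For intuition, the highway network (item 2) can be sketched as below. This is a hypothetical NumPy illustration of the general highway-layer formula, not code from this repository (the repo's actual implementation is in TensorFlow):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def highway_layer(y, W_t, b_t, W_h, b_h):
    """One highway layer: t * relu(W_h y + b_h) + (1 - t) * y,
    where t = sigmoid(W_t y + b_t) is the transform gate."""
    t = sigmoid(y @ W_t + b_t)          # transform gate in (0, 1)
    h = np.maximum(0.0, y @ W_h + b_h)  # candidate transformation (ReLU)
    return t * h + (1.0 - t) * y        # carry the input through with weight (1 - t)

# Toy usage: input and output dimensions match, so layers can stack.
rng = np.random.default_rng(0)
d = 8
y = rng.standard_normal(d)
out = highway_layer(y, rng.standard_normal((d, d)), -2.0 * np.ones(d),
                    rng.standard_normal((d, d)), np.zeros(d))
assert out.shape == y.shape
```

A negative transform-gate bias (here -2.0) is a common initialization trick: it biases the layer toward carrying the input through unchanged early in training.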
Prerequisites
-------------

- Python 2.7 or Python 3.3+
- [TensorFlow](https://www.tensorflow.org/)

Usage
-----

To train a model with the `ptb` dataset:
$ python main.py --dataset ptb
To test an existing model:
$ python main.py --dataset ptb --forward_only True
To see all training options, run:
$ python main.py --help
which will print
usage: main.py [-h] [--epoch EPOCH] [--word_embed_dim WORD_EMBED_DIM]
[--char_embed_dim CHAR_EMBED_DIM]
[--max_word_length MAX_WORD_LENGTH] [--batch_size BATCH_SIZE]
[--seq_length SEQ_LENGTH] [--learning_rate LEARNING_RATE]
[--decay DECAY] [--dropout_prob DROPOUT_PROB]
[--feature_maps FEATURE_MAPS] [--kernels KERNELS]
[--model MODEL] [--data_dir DATA_DIR] [--dataset DATASET]
[--checkpoint_dir CHECKPOINT_DIR]
[--forward_only [FORWARD_ONLY]] [--noforward_only]
[--use_char [USE_CHAR]] [--nouse_char] [--use_word [USE_WORD]]
[--nouse_word]

optional arguments:
-h, --help show this help message and exit
--epoch EPOCH Epoch to train [25]
--word_embed_dim WORD_EMBED_DIM
The dimension of word embedding matrix [650]
--char_embed_dim CHAR_EMBED_DIM
The dimension of char embedding matrix [15]
--max_word_length MAX_WORD_LENGTH
The maximum length of word [65]
--batch_size BATCH_SIZE
The size of batch images [100]
--seq_length SEQ_LENGTH
The # of timesteps to unroll for [35]
--learning_rate LEARNING_RATE
Learning rate [1.0]
--decay DECAY Decay of SGD [0.5]
--dropout_prob DROPOUT_PROB
Probability of dropout layer [0.5]
--feature_maps FEATURE_MAPS
The # of feature maps in CNN
[50,100,150,200,200,200,200]
--kernels KERNELS The width of CNN kernels [1,2,3,4,5,6,7]
--model MODEL The type of model to train and test [LSTM, LSTMTDNN]
--data_dir DATA_DIR The name of data directory [data]
--dataset DATASET The name of dataset [ptb]
--checkpoint_dir CHECKPOINT_DIR
Directory name to save the checkpoints [checkpoint]
--forward_only [FORWARD_ONLY]
True for forward only, False for training [False]
--noforward_only
--use_char [USE_CHAR]
Use character-level language model [True]
--nouse_char
--use_word [USE_WORD]
Use word-level language model [False]
--nouse_word

More options can be found in [models/LSTMTDNN](./models/LSTMTDNN.py) and [models/TDNN](./models/TDNN.py).
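To see how `--kernels` and `--feature_maps` interact: each kernel width gets the corresponding number of filters, the CNN max-pools each filter's responses over time, and the pooled values are concatenated into one feature vector per word. A toy NumPy sketch of this idea (not this repository's code; filter weights here are random for illustration):

```python
import numpy as np

def char_cnn_features(C, kernels, feature_maps, rng):
    """Toy character-level CNN. C is a (word_length, char_embed_dim)
    matrix of character embeddings. For each kernel width w, apply
    feature_maps[i] random filters, max-pool over time, and
    concatenate the pooled outputs."""
    L, d = C.shape
    pooled = []
    for w, n in zip(kernels, feature_maps):
        filters = rng.standard_normal((n, w, d))
        # Convolve over time: one response per valid window position.
        responses = np.array([
            [np.sum(filters[f] * C[i:i + w]) for i in range(L - w + 1)]
            for f in range(n)
        ])
        pooled.append(responses.max(axis=1))  # max-over-time pooling
    return np.concatenate(pooled)

rng = np.random.default_rng(0)
C = rng.standard_normal((10, 15))  # a 10-character word, 15-dim char embeddings
feats = char_cnn_features(C, [1, 2, 3], [50, 100, 150], rng)
print(feats.shape)  # (300,) -- the sum of the feature map counts
```

With the default flags (`--kernels 1,...,7`, `--feature_maps 50,100,150,200,200,200,200`) the per-word feature vector has dimension 50+100+150+200+200+200+200 = 1100, which then feeds the highway network.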
Performance
-----------

**Failed to reproduce the results of the paper (2016.02.12).** If you are looking for code that reproduces the paper's results, see https://github.com/mkroutikov/tf-lstm-char-cnn.
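Perplexity is the exponential of the average per-word negative log-likelihood, so it can be recovered from an accumulated test loss. A minimal sketch (the function name and the example numbers are illustrative, not from this repo):

```python
import math

def perplexity(total_nll, num_words):
    """Perplexity = exp(average negative log-likelihood per word, in nats)."""
    return math.exp(total_nll / num_words)

# Illustration: an average NLL of ~4.525 nats/word corresponds to a
# perplexity of ~92.3, the paper's small-model PTB result.
print(perplexity(4.525 * 1000, 1000))
```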

Perplexity on the Penn Treebank (PTB) test set:
| Name | Character embed | LSTM hidden units | Paper (Y Kim 2016) | This repo. |
|:---------------:|:---------------:|:-----------------:|:------------------:|:-----------:|
| LSTM-Char-Small | 15 | 100 | 92.3 | in progress |
| LSTM-Char-Large | 15 | 150 | 78.9 | in progress |

Author
------

Taehoon Kim / [@carpedm20](http://carpedm20.github.io/)