https://github.com/hughperkins/char-lstm
char-rnn, tweaked to use Element-Research rnn modules
- Host: GitHub
- URL: https://github.com/hughperkins/char-lstm
- Owner: hughperkins
- License: bsd-2-clause
- Created: 2015-12-29T12:49:47.000Z (almost 10 years ago)
- Default Branch: master
- Last Pushed: 2016-01-22T16:05:10.000Z (over 9 years ago)
- Last Synced: 2025-05-13T02:09:14.666Z (5 months ago)
- Language: Lua
- Size: 472 KB
- Stars: 9
- Watchers: 5
- Forks: 2
- Open Issues: 2
- Metadata Files:
- Readme: README.md
- License: LICENSE
README
# char-lstm
char-rnn, tweaked to use Element-Research rnn modules

What this does, and the way it works, is closely based on Karpathy's https://github.com/karpathy/char-rnn, but it is tweaked to use Element Research's https://github.com/element-research/rnn modules instead.
## Status
Draft, not yet fully working
Update:
- both training and sampling are implemented now, but there seems to be some critical bug in training. I'm working on this :-)

## Differences from original char-rnn
* uses Element Research's rnn modules
* weights are stored as FloatTensors, rather than CudaTensors etc.
* can train using any of cuda/cl/cpu, and sample using the same backend or a different one, up to you (see the backend sketch further down)
* the sequences used to train each epoch are offset by one character from the previous epoch, which hopefully will improve generalization (see the sketch after this list)
* each thread is exposed to the entire training set, rather than a 1/batchSize portion of it, which hopefully means really large batch sizes can be used, for speed of execution
## Does it support CUDA and OpenCL?
* of course it supports OpenCL :-)
* and it supports CUDA :-)
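To make the FloatTensor storage and the cuda/cl/cpu choice concrete, here is a rough sketch (the function name is mine, not necessarily how the repo does it): checkpoints stay as FloatTensors on the CPU, and the network is cast to whichever backend is requested at train or sample time.

```lua
require 'nn'

-- Sketch only; the repo's actual plumbing may differ. Weights live as
-- FloatTensors in the saved checkpoint, and the net is cast to the chosen
-- backend when training or sampling starts.
local function castToBackend(net, backend)
  if backend == 'cuda' then
    require 'cutorch'
    require 'cunn'
    return net:cuda()     -- parameters become CudaTensors
  elseif backend == 'cl' then
    require 'cltorch'
    require 'clnn'
    return net:cl()       -- parameters become ClTensors
  else
    return net:float()    -- plain FloatTensors on the CPU
  end
end

local net = nn.Sequential():add(nn.Linear(4, 4))  -- stand-in for the real char-LSTM
net = castToBackend(net, 'cpu')

-- Casting back to float before saving keeps the checkpoint backend-agnostic,
-- so a model trained on CUDA can be sampled on OpenCL or CPU:
-- torch.save('out/weights.t7', castToBackend(net, 'cpu'))
```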
## To do
* currently, 'forget' is called before each sequence. It should not be (see the sketch after this list)
* implement sampling
* add command-line options
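For context on the first item, a hypothetical fragment (the sizes and variable names are mine, not the repo's): with the Element-Research rnn modules, `forget()` resets the recurrent hidden state, so calling it before every sequence throws away state that could usefully carry across contiguous text.

```lua
require 'rnn'

-- Hypothetical fragment, not the repo's training loop: sizes and names are
-- illustrative. forget() clears the LSTM's hidden state; calling it before
-- every sequence (the current behaviour) discards state that could carry
-- over when consecutive sequences are contiguous text.
local lstm = nn.Sequencer(nn.LSTM(65, 128))  -- e.g. 65-dim one-hot chars -> 128 hidden units

-- two dummy 3-step sequences, just to exercise the loop
local sequences = {
  { torch.randn(65), torch.randn(65), torch.randn(65) },
  { torch.randn(65), torch.randn(65), torch.randn(65) },
}

for _, seq in ipairs(sequences) do
  lstm:forget()                     -- current behaviour: hidden state reset every sequence
  local outputs = lstm:forward(seq) -- table with one output tensor per timestep
  -- ... compute loss, backward, update ...
end
```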
## How to use

### Training
```
th train.lua
```

### Sampling
eg:
```
th sample.lua out/weights_tinyshakespeare_1_501.t7
```

## Naming
If you can think of a better name, please raise an issue to suggest it :-)
## License
Original char-rnn code is MIT-licensed. New code in this repo is BSD-2-Clause.