https://github.com/openai/neural-gpu

Code for the Neural GPU model originally described in "Neural GPUs Learn Algorithms"
https://github.com/openai/neural-gpu

paper

Last synced: 4 months ago
JSON representation

Code for the Neural GPU model originally described in "Neural GPUs Learn Algorithms"

Host: GitHub
URL: https://github.com/openai/neural-gpu
Owner: openai
Created: 2016-07-06T01:38:45.000Z (almost 9 years ago)
Default Branch: master
Last Pushed: 2018-11-22T00:38:44.000Z (over 6 years ago)
Last Synced: 2025-01-29T19:46:24.861Z (4 months ago)
Topics: paper
Language: Python
Homepage: http://arxiv.org/abs/1511.08228
Size: 242 KB
Stars: 137
Watchers: 165
Forks: 67
Open Issues: 2
Metadata Files:
- Readme: README.md

Awesome Lists containing this project

README

        **Status:** Archive (code is provided as-is, no updates expected)

Code for the Neural GPU model originally described in

[[http://arxiv.org/abs/1511.08228]].

Running experiments

===================

Running one instance

--------------------

The following would use 256 filters to train on binary multiplication,

then 4-ary, then decimal:

```

python neural_gpu_trainer.py --nmaps=256 --task=bmul,qmul,mul --progressive_curriculum=5

```

My typical invocation is something like

```

  CUDA_VISIBLE_DEVICES=0 python neural_gpu_trainer.py --random_seed=0 --max_steps=200000 --forward_max=201 --nmaps=256 --task=bmul,qmul,mul --time_till_eval=4 --progressive_curriculum=5 --train_dir=../logs/August-12-curriculum/forward_max=201-nmaps=256-task=bmul,qmul,mul-progressive_curriculum=5-random_seed=0

```

The tests on decimal carry were done using invocations like the following:

```

  CUDA_VISIBLE_DEVICES=0 neural_gpu_trainer.py --train_dir=../logs/run1 --random_seed=1 --max_steps=100000 --forward_max=201 --nmaps=128 --task=add --time_till_eval=4 --time_till_ckpt=1

```

You can find a list of options, and their default values, in `neuralgpu/trainer.py`.

Examining results

=================

Loading and examining a model

-----------------------------

`examples/examples_for_loading_model.py` gives a simple instance of loading a

model and running it on an instance.

Plotting results

----------------

Something like `python plots/get_pretty_score.py cachedlogs/*/*task=bmul,qmul,mul-*` works.  There are a lot of options to make it prettier (renaming stuff, removing some runs, changing titles, reordering, etc.).  For example, one of my plots was made with

```

python get_pretty_score.py cachedlogs/A*/*256*[=,]mul-* --titles '256 filters|' --title 'Decimal multiplication is easier with curriculum' --task mul --remove_strings='|-progressive_curriculum=5' --exclude='layer|progressive' --order '4,2,1,3' --global-legend=1

```

Requirements

============

* TensorFlow (see tensorflow.org for how to install)

* Matplotlib for Python (sudo apt-get install python-matplotlib)

* joblib

Credits

=======

Original code by Lukasz Kaiser (lukaszkaiser).  Modified by Eric Price

(ecprice)

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/openai/neural-gpu

Awesome Lists containing this project

README