https://github.com/mpezeshki/theano_tile_coding

A tile coder in theano for Reinforcement Learning tasks

deep-learning gpu-computing reinforcement-learning theano tile-coding

Host: GitHub
URL: https://github.com/mpezeshki/theano_tile_coding
Owner: mpezeshki
License: mit
Created: 2017-03-02T16:25:52.000Z (about 8 years ago)
Default Branch: master
Last Pushed: 2017-04-21T03:42:14.000Z (about 8 years ago)
Last Synced: 2025-03-26T17:57:13.586Z (about 2 months ago)
Topics: deep-learning, gpu-computing, reinforcement-learning, theano, tile-coding
Language: Python
Size: 10.4 MB
Stars: 3
Watchers: 2
Forks: 2
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

README

# Theano_Tile_Coding
A tile coder in theano for Reinforcement Learning tasks

# Experiments
## 2D surface approximation
As a simple toy task, we approximate the function $z = \sin(x) + \cos(y)$ over the range $[0, 2\pi]$ in both the $x$ and $y$ dimensions.

![](https://github.com/mohammadpz/Theano_Tile_Coding/blob/master/files/2d_example.gif)
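
The toy task above can be sketched without Theano to show what a tile coder does. The following is a minimal pure-Python illustration, not the repository's implementation: the class name `TileCoder2D`, the helper functions, the uniform diagonal offsets between tilings, and all sizes and step sizes are assumptions chosen for readability.

```python
import math
import random


class TileCoder2D:
    """Minimal 2D tile coder: several offset grids over a square range."""

    def __init__(self, n_tilings, n_tiles, low, high):
        self.n_tilings = n_tilings
        self.low = low
        self.width = (high - low) / n_tiles      # tile width per dimension
        # Each tiling gets one extra row and column so that shifted grids
        # still cover the whole range.
        self.side = n_tiles + 1
        self.weights = [0.0] * (n_tilings * self.side * self.side)

    def active_tiles(self, x, y):
        """Return one active weight index per tiling for the point (x, y)."""
        indices = []
        for t in range(self.n_tilings):
            offset = self.width * t / self.n_tilings   # assumed uniform shift
            ix = min(int((x - self.low + offset) / self.width), self.side - 1)
            iy = min(int((y - self.low + offset) / self.width), self.side - 1)
            indices.append((t * self.side + ix) * self.side + iy)
        return indices

    def predict(self, x, y):
        # The prediction is the sum of the weights of the active tiles.
        return sum(self.weights[i] for i in self.active_tiles(x, y))

    def update(self, x, y, target, step_size):
        # LMS update: spread the error evenly over the active tiles.
        tiles = self.active_tiles(x, y)
        error = target - sum(self.weights[i] for i in tiles)
        for i in tiles:
            self.weights[i] += step_size / self.n_tilings * error
        return error


def train(n_samples=20000, seed=0):
    """Fit z = sin(x) + cos(y) on [0, 2*pi]^2 from random samples."""
    rng = random.Random(seed)
    coder = TileCoder2D(n_tilings=8, n_tiles=8, low=0.0, high=2 * math.pi)
    for _ in range(n_samples):
        x = rng.uniform(0.0, 2 * math.pi)
        y = rng.uniform(0.0, 2 * math.pi)
        coder.update(x, y, math.sin(x) + math.cos(y), step_size=0.2)
    return coder


def mean_abs_error(coder, grid=25):
    """Average absolute error on a grid of interior points."""
    total = 0.0
    for i in range(grid):
        for j in range(grid):
            x = 2 * math.pi * (i + 0.5) / grid
            y = 2 * math.pi * (j + 0.5) / grid
            total += abs(coder.predict(x, y) - (math.sin(x) + math.cos(y)))
    return total / grid ** 2
```

Calling `train()` and then `mean_abs_error(coder)` gives a rough measure of fit. Because each point activates exactly one tile per tiling, both prediction and update touch only `n_tilings` weights, which is what makes this representation cheap to learn with.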

## Motion modeling
We test our framework on an adapted version of the Horde model. Horde is a multi-timescale future-prediction model for an RL robot that uses thousands of parallel learners to approximate many different outputs at the same time. The task of continually predicting the future at different time scales is called "Nexting". We adapted this model to human motion: we trained our adapted version of Horde, armed with our Theano-based implementation of Tile Coding, on a single walking sequence from the MIT MoCap dataset. The data consists of ~3000 time-steps, and each time-step gives the positions of 17 body joints. All body joints are represented in a 49-dimensional space where each dimension is normalized to be in $[0, 1]$. The task is to predict the future of each joint over the next 10 time-steps given the current position of all joints. The feature space is tile-coded into 30 tilings with 50 tiles each. An example of a trained model's predictions for one of the joints is shown in the figure below.

![](https://github.com/mohammadpz/Theano_Tile_Coding/blob/master/files/description.png)

![](https://github.com/mohammadpz/Theano_Tile_Coding/blob/master/files/final.gif)
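
The idea of many parallel learners sharing one tile-coded feature vector can be sketched as follows. This is a simplified pure-Python illustration under stated assumptions, not the repository's Theano code and not the original Horde algorithm: it replaces temporal-difference learning with a plain supervised LMS update toward the frame observed `k` steps later, it tile-codes each input dimension independently with 1D tilings, and it runs on a synthetic two-dimensional sequence rather than the MoCap data. All names (`tile_indices`, `ParallelPredictors`, `make_toy_sequence`) and the default sizes are hypothetical.

```python
import math


def tile_indices(frame, n_tilings, n_tiles):
    """Active feature indices for a frame whose entries lie in [0, 1].

    Assumption: every input dimension gets its own set of 1D tilings.
    """
    side = n_tiles + 1                      # extra tile absorbs the offset
    active = []
    for d, v in enumerate(frame):
        for t in range(n_tilings):
            offset = t / (n_tilings * n_tiles)
            i = min(int((v + offset) * n_tiles), side - 1)
            active.append((d * n_tilings + t) * side + i)
    return active


class ParallelPredictors:
    """One linear learner per (dimension, horizon step), shared features."""

    def __init__(self, n_dims, horizon, n_tilings, n_tiles):
        self.horizon, self.n_tilings, self.n_tiles = horizon, n_tilings, n_tiles
        n_features = n_dims * n_tilings * (n_tiles + 1)
        self.weights = [[[0.0] * n_features for _ in range(horizon)]
                        for _ in range(n_dims)]

    def predict(self, frame):
        """Return preds[j][k]: dimension j, predicted k + 1 steps ahead."""
        active = tile_indices(frame, self.n_tilings, self.n_tiles)
        return [[sum(w[i] for i in active) for w in per_dim]
                for per_dim in self.weights]

    def update(self, frame, future_frames, step_size):
        """future_frames[k] is the frame observed k + 1 steps after `frame`."""
        active = tile_indices(frame, self.n_tilings, self.n_tiles)
        alpha = step_size / len(active)
        for j, per_dim in enumerate(self.weights):
            for k, w in enumerate(per_dim):
                error = future_frames[k][j] - sum(w[i] for i in active)
                for i in active:
                    w[i] += alpha * error


def make_toy_sequence(length):
    """Synthetic periodic stand-in for a motion sequence, scaled to [0, 1]."""
    return [[0.5 + 0.4 * math.sin(0.1 * t), 0.5 + 0.4 * math.cos(0.1 * t)]
            for t in range(length)]


def train_on_sequence(sequence, horizon=10, n_tilings=4, n_tiles=10,
                      epochs=10, step_size=0.5):
    model = ParallelPredictors(len(sequence[0]), horizon, n_tilings, n_tiles)
    for _ in range(epochs):
        for t in range(len(sequence) - horizon):
            model.update(sequence[t], sequence[t + 1:t + 1 + horizon], step_size)
    return model


def mean_abs_error(model, sequence):
    """Average absolute prediction error over all dimensions and horizons."""
    total, count = 0.0, 0
    for t in range(len(sequence) - model.horizon):
        for j, per_dim in enumerate(model.predict(sequence[t])):
            for k, p in enumerate(per_dim):
                total += abs(p - sequence[t + 1 + k][j])
                count += 1
    return total / count
```

The sizes here are deliberately small so that the pure-Python loops finish quickly. The settings described above (49 input dimensions, a 10-step horizon, 30 tilings) could be passed as arguments, but at that scale a vectorized implementation such as the Theano one is the practical choice, since every learner reuses the same set of active features.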
