https://github.com/mpezeshki/theano_tile_coding
A tile coder in theano for Reinforcement Learning tasks
deep-learning gpu-computing reinforcement-learning theano tile-coding
- Host: GitHub
- URL: https://github.com/mpezeshki/theano_tile_coding
- Owner: mpezeshki
- License: mit
- Created: 2017-03-02T16:25:52.000Z (about 8 years ago)
- Default Branch: master
- Last Pushed: 2017-04-21T03:42:14.000Z (about 8 years ago)
- Last Synced: 2025-03-26T17:57:13.586Z (about 2 months ago)
- Topics: deep-learning, gpu-computing, reinforcement-learning, theano, tile-coding
- Language: Python
- Size: 10.4 MB
- Stars: 3
- Watchers: 2
- Forks: 2
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Theano_Tile_Coding
A tile coder in Theano for Reinforcement Learning tasks.

# Experiments
## 2D surface approximation
As a simple toy task, we approximate the function $z = \sin(x) + \cos(y)$ over the range $[0, 2\pi]$ in both the x and y dimensions.
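
To make the toy task concrete, here is a minimal NumPy sketch of tile coding applied to the same function. It is not the repository's Theano implementation: the function names (`tile_indices`, `fit`, `predict`), the diagonal tiling offsets, the number of tilings and tiles, and the step size are all assumptions chosen for illustration.

```python
import numpy as np

def tile_indices(xy, n_tilings=8, n_tiles=8, low=0.0, high=2 * np.pi):
    """Return one active weight index per tiling for each 2D point in xy."""
    xy = np.atleast_2d(xy)
    # Rescale inputs so that one tile has width 1.
    scaled = (xy - low) / (high - low) * n_tiles
    # Assumption: tiling k is shifted diagonally by k / n_tilings of a tile.
    offsets = np.arange(n_tilings) / n_tilings
    shifted = scaled[:, None, :] + offsets[None, :, None]  # (N, n_tilings, 2)
    side = n_tiles + 1  # the shift can push a point into one extra cell
    cells = np.clip(np.floor(shifted).astype(int), 0, side - 1)
    flat = cells[..., 0] * side + cells[..., 1]            # (N, n_tilings)
    # Give every tiling its own block of weights.
    return flat + np.arange(n_tilings) * side * side

def predict(w, xy, n_tilings=8, n_tiles=8):
    """The prediction is the sum of the weights of the active tiles."""
    return w[tile_indices(xy, n_tilings, n_tiles)].sum(axis=1)

def fit(n_steps=20000, alpha=0.1, n_tilings=8, n_tiles=8, seed=0):
    """Fit z = sin(x) + cos(y) on [0, 2*pi]^2 with online LMS updates."""
    rng = np.random.default_rng(seed)
    w = np.zeros(n_tilings * (n_tiles + 1) ** 2)
    for _ in range(n_steps):
        p = rng.uniform(0.0, 2 * np.pi, size=2)
        idx = tile_indices(p, n_tilings, n_tiles)[0]
        target = np.sin(p[0]) + np.cos(p[1])
        error = target - w[idx].sum()
        # Divide the step size by the number of active features.
        w[idx] += alpha / n_tilings * error
    return w

if __name__ == "__main__":
    w = fit()
    grid = np.linspace(0.0, 2 * np.pi, 25)
    X, Y = np.meshgrid(grid, grid)
    pts = np.stack([X.ravel(), Y.ravel()], axis=1)
    err = np.abs(predict(w, pts) - (np.sin(pts[:, 0]) + np.cos(pts[:, 1])))
    print("mean absolute error on the grid:", err.mean())
```

Because each tiling activates exactly one tile per input, a prediction touches only `n_tilings` weights, which keeps each update cheap regardless of the total number of tiles.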
## Motion modeling
We test our framework on an adapted version of the Horde model. Horde is a multi-timescale future-prediction model for an RL robot that uses thousands of parallel learners to approximate many different outputs at the same time. The task of continually predicting the future at different time scales is called "Nexting". We adapted this model to human motion modeling.

We used a single walking sequence from the MIT MoCap dataset and trained our adapted version of Horde on it, using our Theano-based implementation of tile coding. The data consists of ~3000 time steps, and each time step gives the positions of 17 body joints. All body joints are represented in a 49-dimensional space in which each dimension is normalized to lie in $[0, 1]$. The task is to predict the future of each joint over the next 10 time steps, given the current position of all joints. The feature space is tile-coded into 30 tilings with 50 tiles each. An example of a trained model's predictions for one of the joints is shown in the figure below.
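
One way to read the prediction task is as a supervised pairing of each frame with the 10 frames that follow it. The sketch below builds such pairs with NumPy. It is an assumption about the data layout, not the repository's code: `make_nexting_pairs` is a hypothetical helper, and the random array only stands in for the MoCap sequence, which is not reproduced here.

```python
import numpy as np

def make_nexting_pairs(seq, horizon=10):
    """Pair each frame with the `horizon` frames that follow it.

    seq has shape (T, D). Returns inputs of shape (T - horizon, D) and
    targets of shape (T - horizon, horizon, D), where
    targets[t, h - 1] == seq[t + h] for h = 1..horizon.
    """
    n = seq.shape[0] - horizon
    inputs = seq[:n]
    targets = np.stack([seq[h:h + n] for h in range(1, horizon + 1)], axis=1)
    return inputs, targets

if __name__ == "__main__":
    # Placeholder data: 3000 frames of 49 values in [0, 1).
    rng = np.random.default_rng(0)
    seq = rng.random((3000, 49))
    inputs, targets = make_nexting_pairs(seq, horizon=10)
    print(inputs.shape, targets.shape)
```

Each learner would then regress one slice of `targets` from the tile-coded `inputs`. How the original model distributes those outputs across learners is not specified here.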
