https://github.com/phylliade/vinci
A generic, easy to use, and keras-compatible deep RL framework
- Host: GitHub
- URL: https://github.com/phylliade/vinci
- Owner: Phylliade
- Created: 2017-08-07T15:07:55.000Z (over 7 years ago)
- Default Branch: master
- Last Pushed: 2017-11-21T16:17:54.000Z (about 7 years ago)
- Last Synced: 2024-12-21T07:41:51.273Z (about 1 month ago)
- Topics: ddpg, deep-reinforcement-learning, keras, openai-gym, tensorflow, visualization
- Language: Python
- Homepage:
- Size: 8.85 MB
- Stars: 6
- Watchers: 4
- Forks: 2
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Vinci
[![Build Status](https://travis-ci.org/Phylliade/vinci.svg?branch=master)](https://travis-ci.org/Phylliade/vinci)
[![Documentation Status](http://readthedocs.org/projects/vinci/badge/?version=latest)](http://vinci.readthedocs.io/en/latest/?badge=latest)
[![PyPI](https://img.shields.io/pypi/v/vinci.svg)](https://pypi.python.org/pypi/vinci/)
[![Python versions](https://img.shields.io/pypi/pyversions/vinci/.svg)](https://pypi.python.org/pypi/vinci/)

This is a generic, easy-to-use, and Keras-compatible deep RL framework.
It began as a fork of [keras-rl](https://github.com/matthiasplappert/keras-rl) but is now a separate project.
# Features
* Define your Deep Nets using Keras
* Simulate on the OpenAI Gym Environments
* Easy to implement a new algorithm, using a well-defined API
* Advanced training capabilities: Offline training, critic-only (or actor-only) training...
* Easy logging: TensorBoard, terminal...

Here's an example of the evolution of the policy during learning on the ContinuousMountainCar environment:
![](assets/animation.gif)
# Documentation
Online documentation is available at: http://vinci.readthedocs.io/en/latest/
# Installation
Run:
```
pip install git+ssh://git@github.com/Phylliade/vinci.git
```

# Creating the Deep Networks with Keras
Vinci is designed to work seamlessly with Keras networks.
You can design your networks as usual, using the Sequential or Functional API.

## Environment-agnostic networks
Vinci also adds some utilities to make network creation **environment agnostic**, so that a network definition does not have to hard-code the dimensions of one particular environment.

To do this, the `env` object (a wrapper around a gym env, of type `rl.EnvWrapper`) provides different utilities, depending on whether you're using the Sequential or the Functional API.
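To make the idea concrete, here is a minimal, dependency-free sketch of what "environment agnostic" buys you. `ToyEnvWrapper` and `actor_layer_shapes` are hypothetical names invented for this illustration; they are not Vinci's `rl.EnvWrapper`, and the way the dimensions are computed is an assumption. Only the attribute names `state_space_dim` and `action_space_dim` come from this README.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ToyEnvWrapper:
    """Hypothetical stand-in for an env wrapper (NOT Vinci's rl.EnvWrapper)."""

    state_shape: Tuple[int, ...]
    action_shape: Tuple[int, ...]

    @staticmethod
    def _flat_dim(shape: Tuple[int, ...]) -> int:
        # Product of all axes; an empty shape () counts as a scalar of size 1.
        dim = 1
        for axis in shape:
            dim *= axis
        return dim

    @property
    def state_space_dim(self) -> int:
        return self._flat_dim(self.state_shape)

    @property
    def action_space_dim(self) -> int:
        return self._flat_dim(self.action_shape)


def actor_layer_shapes(env: ToyEnvWrapper, hidden: List[int]) -> List[Tuple[int, int]]:
    """Return (inputs, outputs) for each dense layer of an actor sized from the env."""
    sizes = [env.state_space_dim, *hidden, env.action_space_dim]
    return list(zip(sizes[:-1], sizes[1:]))


# Shapes below are assumed for illustration only.
mountain_car = ToyEnvWrapper(state_shape=(2,), action_shape=(1,))
pendulum = ToyEnvWrapper(state_shape=(3,), action_shape=(1,))

# The same network description adapts to each environment.
print(actor_layer_shapes(mountain_car, [400, 300]))
print(actor_layer_shapes(pendulum, [400, 300]))
```

The point of the sketch is only that the input and output sizes are read from the wrapper, so the hidden-layer description can be reused unchanged across environments.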
### Using the Functional API
You just have to design your Keras model using the Functional API and the `state` and `action` placeholders of the `env` object.

For example, for a simple critic:
```python
from keras.layers import Activation, Dense, concatenate
from keras.models import Model

# Inputs: placeholders of `env`, assumed to be an existing `rl.EnvWrapper`
observation = env.state
action = env.action

# Concatenate the inputs for the critic
inputs = concatenate([observation, action])

# Hidden layer
x = Dense(100)(inputs)
x = Activation('relu')(x)

# Output layer
x = Dense(1)(x)
x = Activation('linear')(x)

# Final model
critic = Model(inputs=[observation, action], outputs=[x])
```

### Using the Sequential API
Since you have to specify the input shapes by hand, you can use the `state_space_dim` and `action_space_dim` attributes of the `EnvWrapper`.

For example, for an actor:
```python
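from keras.layers import Activation, Dense
from keras.models import Sequential

actor = Sequential()  # `env` below is assumed to be an existing EnvWrapper
# Hidden layers
actor.add(Dense(400, input_shape=(env.state_space_dim,)))
actor.add(Activation("relu"))
actor.add(Dense(300))
actor.add(Activation("relu"))

# Output layer
actor.add(Dense(env.action_space_dim, activation="tanh"))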
```

## Efficiency of Keras models
Internally, Keras models are used in a functional fashion:

```
out = keras_model(in)
```

Some may wonder whether this usage can leak memory, and they're right to ask!
With an ordinary function, each call to `keras_model(in)` would create a new `Tensor` (along with all of its underlying ops) and add it to the graph.

However, Keras caches these computations, so calling `keras_model(in)` again with the same input returns the same tensor.
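
The caching idea can be sketched in plain Python. This is a toy illustration only: `CachedModel` and `Node` are hypothetical names, and keying the cache on the identity of the input object is an assumption made for the sketch, not a description of Keras's actual implementation.

```python
class Node:
    """Toy stand-in for a graph tensor (hypothetical, not a Keras type)."""

    def __init__(self, name):
        self.name = name


class CachedModel:
    """Toy callable that memoizes its output per input object."""

    def __init__(self, build_fn):
        self._build_fn = build_fn
        self._cache = {}       # id(input) -> (input, output)
        self.build_count = 0   # how many times new "ops" were created

    def __call__(self, x):
        key = id(x)
        hit = self._cache.get(key)
        if hit is not None and hit[0] is x:
            # Same input object as before: reuse the existing output.
            return hit[1]
        self.build_count += 1
        out = self._build_fn(x)
        # Store x alongside the output so its id cannot be recycled
        # by a different object while the entry is alive.
        self._cache[key] = (x, out)
        return out


model = CachedModel(lambda x: Node("out_of_" + x.name))
state = Node("state")

first = model(state)
second = model(state)   # served from the cache, nothing new is built
other = model(Node("other"))

print(first is second, model.build_count)
```

Without the cache, every call would build a fresh output node, which is the graph growth described above.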