https://github.com/yandexdataschool/sklearn-deeprl
Deep reinforcement learning. In scikit-learn. In less than 50 effective lines.
- Host: GitHub
- URL: https://github.com/yandexdataschool/sklearn-deeprl
- Owner: yandexdataschool
- Created: 2017-01-22T02:35:18.000Z (over 8 years ago)
- Default Branch: master
- Last Pushed: 2017-01-23T03:54:18.000Z (over 8 years ago)
- Last Synced: 2024-08-08T23:19:50.267Z (10 months ago)
- Language: Jupyter Notebook
- Size: 99.6 KB
- Stars: 51
- Watchers: 13
- Forks: 12
- Open Issues: 1
Metadata Files:
- Readme: README.md
README
### Deep reinforcement learning. In scikit-learn. In less than 50 effective lines.
[Dive-in button](http://mybinder.org:/repo/yandexdataschool/sklearn-deeprl)

Currently both demos use the vanilla cross-entropy (CE) method with a policy approximated by a neural network.

For RL, it boils down to:
Repeat:
* Generate N games
* Take M best
* Fit the policy to those M best samples (see the code sketch below)

CE is a very general approach to approximate estimation and maximization tasks; you can read about it [here](https://people.smp.uq.edu.au/DirkKroese/ps/eormsCE.pdf). For reinforcement learning, we use the optimization version, essentially fitting the agent to the games where the reward is high. More on that [here](http://www.aaai.org/Papers/ICML/2003/ICML03-068.pdf).
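The loop above is short enough to sketch end to end. The following is a minimal illustration rather than the repository's exact notebook code: it assumes the classic OpenAI Gym API (pre-0.26) with `CartPole-v0`, uses scikit-learn's `MLPClassifier` as the policy, and picks arbitrary hyperparameters (100 sessions per epoch, 70th-percentile elite threshold).

```python
# Cross-entropy method sketch: generate sessions, keep the best, fit the policy to them.
# Assumptions: classic gym API, CartPole-v0, arbitrary hyperparameters.
import numpy as np
import gym
from sklearn.neural_network import MLPClassifier

env = gym.make("CartPole-v0")
n_actions = env.action_space.n

# Policy: a small neural net predicting action probabilities from states.
# warm_start=True + max_iter=1 lets repeated .fit() calls keep training the same net.
policy = MLPClassifier(hidden_layer_sizes=(20, 20), activation="tanh",
                       warm_start=True, max_iter=1)
# Dummy fit so the classifier knows all action classes before the first real update.
policy.fit([env.reset()] * n_actions, list(range(n_actions)))

def generate_session(t_max=1000):
    """Play one game with the current policy; return states, actions, total reward."""
    states, actions, total_reward = [], [], 0.0
    s = env.reset()
    for _ in range(t_max):
        probs = policy.predict_proba([s])[0]
        a = np.random.choice(n_actions, p=probs)
        new_s, r, done, _ = env.step(a)
        states.append(s)
        actions.append(a)
        total_reward += r
        s = new_s
        if done:
            break
    return states, actions, total_reward

for epoch in range(50):
    # 1. Generate N games
    sessions = [generate_session() for _ in range(100)]
    rewards = np.array([s[2] for s in sessions])
    # 2. Take the M best (here: sessions at or above the 70th reward percentile)
    threshold = np.percentile(rewards, 70)
    elite = [s for s in sessions if s[2] >= threshold]
    elite_states = np.concatenate([s[0] for s in elite])
    elite_actions = np.concatenate([s[1] for s in elite])
    # 3. Fit the policy to imitate the elite sessions
    policy.fit(elite_states, elite_actions)
    print(f"epoch {epoch}: mean reward = {rewards.mean():.1f}")
```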
While this approach falls flat in some cases, and it takes black magic to make it work with infinite MDPs or long session lengths, it still works unreasonably well in most cases. One more awesome trait is that it extends effortlessly to policy approximation (e.g. deep RL), partially observable MDPs and all kinds of weird stuff you see in the wild.
If you want something heavier, take a look at [agentnet](https://github.com/yandexdataschool/AgentNet).