https://github.com/lilianweng/multi-armed-bandit
Play with the solutions to the multi-armed-bandit problem.
- Host: GitHub
- URL: https://github.com/lilianweng/multi-armed-bandit
- Owner: lilianweng
- Created: 2018-01-26T22:08:51.000Z (over 7 years ago)
- Default Branch: master
- Last Pushed: 2024-05-21T17:58:11.000Z (over 1 year ago)
- Last Synced: 2025-03-29T10:09:45.921Z (7 months ago)
- Language: Python
- Size: 114 KB
- Stars: 405
- Watchers: 12
- Forks: 96
- Open Issues: 3
Metadata Files:
- Readme: README.md
README
## multi-armed-bandit
This repo is set up for a blog post I wrote on ["The Multi-Armed Bandit Problem and Its Solutions"](https://lilianweng.github.io/lil-log/2018/01/23/the-multi-armed-bandit-problem-and-its-solutions.html).
---
The result of a small experiment on solving a Bernoulli bandit with K = 10 slot machines, each with a randomly initialized reward probability.

- (Left) Cumulative regret plotted against time step.
- (Middle) True reward probability versus estimated probability for each machine.
- (Right) The fraction of steps on which each action was picked during the 5000-step run.
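
For readers who want to see how the three quantities in the plots could be computed, here is a minimal sketch. It is not the code from this repo: the function name `run_epsilon_greedy`, its parameters, and the choice of an epsilon-greedy strategy with `eps=0.1` are all assumptions made for illustration. Regret at each step is assumed to be the gap between the best machine's true reward probability and that of the machine actually played.

```python
import random


def run_epsilon_greedy(probs, n_steps=5000, eps=0.1, seed=0):
    """Hypothetical helper: play a Bernoulli bandit with epsilon-greedy.

    probs   -- true reward probability of each machine
    Returns (cumulative_regret, estimates, fractions).
    """
    rng = random.Random(seed)
    k = len(probs)
    best = max(probs)

    counts = [0] * k        # how many times each machine was played
    estimates = [0.0] * k   # running mean reward per machine
    regret = 0.0
    cumulative_regret = []  # data for the (Left) plot

    for _ in range(n_steps):
        if rng.random() < eps:
            arm = rng.randrange(k)  # explore: pick a random machine
        else:
            # exploit: pick the machine with the highest estimate so far
            arm = max(range(k), key=lambda i: estimates[i])

        reward = 1 if rng.random() < probs[arm] else 0

        # incremental update of the sample mean
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]

        # assumed regret definition: gap to the best machine
        regret += best - probs[arm]
        cumulative_regret.append(regret)

    fractions = [c / n_steps for c in counts]  # data for the (Right) plot
    return cumulative_regret, estimates, fractions


# K = 10 machines with randomly initialized reward probabilities
init_rng = random.Random(42)
true_probs = [init_rng.random() for _ in range(10)]
regrets, est_probs, action_fracs = run_epsilon_greedy(true_probs)

print("final cumulative regret:", round(regrets[-1], 2))
for p, q, f in zip(true_probs, est_probs, action_fracs):
    print(f"true={p:.3f}  estimated={q:.3f}  fraction={f:.3f}")
```

The three return values map onto the three panels: `cumulative_regret` for the left plot, `estimates` paired with `probs` for the middle plot, and `fractions` for the right plot. Machines that are rarely played will tend to have noisier estimates, since their running mean is based on few samples.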