https://github.com/banditml/banditml
A lightweight contextual bandit & reinforcement learning library designed to be used in production Python services.
- Host: GitHub
- URL: https://github.com/banditml/banditml
- Owner: banditml
- License: gpl-3.0
- Created: 2021-05-26T17:11:18.000Z (about 4 years ago)
- Default Branch: main
- Last Pushed: 2021-06-04T21:16:26.000Z (almost 4 years ago)
- Last Synced: 2025-03-29T12:11:22.396Z (2 months ago)
- Topics: bandits, contextual-bandits, neural-networks, personalization, pytorch, reinforcement-learning
- Language: Python
- Homepage:
- Size: 197 KB
- Stars: 66
- Watchers: 4
- Forks: 10
- Open Issues: 1
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: COPYING
# README
[PyPI](https://badge.fury.io/py/banditml) [Code style: black](https://github.com/ambv/black)
# What's banditml?
[banditml](https://github.com/banditml/banditml) is a lightweight contextual bandit & reinforcement learning library designed to be used in production Python services. This library is developed by [Bandit ML](https://www.banditml.com) and former authors of Facebook's applied reinforcement learning platform, [ReAgent](https://github.com/facebookresearch/ReAgent).
Specifically, this repo contains:
- Feature engineering & preprocessing
- Model implementations
- Model training workflows
- Model serving code for Python services

## Supported models
Models supported:
- Contextual Bandits (small datasets)
  - [x] Linear bandit w/ ε-greedy exploration
  - [x] Random forest bandit w/ ε-greedy exploration
  - [x] Gradient boosted decision tree bandit w/ ε-greedy exploration
- Contextual Bandits (medium datasets)
  - [x] Neural bandit with UCB-based exploration (via [dropout exploration](https://arxiv.org/abs/1506.02142))
  - [x] Neural bandit with UCB-based exploration (via [mixture density networks](https://publications.aston.ac.uk/id/eprint/373/1/NCRG_94_004.pdf))
- Reinforcement Learning (large datasets)
  - [ ] [Deep Q-learning with ε-greedy exploration](https://www.cs.toronto.edu/~vmnih/docs/dqn.pdf)
  - [ ] [Quantile regression DQN with UCB-based exploration](https://arxiv.org/abs/1710.10044)
  - [ ] [Soft Actor-Critic](https://arxiv.org/abs/1801.01290)

4 feature types supported:
* Numeric: standard floating point features
  * e.g. `{totalCartValue: 39.99}`
* Categorical: low-cardinality discrete features
  * e.g. `{currentlyViewingCategory: "men's jeans"}`
* ID list: high-cardinality discrete features
  * e.g. `{productsInCart: ["productId022", "productId109"...]}`
  * Handled via learned embedding tables
* "Dense" ID list: high-cardinality discrete features, manually mapped to dense feature vectors
  * e.g. `{productId022: [0.5, 1.3, ...], productId109: [1.9, 0.1, ...], ...}`

## Docs
```
pip install banditml
```

[Get started](DOCS.md)
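For intuition, the ε-greedy exploration used by several of the bandit models above can be sketched in a few lines. This is a minimal illustration, not banditml's API: the function name, the score dictionary, and the product IDs are hypothetical, and in practice the scores would come from one of the trained bandit models.

```python
import random

def epsilon_greedy(scores: dict, epsilon: float = 0.1) -> str:
    """With probability epsilon pick a random action (explore);
    otherwise pick the action with the highest predicted reward (exploit)."""
    if random.random() < epsilon:
        return random.choice(list(scores))
    return max(scores, key=scores.get)

# Example: predicted rewards for three candidate product recommendations.
scores = {"productId022": 0.31, "productId109": 0.72, "productId244": 0.55}
action = epsilon_greedy(scores, epsilon=0.1)
```

Setting `epsilon=0` makes the policy purely greedy; raising it trades off more exploration against short-term reward.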
## License
GNU General Public License v3.0 or later
See [COPYING](COPYING) for the full text.