https://github.com/banditml/banditml
A lightweight contextual bandit & reinforcement learning library designed to be used in production Python services.
- Host: GitHub
- URL: https://github.com/banditml/banditml
- Owner: banditml
- License: gpl-3.0
- Created: 2021-05-26T17:11:18.000Z (about 4 years ago)
- Default Branch: main
- Last Pushed: 2021-06-04T21:16:26.000Z (almost 4 years ago)
- Last Synced: 2025-03-29T12:11:22.396Z (2 months ago)
- Topics: bandits, contextual-bandits, neural-networks, personalization, pytorch, reinforcement-learning
- Language: Python
- Homepage:
- Size: 197 KB
- Stars: 66
- Watchers: 4
- Forks: 10
- Open Issues: 1
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: COPYING
# README
[PyPI](https://badge.fury.io/py/banditml) [Code style: black](https://github.com/ambv/black)
# What's banditml?
[banditml](https://github.com/banditml/banditml) is a lightweight contextual bandit & reinforcement learning library designed to be used in production Python services. This library is developed by [Bandit ML](https://www.banditml.com) and former authors of Facebook's applied reinforcement learning platform, [ReAgent](https://github.com/facebookresearch/ReAgent).
Specifically, this repo contains:
- Feature engineering & preprocessing
- Model implementations
- Model training workflows
- Model serving code for Python services

## Supported models
Models supported:
- Contextual Bandits (small datasets)
  - [x] Linear bandit w/ ε-greedy exploration
  - [x] Random forest bandit w/ ε-greedy exploration
  - [x] Gradient boosted decision tree bandit w/ ε-greedy exploration
- Contextual Bandits (medium datasets)
  - [x] Neural bandit with UCB-based exploration (via [dropout exploration](https://arxiv.org/abs/1506.02142))
  - [x] Neural bandit with UCB-based exploration (via [mixture density networks](https://publications.aston.ac.uk/id/eprint/373/1/NCRG_94_004.pdf))
- Reinforcement Learning (large datasets)
  - [ ] [Deep Q-learning with ε-greedy exploration](https://www.cs.toronto.edu/~vmnih/docs/dqn.pdf)
  - [ ] [Quantile regression DQN with UCB-based exploration](https://arxiv.org/abs/1710.10044)
  - [ ] [Soft Actor-Critic](https://arxiv.org/abs/1801.01290)

4 feature types supported:
* Numeric: standard floating point features
  * e.g. `{totalCartValue: 39.99}`
* Categorical: low-cardinality discrete features
  * e.g. `{currentlyViewingCategory: "men's jeans"}`
* ID list: high-cardinality discrete features
  * e.g. `{productsInCart: ["productId022", "productId109"...]}`
  * Handled via learned embedding tables
* "Dense" ID list: high-cardinality discrete features, manually mapped to dense feature vectors
  * e.g. `{productId022: [0.5, 1.3, ...], productId109: [1.9, 0.1, ...], ...}`

## Docs
```
pip install banditml
```

[Get started](DOCS.md)
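For intuition, the ε-greedy exploration used by several of the bandit models above can be sketched in a few lines. This is a minimal illustration, not banditml's API: the function name, the score dictionary, and the product IDs are hypothetical, and in practice the scores would come from one of the trained bandit models.

```python
import random

def epsilon_greedy(scores: dict, epsilon: float = 0.1) -> str:
    """With probability epsilon pick a random action (explore);
    otherwise pick the action with the highest predicted reward (exploit)."""
    if random.random() < epsilon:
        return random.choice(list(scores))
    return max(scores, key=scores.get)

# Example: predicted rewards for three candidate product recommendations.
scores = {"productId022": 0.31, "productId109": 0.72, "productId244": 0.55}
action = epsilon_greedy(scores, epsilon=0.1)
```

Setting `epsilon=0` makes the policy purely greedy; raising it trades off more exploration against short-term reward.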
## License
GNU General Public License v3.0 or later
See [COPYING](COPYING) for the full text.