https://github.com/cyberagentailab/thresholded-lasso-bandit
- Host: GitHub
- URL: https://github.com/cyberagentailab/thresholded-lasso-bandit
- Owner: CyberAgentAILab
- License: mit
- Created: 2022-06-06T16:36:46.000Z (about 4 years ago)
- Default Branch: master
- Last Pushed: 2023-03-22T06:50:19.000Z (about 3 years ago)
- Last Synced: 2025-09-10T07:42:53.131Z (9 months ago)
- Language: Python
- Size: 14.6 KB
- Stars: 6
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Thresholded Lasso Bandit
Code for reproducing results in the paper "[Thresholded Lasso Bandit](https://arxiv.org/abs/2010.11994)".
## About
In this paper, we revisit the regret minimization problem in sparse stochastic contextual linear bandits, where feature vectors may be of large dimension $d$, but where the reward function depends on only a few, say $s_0 \ll d$, of these features.
We present Thresholded Lasso bandit, an algorithm that (i) estimates the vector defining the reward function as well as its sparse support, i.e., significant feature elements, using the Lasso framework with thresholding, and (ii) selects an arm greedily according to this estimate projected on its support.
The algorithm does not require prior knowledge of the sparsity index $s_0$ and can be parameter-free.
For this simple algorithm, we establish non-asymptotic regret upper bounds scaling as $\mathcal{O}( \log d + \sqrt{T} )$ in general, and as $\mathcal{O}( \log d + \log T)$ under the so-called margin condition (a probabilistic condition on the separation of the arm rewards).
The regret of previous algorithms scales as $\mathcal{O}( \log d + \sqrt{T \log (d T)})$ and $\mathcal{O}( \log T \log d)$ in the two settings, respectively.
Through numerical experiments, we confirm that our algorithm outperforms existing methods.
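
The two steps described above (support estimation by thresholding, then greedy arm selection on that support) can be illustrated with a minimal sketch. It assumes a Lasso estimate `theta_hat` is already available and uses a single hypothetical threshold `tau`; the function names, the threshold value, and the toy numbers are illustrative assumptions, not the repository's API or the exact rule from the paper.

```python
from typing import List, Sequence


def threshold_support(theta_hat: Sequence[float], tau: float) -> List[int]:
    """Return the indices whose estimated coefficient magnitude exceeds tau.

    `tau` is a hypothetical threshold chosen for illustration only.
    """
    return [j for j, value in enumerate(theta_hat) if abs(value) > tau]


def greedy_arm(
    contexts: Sequence[Sequence[float]],
    theta_hat: Sequence[float],
    support: Sequence[int],
) -> int:
    """Return the index of the arm with the largest predicted reward,
    using only the coordinates in the estimated support."""

    def predicted_reward(x: Sequence[float]) -> float:
        return sum(x[j] * theta_hat[j] for j in support)

    return max(range(len(contexts)), key=lambda k: predicted_reward(contexts[k]))


# Toy example: 4 features, 2 arms (all numbers are made up).
theta_hat = [0.9, 0.02, -0.6, 0.01]
support = threshold_support(theta_hat, tau=0.1)

contexts = [
    [1.0, 5.0, 1.0, 0.0],   # arm 0
    [0.5, 0.0, -1.0, 9.0],  # arm 1
]
best = greedy_arm(contexts, theta_hat, support)
```

Here the small coefficients at indices 1 and 3 are dropped, so the large feature values that the arms carry on those coordinates do not influence the choice.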
## Installation
This code is written in Python 3.
To install the required dependencies, execute the following command:
```bash
$ pip install -r requirements.txt
```
### For Docker Users
Build the container:
```bash
$ docker build -t thresholded-lasso-bandit .
```
After the build finishes, run the container:
```bash
$ docker run -it thresholded-lasso-bandit
```
## Run Experiments
To investigate the performance of TH Lasso bandit on features drawn from a Gaussian distribution, execute the following command:
```bash
$ python run_gaussian_experiment.py
```
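
As a rough illustration of what Gaussian features for several arms can look like, the sketch below draws one feature vector per arm so that, coordinate by coordinate, any two arms have correlation `rho_sq`, and then rescales vectors whose Euclidean norm exceeds `x_max`. The names `rho_sq` and `x_max` mirror the script's `--rho_sq` and `--x_max` options, but the sampling scheme, the rescaling rule, and the function `sample_contexts` are assumptions made for illustration; the script itself may generate features differently.

```python
import math
import random
from typing import List


def sample_contexts(
    K: int, d: int, rho_sq: float, x_max: float, rng: random.Random
) -> List[List[float]]:
    """Draw K feature vectors of dimension d (illustrative assumption).

    Each coordinate mixes a component shared by all arms with an
    arm-specific component, giving unit variance per entry and
    correlation rho_sq between any two arms on the same coordinate.
    """
    contexts = [[0.0] * d for _ in range(K)]
    for j in range(d):
        shared = rng.gauss(0.0, 1.0)  # component common to all arms
        for k in range(K):
            own = rng.gauss(0.0, 1.0)  # arm-specific component
            contexts[k][j] = (
                math.sqrt(rho_sq) * shared + math.sqrt(1.0 - rho_sq) * own
            )

    # Rescale any vector whose l2-norm exceeds x_max.
    for x in contexts:
        norm = math.sqrt(sum(v * v for v in x))
        if norm > x_max:
            scale = x_max / norm
            for j in range(d):
                x[j] *= scale
    return contexts


rng = random.Random(0)  # fixed seed so the draw is repeatable
contexts = sample_contexts(K=2, d=1000, rho_sq=0.7, x_max=10.0, rng=rng)
```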
In this experiment, the following options can be specified:
* `--K`: Number of arms. The default value is `2`.
* `--T`: Number of rounds to be played. The default value is `1000`.
* `--d`: Dimension of feature vectors. The default value is `1000`.
* `--s0`: Sparsity index. The default value is `20`.
* `--x_max`: Maximum $\ell_2$-norm of feature vectors. The default value is `10`.
* `--rho_sq`: Correlation level between feature vectors of arms. The default value is `0.7`.
* `--num_trial`: Number of trials to run experiments. The default value is `20`.
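
For readers who want to reuse the same configuration in their own scripts, the options above could be declared roughly as follows. This is a hypothetical `argparse` sketch that only mirrors the documented names and default values; `build_parser` and the chosen types are assumptions, and the actual parser in `run_gaussian_experiment.py` may be written differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented options and defaults."""
    parser = argparse.ArgumentParser(description="Gaussian experiment options")
    parser.add_argument("--K", type=int, default=2, help="Number of arms")
    parser.add_argument("--T", type=int, default=1000, help="Number of rounds")
    parser.add_argument("--d", type=int, default=1000, help="Feature dimension")
    parser.add_argument("--s0", type=int, default=20, help="Sparsity index")
    parser.add_argument(
        "--x_max", type=float, default=10.0, help="Maximum l2-norm of features"
    )
    parser.add_argument(
        "--rho_sq", type=float, default=0.7, help="Correlation level between arms"
    )
    parser.add_argument(
        "--num_trial", type=int, default=20, help="Number of trials"
    )
    return parser


# Override two options; every other option keeps its documented default.
args = build_parser().parse_args(["--K", "5", "--d", "200"])
```

Passing an explicit argument list to `parse_args` keeps the example independent of the command line it is run from.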
To evaluate TH Lasso bandit on a feature distribution other than the Gaussian distribution (uniform, elliptical, or hard instance), execute the corresponding command:
```bash
$ python run_uniform_experiment.py
```
```bash
$ python run_elliptical_experiment.py
```
```bash
$ python run_hard_instance_experiment.py
```
## Citation
If you use our code in your work, please cite our paper:
```bibtex
@InProceedings{ariu2022thlassobandit,
title = {Thresholded Lasso Bandit},
author = {Ariu, Kaito and Abe, Kenshi and Proutiere, Alexandre},
booktitle = {Proceedings of the 39th International Conference on Machine Learning},
pages = {878--928},
year = {2022},
volume = {162}
}
```