Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/yard1/hpbandster-sklearn
A scikit-learn wrapper for HpBandSter hyper parameter search.
- Host: GitHub
- URL: https://github.com/yard1/hpbandster-sklearn
- Owner: Yard1
- License: mit
- Created: 2020-10-15T20:36:53.000Z (over 4 years ago)
- Default Branch: master
- Last Pushed: 2022-10-15T12:38:58.000Z (over 2 years ago)
- Last Synced: 2024-11-01T18:51:42.338Z (3 months ago)
- Topics: automated-machine-learning, automl, bayesian-optimization, hyperparameter-optimization, hyperparameter-tuning, machine-learning, python, scikit-learn, sklearn
- Language: Python
- Homepage: https://hpbandster-sklearn.readthedocs.io/en/latest/
- Size: 125 KB
- Stars: 23
- Watchers: 4
- Forks: 2
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# hpbandster-sklearn
`hpbandster-sklearn` is a Python library providing a [`scikit-learn`](http://scikit-learn.org/) wrapper - `HpBandSterSearchCV` - for [`HpBandSter`](https://github.com/automl/HpBandSter), a hyper parameter tuning library.
## Motivation
`HpBandSter` implements several cutting-edge hyper parameter optimization algorithms, including HyperBand and BOHB. They often outperform standard random search, finding the best parameter combinations in less time.
`HpBandSter` is powerful and configurable, but its usage is often unintuitive for beginners and requires a large amount of boilerplate code. To solve that issue, `HpBandSterSearchCV` was created as a drop-in replacement for `scikit-learn` hyper parameter searchers, following their well-known and popular API and making it possible to tune `scikit-learn` API estimators with minimal setup.
The `HpBandSterSearchCV` API is based on `scikit-learn`'s [`HalvingRandomSearchCV`](https://scikit-learn.org/dev/modules/generated/sklearn.model_selection.HalvingRandomSearchCV.html) and implements nearly all of its parameters.
## Installation
```
pip install hpbandster-sklearn
```

## Usage
Use it like any other `scikit-learn` hyper parameter searcher:
```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.utils.validation import check_is_fitted
from hpbandster_sklearn import HpBandSterSearchCV

X, y = load_iris(return_X_y=True)
clf = RandomForestClassifier(random_state=0)
np.random.seed(0)

param_distributions = {"max_depth": [2, 3, 4], "min_samples_split": list(range(2, 12))}
search = HpBandSterSearchCV(clf, param_distributions, random_state=0, n_jobs=1, n_iter=10, verbose=1).fit(X, y)
search.best_params_
```

You can also use `ConfigSpace.ConfigurationSpace` objects instead of dicts (in fact, it is recommended)!
```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.utils.validation import check_is_fitted
from hpbandster_sklearn import HpBandSterSearchCV
import ConfigSpace as CS
import ConfigSpace.hyperparameters as CSH

X, y = load_iris(return_X_y=True)
clf = RandomForestClassifier(random_state=0)
np.random.seed(0)

param_distributions = CS.ConfigurationSpace(seed=42)
param_distributions.add_hyperparameter(CSH.UniformIntegerHyperparameter("min_samples_split", 2, 11))
param_distributions.add_hyperparameter(CSH.UniformIntegerHyperparameter("max_depth", 2, 4))

search = HpBandSterSearchCV(clf, param_distributions, random_state=0, n_jobs=1, n_iter=10, verbose=1).fit(X, y)
search.best_params_
```

Please refer to the [documentation of this library](https://hpbandster-sklearn.readthedocs.io/en/latest/), as well as to the documentation of [`HpBandSter`](https://automl.github.io/HpBandSter/build/html/index.html) and [`ConfigSpace`](https://automl.github.io/ConfigSpace/master/index.html), for more information.
Pipelines and `TransformedTargetRegressor` are also supported. Should you use either (or both), make sure to prefix the hyper parameter and resource names accordingly - for example, `final_estimator__n_estimators`. `n_samples` is never prefixed.
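As a minimal sketch (not taken from the library's documentation), the snippet below tunes a `RandomForestClassifier` inside a `Pipeline`. The step name `model` is an arbitrary choice made here; the hyper parameter names and the resource name are prefixed with it, while `n_samples` would stay unprefixed. The resource-related arguments are described in the next section.
```python
# Hedged sketch: a Pipeline whose final step is named "model" (our own naming).
# Hyper parameter and resource names are prefixed with the step name.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from hpbandster_sklearn import HpBandSterSearchCV

X, y = load_iris(return_X_y=True)
pipe = Pipeline(
    [
        ("scaler", StandardScaler()),
        ("model", RandomForestClassifier(random_state=0)),
    ]
)

param_distributions = {
    "model__max_depth": [2, 3, 4],                   # prefixed with the step name
    "model__min_samples_split": list(range(2, 12)),  # prefixed with the step name
}

search = HpBandSterSearchCV(
    pipe,
    param_distributions,
    resource_name="model__n_estimators",  # the resource name is prefixed too; 'n_samples' would not be
    resource_type=int,
    min_budget=20,
    max_budget=200,
    random_state=0,
    n_iter=10,
).fit(X, y)
search.best_params_
```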
## Early stopping
As almost every search algorithm in `HpBandSter` leverages early stopping (mostly through Successive Halving), the resource and budget to be used can be configured through the arguments of the `HpBandSterSearchCV` object.
```python
search = HpBandSterSearchCV(
    clf,
    param_distributions,
    resource_name='n_samples',  # can be either 'n_samples' or a string corresponding to an estimator attribute, e.g. 'n_estimators' for an ensemble
    resource_type=float,  # if specified, the resource value will be cast to that type before being passed to the estimator, otherwise it will be derived automatically
    min_budget=0.2,
    max_budget=1,
)

search = HpBandSterSearchCV(
    clf,
    param_distributions,
    resource_name='n_estimators',  # can be either 'n_samples' or a string corresponding to an estimator attribute, e.g. 'n_estimators' for an ensemble
    resource_type=int,  # if specified, the resource value will be cast to that type before being passed to the estimator, otherwise it will be derived automatically
    min_budget=20,
    max_budget=200,
)
```

By default, the object will try to automatically determine the best resource by checking the following, in order:
- `'n_estimators'`, if the model has that attribute and the `warm_start` attribute
- `'max_iter'`, if the model has that attribute and the `warm_start` attribute
- `'n_samples'` - if the model doesn't support `warm_start`, the dataset samples will be used as the resource instead, meaning the model will be iteratively fitted on a bigger and bigger portion of the dataset.

Furthermore, special support has been added for `LightGBM`, `XGBoost` and `CatBoost` `scikit-learn` estimators.
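As an illustration of that special support (a hedged sketch, assuming `lightgbm` is installed; not an excerpt from the library's documentation), a LightGBM estimator can be tuned with `n_estimators` as the resource, set explicitly here:
```python
# Hedged sketch, assuming the lightgbm package is installed.
# 'n_estimators' (the number of boosting rounds) is used as the resource,
# so higher budgets simply train the model for more rounds.
from sklearn.datasets import load_iris
from lightgbm import LGBMClassifier
from hpbandster_sklearn import HpBandSterSearchCV

X, y = load_iris(return_X_y=True)
clf = LGBMClassifier(random_state=0)

param_distributions = {
    "num_leaves": [15, 31, 63],
    "learning_rate": [0.05, 0.1, 0.2],
}

search = HpBandSterSearchCV(
    clf,
    param_distributions,
    resource_name="n_estimators",  # set explicitly; the special support should also allow automatic detection
    resource_type=int,
    min_budget=20,
    max_budget=200,
    random_state=0,
    n_iter=10,
).fit(X, y)
search.best_params_
```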
## Documentation
https://hpbandster-sklearn.readthedocs.io/en/latest/
## References
- `HpBandSter` - https://github.com/automl/HpBandSter
- `ConfigSpace` - https://github.com/automl/ConfigSpace
- `scikit-learn` - http://scikit-learn.org/

## Author
Antoni Baum (Yard1)
## License
[MIT](https://github.com/Yard1/hpbandster-sklearn/blob/master/LICENSE)