Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Tuning hyperparams fast with Hyperband
https://github.com/zygmuntz/hyperband
- Host: GitHub
- URL: https://github.com/zygmuntz/hyperband
- Owner: zygmuntz
- License: other
- Created: 2017-02-26T21:33:42.000Z (over 7 years ago)
- Default Branch: master
- Last Pushed: 2018-08-15T09:21:05.000Z (about 6 years ago)
- Last Synced: 2024-08-01T15:30:01.829Z (3 months ago)
- Topics: gradient-boosting, gradient-boosting-classifier, hyperparameter-optimization, hyperparameter-tuning, hyperparameters, machine-learning
- Language: Python
- Homepage: http://fastml.com/tuning-hyperparams-fast-with-hyperband/
- Size: 370 KB
- Stars: 590
- Watchers: 22
- Forks: 75
- Open Issues: 9
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-llmops - Hyperband (AutoML / Profiling)
README
hyperband
=========

Code for tuning hyperparams with Hyperband, adapted from [Hyperband: A Novel Bandit-Based Approach to Hyperparameter Optimization](https://people.eecs.berkeley.edu/~kjamieson/hyperband.html).
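For orientation, the published Hyperband procedure can be sketched as follows. This mirrors the pseudocode from the paper's companion page, with `get_params()` and `try_params(n_iterations, params)` as the two user-supplied hooks this repo expects; `hyperband.py` implements the same idea but differs in detail.

```python
from math import log, ceil

# get_params() and try_params(n_iterations, params) are the two user hooks
max_iter = 81  # maximum resource (e.g. iterations) per configuration
eta = 3        # downsampling rate between rounds
s_max = int(log(max_iter) / log(eta))

for s in reversed(range(s_max + 1)):
    # initial number of configurations and per-config budget for this bracket
    n = int(ceil((s_max + 1) * eta ** s / (s + 1)))
    r = max_iter * eta ** (-s)

    T = [get_params() for _ in range(n)]
    for i in range(s + 1):
        n_i = int(n * eta ** (-i))
        r_i = int(r * eta ** i)
        # evaluate every surviving configuration with the current budget
        losses = [try_params(r_i, t)['loss'] for t in T]
        # keep the best 1/eta fraction for the next, bigger-budget round
        keep = max(1, int(n_i / eta))
        order = sorted(range(len(T)), key=lambda j: losses[j])
        T = [T[j] for j in order[:keep]]
```

The repo is organized as follows: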
defs/ - functions and search space definitions for various classifiers
defs_regression/ - the same for regression models
common_defs.py - imports and definitions shared by defs files
hyperband.py - from hyperband import Hyperband
load_data.py - classification defs import data from this file
load_data_regression.py - regression defs import data from this file
main.py - a complete example for classification
main_regression.py - the same, for regression
main_simple.py - a simple, bare-bones example

The goal is to provide a fully functional implementation of Hyperband, as well as ready-to-use functions for a number of models (classifiers and regressors). Currently these include four from _scikit-learn_ and four others:
* gradient boosting (GB)
* random forest (RF)
* extremely randomized trees (XT)
* linear SGD
* factorization machines from polylearn
* polynomial networks from polylearn
* a multilayer perceptron from Keras
* gradient boosting from XGBoost (classification only)

Meta-classifier/regressor
-------------------------

Use `defs.meta`/`defs_regression.meta` to try many models in one Hyperband run. This is an automatic alternative to hand-constructing search spaces that span multiple models (like `defs.rf_xt` or `defs.polylearn_fm_pn`).
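To illustrate the idea, here is a minimal sketch of a combined, hyperopt-style space (the defs in this repo sample their spaces with hyperopt; the actual contents of `defs.meta` differ, and the model names below are purely illustrative):

```python
from hyperopt import hp
from hyperopt.pyll.stochastic import sample

# hypothetical two-model space: each draw picks a model and its sub-space
space = hp.choice('model', [
    {'model': 'rf', 'n_estimators': hp.quniform('rf_n_estimators', 10, 100, 10)},
    {'model': 'sgd', 'alpha': hp.loguniform('sgd_alpha', -7, 0)},
])

def get_params():
    # draw one random configuration; try_params would dispatch on params['model']
    return sample(space)
```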
Loading data
------------

Definitions files in `defs`/`defs_regression` import data from `load_data.py` and `load_data_regression.py`, respectively.
Edit these files, or a definitions file directly, to make your data available for tuning.
Regression defs use the _kin8nm_ dataset in `data/kin8nm`. There is no attached data for classification.
For the provided models, the data format follows _scikit-learn_ conventions: `x_train`, `y_train`, `x_test`, and `y_test` are NumPy arrays.
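As an example, a minimal stand-in `load_data.py` might look like this (the dataset here is synthetic and purely illustrative; substitute your own arrays):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# synthetic placeholder data; replace with your own dataset
x, y = make_classification(n_samples=1000, n_features=20, random_state=0)
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.2, random_state=0)
```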
Usage
-----

Run `main.py` (with your own data), or `main_regression.py`. The essence of it is:
```python
from hyperband import Hyperband
from defs.gb import get_params, try_params

hb = Hyperband( get_params, try_params )
results = hb.run()
```

Here's a sample output from a run (three configurations tested) using `defs.xt`:
3 | Tue Feb 28 15:39:54 2017 | best so far: 0.5777 (run 2)
n_estimators: 5
{'bootstrap': False,
'class_weight': 'balanced',
'criterion': 'entropy',
'max_depth': 5,
'max_features': 'sqrt',
'min_samples_leaf': 5,
'min_samples_split': 6}

# training | log loss: 62.21%, AUC: 75.25%, accuracy: 67.20%
# testing | log loss: 62.64%, AUC: 74.81%, accuracy: 66.78%

7 seconds.
4 | Tue Feb 28 15:40:01 2017 | best so far: 0.5777 (run 2)
n_estimators: 5
{'bootstrap': False,
'class_weight': None,
'criterion': 'gini',
'max_depth': 5,
'max_features': 'sqrt',
'min_samples_leaf': 1,
'min_samples_split': 2}

# training | log loss: 53.39%, AUC: 75.69%, accuracy: 72.37%
# testing | log loss: 53.96%, AUC: 75.29%, accuracy: 71.89%

7 seconds.
5 | Tue Feb 28 15:40:07 2017 | best so far: 0.5396 (run 4)
n_estimators: 5
{'bootstrap': True,
'class_weight': None,
'criterion': 'gini',
'max_depth': 3,
'max_features': None,
'min_samples_leaf': 7,
'min_samples_split': 8}

# training | log loss: 50.20%, AUC: 77.04%, accuracy: 75.39%
# testing | log loss: 50.67%, AUC: 76.77%, accuracy: 75.12%

8 seconds.
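`hb.run()` also returns the evaluated configurations, so the best one can be pulled out afterwards. A minimal sketch, assuming each result dict carries `loss` and `params` keys (check `hyperband.py` for the exact keys):

```python
# results as returned by hb.run(); key names assumed, see hyperband.py
best = min(results, key=lambda r: r['loss'])
print('best loss: {:.4f}'.format(best['loss']))
print('params:', best['params'])
```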
Early stopping
--------------

Some models may use early stopping (as the Keras MLP example does). If a configuration stopped early, it doesn't make sense to run it with more iterations (duh). To indicate this, have `try_params()` return
```python
return { 'loss': loss, 'early_stop': True }
```
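Putting it together, here is a hedged sketch of a Keras-based `try_params` that trains with early stopping and sets this flag. It is illustrative only: the repo's actual `defs.keras_mlp` differs in architecture and details, `n_units` is a hypothetical parameter, and `x_train` etc. are assumed to come from `load_data.py`.

```python
from keras.models import Sequential
from keras.layers import Dense
from keras.callbacks import EarlyStopping

def try_params(n_iterations, params):
    # illustrative MLP; the repo's defs.keras_mlp builds its network differently
    model = Sequential([
        Dense(int(params['n_units']), activation='relu',
              input_dim=x_train.shape[1]),
        Dense(1, activation='sigmoid'),
    ])
    model.compile(loss='binary_crossentropy', optimizer='adam')

    stopper = EarlyStopping(monitor='val_loss', patience=5)
    history = model.fit(x_train, y_train,
                        epochs=int(n_iterations),
                        validation_data=(x_test, y_test),
                        callbacks=[stopper], verbose=0)

    loss = min(history.history['val_loss'])
    # flag configurations that stopped before exhausting their epoch budget
    early_stop = len(history.history['val_loss']) < int(n_iterations)
    return {'loss': loss, 'early_stop': early_stop}
```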
This way, Hyperband will know not to select that configuration for any further runs.

Moar
----

See [http://fastml.com/tuning-hyperparams-fast-with-hyperband/](http://fastml.com/tuning-hyperparams-fast-with-hyperband/) for a detailed description.