# DeepHyper Benchmark
## Table of Contents
- [DeepHyper Benchmark](#deephyper-benchmark)
- [Table of Contents](#table-of-contents)
- [Introduction](#introduction)
- [Organization of the Repository](#organization-of-the-repository)
- [Installation](#installation)
- [Defining a Benchmark](#defining-a-benchmark)
- [Standard Metadata](#standard-metadata)
- [List of Benchmarks](#list-of-benchmarks)

## Introduction
This repository is a collection of machine learning benchmarks for DeepHyper.
## Organization of the Repository
The repository follows this organization:
```bash
# Python package containing utility code
deephyper_benchmark/

# Library of benchmarks
lib/
```

## Installation
To install the DeepHyper benchmark suite, run:
```console
git clone https://github.com/deephyper/benchmark.git deephyper_benchmark
cd deephyper_benchmark/
pip install -e "."
```

## Defining a Benchmark
A benchmark is defined as a sub-folder of the `lib/` folder, such as `lib/Benchmark-101/`. A benchmark folder must follow a Python package structure, so it must contain an `__init__.py` file at its root. In addition, it must define a `benchmark.py` script that declares its requirements.
General benchmark structure:
```
lib/
Benchmark-101/
__init__.py
benchmark.py
data.py
model.py
hpo.py # Defines hyperparameter optimization inputs (run-function + problem)
README.md # Description of the benchmark
```

Then, to use the benchmark:
```python
import deephyper_benchmark as dhb

dhb.install("Benchmark-101")
dhb.load("Benchmark-101")
from deephyper_benchmark.lib.benchmark_101.hpo import problem, run
```

All `run`-functions (i.e., functions returning the objective(s) to be optimized) should follow the **MAXIMIZATION** standard. If a benchmark is naturally a minimization problem, return the negative of the minimized objective: `return -minimized_objective`.
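For example, a minimal sketch of such a wrapper (the `train_and_evaluate` helper is hypothetical, and the exact `run`-function signature depends on the DeepHyper version):

```python
def run(job):
    # `job.parameters` holds the sampled hyperparameters in recent DeepHyper
    # versions; older versions pass a plain dict instead of a job object.
    config = job.parameters

    # Hypothetical helper returning a validation loss (to be minimized).
    val_loss = train_and_evaluate(config)

    # DeepHyper maximizes the returned objective, so negate the loss.
    return -val_loss
```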
A benchmark inherits from the `Benchmark` class:
```python
import os

from deephyper_benchmark import *

DIR = os.path.dirname(os.path.abspath(__file__))

class Benchmark101(Benchmark):
    version = "0.0.1"
    requires = {
        "bash-install": {"type": "cmd", "cmd": "cd .. && " + os.path.join(DIR, "../install.sh")},
    }
```
Finally, when testing a benchmark it can be useful to activate the logging:
```python
import logging

logging.basicConfig(
# filename="deephyper.log", # Uncomment if you want to create a file with the logs
level=logging.INFO,
format="%(asctime)s - %(levelname)s - %(filename)s:%(funcName)s - %(message)s",
force=True,
)
```

## Configuration
Some benchmarks can be configured. The configuration is done through environment variables prefixed with `DEEPHYPER_BENCHMARK_`.
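For instance, such a variable can be set before the benchmark is installed or loaded (the variable name below is hypothetical; the available options, if any, are described in each benchmark's `README.md`):

```python
import os

# Hypothetical configuration variable: set it before installing/loading the
# benchmark so that it can take effect.
os.environ["DEEPHYPER_BENCHMARK_NDIMS"] = "5"

import deephyper_benchmark as dhb

dhb.load("Benchmark-101")
```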
## Standard Metadata
Benchmarks must return the following standard metadata when applicable; some metadata are specific to neural networks (e.g., `num_parameters`):
- [ ] `num_parameters`: integer value of the number of parameters in the neural network.
- [ ] `num_parameters_train`: integer value of the number of **trainable** parameters of the neural network.
- [ ] `budget`: scalar value (float/int) of the budget consumed by the evaluation. The budget should be defined for each benchmark (e.g., the number of epochs in general).
- [ ] `stopped`: boolean value indicating if the evaluation was stopped before consuming the maximum budget.
- [ ] `train_X`: scalar value of the training metrics (replace `X` by the metric name, one key per metric).
- [ ] `valid_X`: scalar value of the validation metrics (replace `X` by the metric name, one key per metric).
- [ ] `test_X`: scalar value of the testing metrics (replace `X` by the metric name, one key per metric).
- [ ] `flops`: number of FLOPs of the model, as computed by `fvcore.nn.FlopCountAnalysis(...).total()` (see the [documentation](https://detectron2.readthedocs.io/en/latest/modules/fvcore.html#module-fvcore.nn)).
- [ ] `latency`: TO BE CLARIFIED
- [ ] `lc_train_X`: recorded learning curves of the trained model, where the `bi` values are the budget values (e.g., epochs/batches) and the `yi` values are the recorded metric. `X` in `lc_train_X` is replaced by the metric name, e.g., `lc_train_loss` or `lc_train_accuracy`. The format is `[[b0, y0], [b1, y1], ...]`.
- [ ] `lc_valid_X`: same as `lc_train_X` but for validation data.

The `@profile` decorator should be used on all `run`-functions to collect the `timestamp_start` and `timestamp_end` metadata.
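As an illustration, a minimal sketch of a `run`-function reporting some of this metadata (assuming the `profile` decorator from `deephyper.evaluator` and DeepHyper's dictionary return convention `{"objective": ..., "metadata": {...}}`; the training code is omitted and the values are placeholders):

```python
from deephyper.evaluator import profile


@profile  # collects the timestamp_start and timestamp_end metadata
def run(job):
    # ... training code omitted; assume it produces the values below ...
    budget = 10              # e.g., number of epochs consumed
    stopped = False          # the evaluation was not stopped early
    valid_accuracy = 0.9     # placeholder value for illustration

    return {
        "objective": valid_accuracy,  # maximized by DeepHyper
        "metadata": {
            "budget": budget,
            "stopped": stopped,
            "valid_accuracy": valid_accuracy,
        },
    }
```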
## List of Benchmarks
In the following table:
- $\mathbb{R}$ denotes real parameters.
- $\mathbb{D}$ denotes discrete parameters.
- $\mathbb{C}$ denotes categorical parameters.

| Name | Description | Variable(s) Type | Objective(s) Type | Multi-Objective | Multi-Fidelity | Evaluation Duration |
| ---------- | ---------------------------------------------------------------------------- | -------------------------------------------- | ----------------- | --------------- | -------------- | ------------------- |
| C-BBO | Continuous Black-Box Optimization problems. | $\mathbb{R}^n$ | $\mathbb{R}$ | ❌ | ❌ | configurable |
| DTLZ | The modified DTLZ multiobjective test suite. | $\mathbb{R}^n$ | $\mathbb{R}$ | ✅ | ❌ | configurable |
| ECP-Candle | Deep neural networks on multiple "biological" scales of cancer-related data. | $\mathbb{R}\times\mathbb{D}\times\mathbb{C}$ | $\mathbb{R}$ | ✅ | ✅ | min |
| HPOBench | Hyperparameter Optimization Benchmark. | $\mathbb{R}\times\mathbb{D}\times\mathbb{C}$ | $\mathbb{R}$ | ✅ | ✅ | ms to min |
| JAHSBench | A slightly modified JAHSBench 201 wrapper. | $\mathbb{R}^2\times\mathbb{D}\times\mathbb{C}^8$ | $\mathbb{R}$ | ✅ | ❌ | configurable |
| LCu | Learning curve hyperparameter optimization benchmark. | | | | | |
| LCbench | Multi-fidelity benchmark without hyperparameter optimization. | NA | $\mathbb{R}$ | ❌ | ✅ | seconds |
| PINNBench | Physics Informed Neural Networks Benchmark. | $\mathbb{R}\times\mathbb{D}\times\mathbb{C}$ | $\mathbb{R}$ | ✅ | ✅ | ms |
## List of Optimization Algorithms

- COBYQA: `deephyper_benchmark.search.COBYQA(...)`
- PyBOBYQA: `deephyper_benchmark.search.PyBOBYQA(...)`
- TPE: `deephyper_benchmark.search.MPIDistributedOptuna(..., sampler="TPE")`
- BoTorch: `deephyper_benchmark.search.MPIDistributedOptuna(..., sampler="BOTORCH")`
- CMAES: `deephyper_benchmark.search.MPIDistributedOptuna(..., sampler="CMAES")`
- NSGAII: `deephyper_benchmark.search.MPIDistributedOptuna(..., sampler="NSGAII")`
- QMC: `deephyper_benchmark.search.MPIDistributedOptuna(..., sampler="QMC")`
- SMAC: `deephyper_benchmark.search.SMAC(...)`
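For example, a sketch of running one of these searches on a benchmark (the constructor arguments and the `search(max_evals=...)` call are assumptions modeled on DeepHyper's search API; check the classes in the `deephyper_benchmark.search` module for their exact signatures):

```python
import deephyper_benchmark as dhb

dhb.install("Benchmark-101")
dhb.load("Benchmark-101")
from deephyper_benchmark.lib.benchmark_101.hpo import problem, run

from deephyper_benchmark.search import MPIDistributedOptuna

# Assumed constructor: the problem and run-function first, then
# sampler-specific options; the actual signature is defined in
# the deephyper_benchmark.search module.
search = MPIDistributedOptuna(problem, run, sampler="TPE")
results = search.search(max_evals=100)  # assumed DeepHyper-style search call
```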