https://github.com/d-krupke/cpsat-autotune
WIP: Tune the hyperparameters of Google's OR-Tools' CP-SAT solver for specific models
- Host: GitHub
- URL: https://github.com/d-krupke/cpsat-autotune
- Owner: d-krupke
- License: mit
- Created: 2024-08-09T14:43:06.000Z (5 months ago)
- Default Branch: main
- Last Pushed: 2024-09-05T11:07:02.000Z (4 months ago)
- Last Synced: 2024-10-24T02:55:35.721Z (2 months ago)
- Topics: cp-sat, hyperparameter-tuning, optuna, ortools
- Language: Jupyter Notebook
- Homepage:
- Size: 1020 KB
- Stars: 10
- Watchers: 1
- Forks: 0
- Open Issues: 4
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# cpsat-autotune: A Hyperparameter Tuning Tool for Google's OR-Tools CP-SAT Solver
**cpsat-autotune** is a Python library designed to optimize the hyperparameters
of Google's OR-Tools CP-SAT solver for specific problem instances. While CP-SAT
is already highly optimized for a broad range of generic problems, fine-tuning
its parameters for particular problem sets can yield significant performance
gains. This tool leverages the `optuna` optimization library to systematically
explore and suggest optimal hyperparameter configurations tailored to your
needs.

Also check out our other projects:
- [The CP-SAT Primer](https://d-krupke.github.io/cpsat-primer/): A comprehensive
guide to Google's OR-Tools CP-SAT solver.
- [CP-SAT Log Analyzer](https://github.com/d-krupke/CP-SAT-Log-Analyzer): A tool
to analyze the logs generated by Google's OR-Tools CP-SAT solver.

> CAVEAT: Tuning the hyperparameters at the top level is dangerous; instead,
> one should tune the parameters of a subsolver. The reason is that the
> top-level parameters may be copied to every subsolver, including LNS workers.
> If one enables expensive propagation mechanisms and the like, this could
> seriously degrade overall solver performance. I will create an improved
> version of this tool soon.

## Use Case
**cpsat-autotune** is not a universal solution that guarantees a performance
boost for all uses of the CP-SAT solver. Instead, it is specifically designed to
enhance solver efficiency in targeted scenarios, particularly within the context
of Adaptive Large Neighborhood Search (ALNS).

### Adaptive Large Neighborhood Search (ALNS) Context
In ALNS, the CP-SAT solver is frequently invoked with a strict time limit to
solve similar problem instances as part of a larger iterative optimization
process. The goal is to incrementally improve a solution by exploring different
neighborhoods of the problem space. In this context, achieving even modest
performance gains on average instances can significantly impact the overall
efficiency of the search process, even if it results in occasional performance
drops on outlier instances.

### Benefits of Tuning in ALNS
- **Average Performance Gains:** By tuning the solver’s hyperparameters to
optimize performance on typical instances, **cpsat-autotune** can reduce the
average time per iteration. This is particularly valuable in ALNS, where a
large number of solver calls are made.
- **Tolerance for Outliers:** In an ALNS framework, occasional slower iterations
due to deteriorated performance on outlier instances are generally acceptable,
as the search process can recover in subsequent iterations. Thus, the focus
can be on enhancing the solver's average performance rather than ensuring
consistent performance across all instances.
- **Augmented Solver Strategies:** Instead of completely replacing the CP-SAT
solver with a single tuned configuration, **cpsat-autotune** allows you to
tune hyperparameters for one or more specific instance sets and incorporate
these as additional strategies within ALNS. This means you can maintain the
default CP-SAT parameters while augmenting the solver's capability with
tailored configurations. ALNS can then automatically select the most effective
strategy for each iteration, leveraging the diverse set of tuned
hyperparameters alongside the default configuration for optimal performance.
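
A minimal sketch of this idea, assuming each tuned configuration is a plain
dictionary of CP-SAT parameter names (the strategy pool, the time limit, and
the toy sub-model below are purely illustrative):

```python
import random

from ortools.sat.python import cp_model

# Hypothetical strategy pool: the default configuration plus one tuned
# parameter set (e.g., the result of a cpsat-autotune run).
strategies = [
    {},                                             # default CP-SAT parameters
    {"cp_model_probing_level": 0, "cut_level": 0},  # tuned for one instance family
]


def solve_with_strategy(sub_model: cp_model.CpModel, params: dict) -> int:
    """Solve one ALNS neighborhood with a strict time limit and a chosen strategy."""
    solver = cp_model.CpSolver()
    solver.parameters.max_time_in_seconds = 1.0
    for name, value in params.items():
        setattr(solver.parameters, name, value)
    return solver.Solve(sub_model)


# Toy stand-in for a neighborhood sub-model; in real ALNS this would be rebuilt
# (and the strategy picked, e.g., via adaptive weights) in every iteration.
toy = cp_model.CpModel()
x = toy.NewIntVar(0, 10, "x")
toy.Maximize(x)
status = solve_with_strategy(toy, random.choice(strategies))
```

## Installation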
You can install **cpsat-autotune** using `pip`:
```bash
pip install -U cpsat-autotune
```

Make sure to update the package before every use to ensure you have the latest
version, as this project is still a prototype.

## Basic Usage
Here is a basic example of how to use **cpsat-autotune** to optimize the time
required to find an optimal solution for a CP-SAT model:

```python
from cpsat_autotune import import_model, tune_time_to_optimal

# Load your model from a protobuf file
model = import_model("models/medium_hg.pb")

# Tune the model to minimize the time to reach an optimal solution
best = tune_time_to_optimal(
    model,
    max_time_in_seconds=3,  # Enter a time limit slightly above what the solver with default parameters needs
    n_samples_for_trial=5,  # Number of samples for each trial
    n_samples_for_verification=20,  # Number of samples for each statistically relevant comparison
    n_trials=50,  # Number of trials to run with Optuna
)
```

Sample output:
```plaintext
────────────────────────────────────────────── OPTIMIZED PARAMETERS ───────────────────────────────────────────────
┏━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ # ┃ Parameter ┃ Value ┃ Contribution ┃ Default Value ┃
┡━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ 1 │ cp_model_probing_level │ 0 │ 49.51% │ 2 │
│ 2 │ cut_level │ 0 │ 50.49% │ 1 │
└─────┴────────────────────────┴───────┴──────────────┴───────────────┘
────────────────────────────────────────────────── Descriptions ───────────────────────────────────────────────────
1. cp_model_probing_level Defines the intensity of probing during presolve, where variables are temporarily fixed
to infer more information about the problem. Higher levels of probing can result in a more simplified problem but
require more computation time during presolve.

2. cut_level Sets the level of effort the solver will invest in generating cutting planes, which are linear
constraints added to remove infeasible regions. Properly applied, cuts can significantly reduce the search space
and help the solver find an optimal solution more quickly.
───────────────────────────────────────────────────────────────────────────────────────────────────────────────────
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━┳━━━━━━┳━━━━━━┳━━━━━━━━━━┓
┃ Metric ┃ Mean ┃ Min ┃ Max ┃ #Samples ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━╇━━━━━━╇━━━━━━╇━━━━━━━━━━┩
│ Time in seconds with Default Parameters │ 1.61 │ 1.04 │ 2.34 │ 20 │
│ Time in seconds with Optimized Parameters │ 0.48 │ 0.34 │ 0.84 │ 20 │
└───────────────────────────────────────────┴──────┴──────┴──────┴──────────┘
───────────────────────────────────────────────────────────────────────────────────────────────────────────────────
╭──────────────────────────────────────────────────── WARNING ────────────────────────────────────────────────────╮
│ │
│ The optimized parameters listed above were obtained based on a sampling approach │
│ and may not fully capture the complexities of the entire problem space. │
│ While statistical reasoning has been applied, these results should be considered │
│ as a suggestion for further evaluation rather than definitive settings. │
│ │
│ It is strongly recommended to validate these parameters in larger, more comprehensive │
│ experiments before adopting them in critical applications. │
│ │
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
```
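
Once tuning has finished, the returned dictionary can be applied to a regular
`CpSolver`. The following is a small sketch, assuming `best` (from the example
above) maps CP-SAT parameter names to plain values, as the table suggests:

```python
from ortools.sat.python import cp_model

solver = cp_model.CpSolver()
for name, value in best.items():
    # Each entry is assumed to correspond to a field of the solver's
    # SatParameters, e.g. cp_model_probing_level or cut_level.
    setattr(solver.parameters, name, value)
status = solver.Solve(model)
```

## Available Tuning Methods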
**cpsat-autotune** provides two primary methods for tuning:
### 1. `tune_time_to_optimal`
This method tunes the CP-SAT solver's hyperparameters to minimize the time
required to find an optimal solution. It is useful when you need a guaranteed
solution without a fixed time limit.

#### Parameters:
- `model`: The CP-SAT model you wish to tune.
- `max_time_in_seconds`: The maximum time allowed for each solve operation. This
parameter influences the runtime of the tuning process, so it should be set
carefully.
- `relative_gap_limit`: (Optional) The relative optimality gap for considering a
solution as optimal. A value of `0.0` requires exact optimality. Defaults to
`0.0`.
- `n_samples_for_trial`: (Optional) The number of samples taken in each trial.
Defaults to `10`.
- `n_samples_for_verification`: (Optional) The number of samples used to verify
parameters after tuning. Defaults to `30`.
- `n_trials`: (Optional) The number of trials to run. Defaults to `100`.

#### Returns:
- `dict`: The best parameters found during the tuning process.
#### Notes:
- The concrete analysis, including baseline performance and the evaluation of
the best parameters, is printed to the console.

### 2. `tune_for_quality_within_timelimit`
This method tunes hyperparameters to maximize or minimize the objective value
within a specified time limit. It is useful when you need to find a good
solution within a fixed time frame without requiring guarantees.

#### Parameters:
- `model`: The CP-SAT model to be tuned.
- `max_time_in_seconds`: The time limit for each solve operation in seconds.
This value should be less than the time required for the solver to find an
optimal solution with default parameters.
- `obj_for_timeout`: The objective value to return if the solver times out. This
value should be worse than a trivial solution.
- `direction`: Specify `'maximize'` or `'minimize'` depending on whether the
  model's objective should be maximized or minimized.
- `n_samples_for_trial`: (Optional) The number of samples taken in each trial.
Defaults to `10`.
- `n_samples_for_verification`: (Optional) The number of samples used to verify
parameters after tuning. Defaults to `30`.
- `n_trials`: (Optional) The number of trials to run. Defaults to `100`.

#### Returns:
- `dict`: The best parameters found during the tuning process.
#### Notes:
- The concrete analysis, including baseline performance and the evaluation of
the best parameters, is printed to the console.
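
As a rough sketch, a call could look like the following. It reuses the
illustrative model file from the basic example and assumes the function is
importable from `cpsat_autotune` just like `tune_time_to_optimal`; all values
are placeholders:

```python
from cpsat_autotune import import_model, tune_for_quality_within_timelimit

model = import_model("models/medium_hg.pb")

best = tune_for_quality_within_timelimit(
    model,
    max_time_in_seconds=1,   # below the time the default parameters need to reach optimality
    obj_for_timeout=10_000,  # clearly worse than any trivial solution (here: minimization)
    direction="minimize",
    n_samples_for_trial=5,
    n_samples_for_verification=20,
    n_trials=50,
)
```

## Using the `cpsat-autotune` CLI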
The `cpsat-autotune` CLI is a command-line interface for tuning CP-SAT
hyperparameters to optimize the performance of your models. Below are the
instructions on how to use the CLI.

### Commands
The `cpsat-autotune` CLI provides three main commands: `time`, `quality`, and
`gap`.

#### `time` Command
The `time` command tunes CP-SAT hyperparameters to minimize the time required to
find an optimal solution.

##### Usage
```sh
cpsat-autotune time [OPTIONS] MODEL_PATH
```

##### Options
- `MODEL_PATH`: Path to the model file (required).
- `--max-time`: Maximum time allowed for each solve operation in seconds
(required).
- `--relative-gap`: Relative optimality gap for considering a solution as
optimal (default: 0.0).
- `--n-trials`: Number of trials to execute in the tuning process (default:
100).
- `--n-samples-trial`: Number of samples to take in each trial (default: 10).
- `--n-samples-verification`: Number of samples for verifying parameters
(default: 30).

##### Example
```sh
cpsat-autotune time --max-time 60 --relative-gap 0.01 --n-trials 50 --n-samples-trial 5 --n-samples-verification 20 path/to/model/file
```

#### `quality` Command
The `quality` command tunes CP-SAT hyperparameters to maximize or minimize
solution quality within a given time limit.

##### Usage
```sh
cpsat-autotune quality [OPTIONS] MODEL_PATH
```

##### Options
- `MODEL_PATH`: Path to the model file (required).
- `--max-time`: Time limit for each solve operation in seconds (required).
- `--obj-for-timeout`: Objective value to return if the solver times out
(required).
- `--direction`: Direction to optimize the objective value (`maximize` or
`minimize`, required).
- `--n-trials`: Number of trials to execute in the tuning process (default:
100).
- `--n-samples-trial`: Number of samples to take in each trial (default: 10).
- `--n-samples-verification`: Number of samples for verifying parameters
(default: 30).

##### Example
```sh
cpsat-autotune quality --max-time 60 --obj-for-timeout 100 --direction maximize --n-trials 50 --n-samples-trial 5 --n-samples-verification 20 path/to/model/file
```

#### `gap` Command
Tune CP-SAT hyperparameters to minimize the gap within a given time limit. This
is a good option for more complex models for which you have no chance of finding
the optimal solution within the time limit, but you still want to have some
guarantee on the quality of the solution. This can be considered as a proxy for
the time to optimal solution.

CAVEAT: If the time limit is too small, it will probably only minimize the
presolve time, which can have negative effects on the long-term performance of
the solver.

##### Usage
```sh
cpsat-autotune gap [OPTIONS] MODEL_PATH
```

##### Options
- `MODEL_PATH`: Path to the model file (required).
- `--max-time`: Time limit for each solve operation in seconds (required).
- `--obj-for-timeout`: Objective value to return if the solver times out
(required).
- `--direction`: Direction to optimize the objective value (`maximize` or
`minimize`, required).
- `--n-trials`: Number of trials to execute in the tuning process (default:
100).
- `--n-samples-trial`: Number of samples to take in each trial (default: 10).
- `--n-samples-verification`: Number of samples for verifying parameters
(default: 30).
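
##### Example

Analogous to the other commands; the flag values below are purely illustrative:

```sh
cpsat-autotune gap --max-time 60 --obj-for-timeout 100 --direction minimize --n-trials 50 --n-samples-trial 5 --n-samples-verification 20 path/to/model/file
```

### Help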
For more information on each command and its options, you can use the `--help`
flag:

```sh
cpsat-autotune time --help
cpsat-autotune quality --help
cpsat-autotune gap --help
```

This will display detailed descriptions and usage instructions for each command.
## The Importance of Avoiding Overfitting
While tuning hyperparameters can improve solver performance for specific
instances, it also increases the risk of overfitting. Overfitting occurs when
the solver's performance is significantly improved on the training set of
problems but deteriorates on new, slightly different instances. For example,
tuning may reduce solve times on a set of similar problems but could result in
excessive solve times or failure on problems that deviate from the training set.

## How does the Tuning Work?
**cpsat-autotune** uses the `optuna` library to perform hyperparameter tuning on
a preselected set of parameters. The output of optuna is then further refined
and the significance of certain parameters is evaluated. Based on the assumption
that the default parameters are already well-tuned for a broad range of
problems, **cpsat-autotune** identifies the most significant changes to the
default configuration and suggests these as potential improvements. It does take
a few shortcuts to speed things up, while collecting more samples for important
values.
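
To illustrate the principle (this is not the library's actual internal code), a
bare-bones Optuna loop over two of the parameters from the example output above
could look as follows; the sampled values, the time limit, and the number of
seeds per trial are all assumptions:

```python
import optuna
from ortools.sat.python import cp_model

from cpsat_autotune import import_model

model = import_model("models/medium_hg.pb")  # illustrative model file


def objective(trial: optuna.Trial) -> float:
    probing = trial.suggest_categorical("cp_model_probing_level", [0, 1, 2])
    cuts = trial.suggest_categorical("cut_level", [0, 1])
    times = []
    for seed in range(5):  # several seeds per trial to dampen CP-SAT's variance
        solver = cp_model.CpSolver()
        solver.parameters.max_time_in_seconds = 3
        solver.parameters.random_seed = seed
        solver.parameters.cp_model_probing_level = probing
        solver.parameters.cut_level = cuts
        solver.Solve(model)
        times.append(solver.WallTime())
    return sum(times) / len(times)  # mean time over the samples


study = optuna.create_study(direction="minimize")
study.optimize(objective, n_trials=50)
print(study.best_params)
```

### Recommendations: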
- **Robust Performance:** If consistent performance across a variety of
instances is crucial, stick with the default CP-SAT parameters.
- **Targeted Performance:** If you are solving a large number of similar
problems and can tolerate potential performance drops on outliers, use the
suggested parameters after careful consideration.

## What are the main challenges?
There are a few challenges in making the tuning efficient and effective:

1. CP-SAT can have a very high variance (when using random seeds). This variance
   can be much higher than in most other hyperparameter-tuning use cases.
   Currently, we address this by taking multiple samples for each trial, though
   this is not a perfect solution.
2. CP-SAT has very many hyperparameters, and it is not clear which ones are
   important. Currently, there is a set of hyperparameters we already know to be
   important and a further set of hyperparameters chosen based on educated
   guesses, though we are already doubting some of these choices.
3. Identifying which changes are actually significant and worth the risk of
   deviating from the well-tuned default parameters is a challenge. The current
   implementation does some analysis, but whether the chosen strategies are
   truly effective is still an open question.
4. Deciding on the right number of samples for each measurement is a challenge.
   We will probably add more advanced statistical analysis in the future to make
   this decision more data-driven instead of relying on static rules.

## Contributing
Contributions are welcome. Please ensure that your code adheres to the project's
style guidelines and includes appropriate tests.

## Changelog
- **0.5.0** Added parameters for `no_overlap`, allowing filtering based on the
  model's constraints.

## License
This project is licensed under the MIT License.