https://github.com/mbarbetti/optunapi
:package: API to distribute hyperparameters optimization through HTTP requests
https://github.com/mbarbetti/optunapi
distributed-computing http-api-client http-requests machine-learning optimization parallel-computing
Last synced: 8 months ago
JSON representation
:package: API to distribute hyperparameters optimization through HTTP requests
- Host: GitHub
- URL: https://github.com/mbarbetti/optunapi
- Owner: mbarbetti
- License: mit
- Created: 2021-04-14T18:00:30.000Z (almost 5 years ago)
- Default Branch: main
- Last Pushed: 2022-05-19T15:29:38.000Z (almost 4 years ago)
- Last Synced: 2025-07-04T04:07:02.345Z (9 months ago)
- Topics: distributed-computing, http-api-client, http-requests, machine-learning, optimization, parallel-computing
- Language: Python
- Homepage:
- Size: 118 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 2
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
API to distribute hyperparameters optimization through HTTP requests
_OptunAPI_ is a simple API designed for Machine Learning applications that allows to distribute an automatic
hyperparameters optimization over different machines through **HTTP requests**. Each set of hyperparameters
can be studied independently since the minima research does't require any gradients computation, but instead
is performed through a **Bayesian optimization** based on [Optuna](https://optuna.org/). The machine running
Optuna manages centrally the optimization studies -- the so-called "Optuna-server" -- providing sets of
hyperparameters and assessing them by the scores evaluated and sent back by the single computing instance,
named "Trainer-client". The HTTP requests underlying such client-server system are powered by [FastAPI](https://fastapi.tiangolo.com).
## Key Features
OptunAPI inherits most of the modern functionalities of Optuna and FastAPI:
- **Lightweight and versatile**
- OptunAPI is entirely written in Python and has few dependencies.
- **Easy to configure**
- For hyperparameters sampling, OptunAPI relies on [configuration files](#configuration-file) easy to set up.
- **Easy to integrate**
- The hyperparameters values can be easily recover [decoding the HTTP response content](#trainer-client) from the server.
- **Easy parallelization**
- Different machines can run the hyperparameters study in parallel, centrally coordinated by the server.
- **Efficient optimization algorithms**
- The optimization task is headed by Optuna and its state-of-the-art algorithms.
- **Quick visualization for study analysis**
- _TODO_ - OptunAPI provides a set of reports to monitor the status of the hyperparameters study.
## Key Components
To understand how OptunAPI works, we need to spend a couple of words about its components:
- [`Study`](https://optuna.readthedocs.io/en/stable/reference/generated/optuna.study.Study.html#optuna.study.Study) and
[`Trial`](https://optuna.readthedocs.io/en/stable/reference/generated/optuna.trial.Trial.html#optuna.trial.Trial) objects
from Optuna
- Optuna's Ask-and-Tell interface
- HTTP requests to map the hyperparameters space
### Study and Trial
A _study_ corresponds to an optimization task, i.e., a set of trials. This object provides interfaces to run a new
[`Trial`](https://optuna.readthedocs.io/en/stable/reference/generated/optuna.trial.Trial.html#optuna.trial.Trial)
and access trials' history. OptunAPI is designed so that, when the first machine ask for a hyperparameters set, it
starts a new study ([`create_study()`](https://optuna.readthedocs.io/en/stable/reference/generated/optuna.study.create_study.html#optuna.study.create_study)) identified according to the HTTP request submitted. Any other machines referring
to the same optimization session don't initialize a new study, but recover the previous one ([`load_study()`](https://optuna.readthedocs.io/en/stable/reference/generated/optuna.study.load_study.html#optuna.study.load_study)) contributing
to mapping the hyperparameters space.
A _trial_ allows to prepare a particular set of hyperparameters and evaluate its capability of optimizing a objective
function, not necessarily available in an explicit form as in the case of very complex Machine Learning algorithms.
This object provides the following interfaces to get parameter suggestion:
- [`optuna.trial.Trial.suggest_categorical()`](https://optuna.readthedocs.io/en/stable/reference/generated/optuna.trial.Trial.html#optuna.trial.Trial.suggest_categorical) for categorical parameters
- [`optuna.trial.Trial.suggest_int()`](https://optuna.readthedocs.io/en/stable/reference/generated/optuna.trial.Trial.html#optuna.trial.Trial.suggest_int) for integer parameters
- [`optuna.trial.Trial.suggest_float()`](https://optuna.readthedocs.io/en/stable/reference/generated/optuna.trial.Trial.html#optuna.trial.Trial.suggest_float) for floating point parameters
With optional arguments of `step` and `log`, we can discretize or take the logarithm of integer and floating point parameters.
The following code block is taken from the [Optuna tutorial](https://optuna.readthedocs.io/en/stable/tutorial/10_key_features/002_configurations.html) and shows a standard use of these features:
```Python
import optuna
def objective (trial):
# Categorical parameter
optimizer = trial.suggest_categorical ('optimizer', ['RMSprop', 'Adam'])
# Integer parameter
num_layers = trial.suggest_int ('num_layers', 1, 3)
# Integer parameter (log)
num_channels = trial.suggest_int ('num_channels', 32, 512, log = True)
# Integer parameter (discretized)
num_units = trial.suggest_int ('num_units', 10, 100, step = 5)
# Floating point parameter
dropout_rate = trial.suggest_float ('dropout_rate', 0.0, 1.0)
# Floating point parameter (log)
learning_rate = trial.suggest_float ('learning_rate', 1e-5, 1e-2, log = True)
# Floating point parameter (discretized)
drop_path_rate = trial.suggest_float ('drop_path_rate', 0.0, 1.0, step = 0.1)
```
OptunAPI uses these methods internally and requires only a [configuration file](#configuration-file)
correctly filled to run the studies.
### Ask-and-Tell Interface
The Optuna's _Ask-and-Tell_ interface provides a more flexible interface for hyperparameter optimization
based on the two following methods:
- [`optuna.study.Study.ask()`](https://optuna.readthedocs.io/en/stable/reference/generated/optuna.study.Study.html#optuna.study.Study.ask) creates a trial that can sample hyperparameters
- [`optuna.study.Study.tell()`](https://optuna.readthedocs.io/en/stable/reference/generated/optuna.study.Study.html#optuna.study.Study.tell) finishes the trial by passing `trial` and an objective value
OptunAPI uses these methods in two different moments. When a machine ask for a set of hyperparameters,
that set belongs to a trial resulting from an _ask_ instance. Then, once the objective function was
evaluated with that particular set of hyperparameters, the machine sends a new request encoding the
objective value allowing to close the corresponding trial with a _tell_ instance.
### HTTP Requests
OptunAPI provides a simple Python module to run a server able to centrally manage the optimization studies:
[`optuna/optuna/server.py`](https://github.com/mbarbetti/optunapi/blob/main/optunapi/server.py). It is
equipped with a set of _path operation functions_ relying on the FastAPI ecosystem:
- `ping_server`
- the _path_ is `/optunapi/ping`
- the _operation_ is `GET`
- the _function_ allows to verify if the server is running
- `read_hparams`
- the _path_ is `/optuna/hparams/{model_name}` (`model_name` is a _path parameter_)
- the _operation_ is `GET`
- the _function_ allows to start (or load) an Optuna study and send sets of hyperparameters
- `send_score`
- the _path_ is `/optuna/score/{model_name}?trail_id=TRIAL_ID&score=SCORE` (with _query parameters_)
- the _operation_ is `GET`
- the _function_ allows to finish the trial identified by `trial_id` with the `score` value
## Requirements
Python 3.6+
OptunAPI is based on two modern and highly performant frameworks:
- [Optuna](https://optuna.org/) for the optimization parts.
- [FastAPI](https://fastapi.tiangolo.com) for the HTTP requests parts.
## Installation
OptunAPI is a [public repository](https://github.com/mbarbetti/optunapi) on GitHub.
```console
$ git clone https://github.com/mbarbetti/optunapi.git
---> 100%
```
To run and use OptunAPI it's preferable to create a virtual environment with Python 3.6+ and install Optuna and FastAPI within it.
```console
$ pip install optuna fastapi
---> 100%
```
Standing on the shoulder of FastAPI, OptunAPI needs an ASGI server to run the so-called Optuna-server,
such as [Uvicorn](https://www.uvicorn.org) or [Hypercorn](https://gitlab.com/pgjones/hypercorn).
```console
$ pip install uvicorn[standard]
---> 100%
```
## Example
### Configuration file
The high-level functions provided by Optuna [to suggest values for the hyperparameters](#study-and-trial)
are replaced with an appropriate _configuration file_ in OptunAPI. Referring to the example reported in
the [Optuna tutorial](https://optuna.readthedocs.io/en/stable/tutorial/10_key_features/002_configurations.html),
what follows is the corresponding YAML configuration file:
```YAML
# Categorical parameter
optimizer:
name : optimizer
type : categorical
choices :
- RMSprop
- Adam
# Integer parameter
num_layers:
name : num_layers
type : int
low : 1
high : 3
# Integer parameter (log)
num_channels:
name : num_channels
type : int
low : 32
high : 52
log : True
# Integer parameter (discretized)
num_units:
name : num_units
type : int
low : 10
high : 100
step : 5
# Floating point parameter
dropout_rate:
name : dropout_rate
type : float
low : 0.0
high : 1.0
# Floating point parameter (log)
learning_rate:
name : learning_rale
type : float
low : 1e-5
high : 1e-2
log : True
# Floating point parameter (discretized)
drop_path_rate:
name : drop_path_rate
type : float
low : 0.0
high : 1.0
step : 0.1
```
### Optuna-server
Prepared the configuration file for the optimization session and saved it into
[`optunapi/optunapi/config`](https://github.com/mbarbetti/optunapi/tree/main/optunapi/config),
we are ready to run the Optuna-server.
```console
$ uvicorn server:optunapi
INFO: Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)
INFO: Started reloader process [28720]
INFO: Started server process [28722]
INFO: Waiting for application startup.
INFO: Application startup complete.
```
What does the command uvicorn server:optunapi mean?
The command `uvicorn server:optunapi` refers to:
* `server`: the file `server.py` (the Python "module") in
[optunapi/optunapi](https://github.com/mbarbetti/optunapi/tree/main/optunapi).
* `optunapi`: the object created inside of `server.py` with the line `optunapi = FastAPI()`.
Note that Uvicorn sets `127.0.0.1` and `8000` as default values for the server IP and port.
To change the defaults it's enough launching the previous command with the arguments
`--host` and `--port` followed by the chosen values.
### Trainer-client
The optimization session is managed by an Optuna _study_, initialized with the first client HTTP request,
or loaded and expanded by any other connecting machines. To refer to a particular optimization session a
client has to encode the name of the corresponding configuration file within its HTTP request.
Consider the simple use-case provided by OptunAPI, where we want to find the minimum of a 2D-paraboloid:
[`optunapi/tests/simple_client.py`](https://github.com/mbarbetti/optunapi/blob/main/tests/simple_client.py).
Since the provided configuration file is named `optuna-test.yaml`, then the GET request submitted by the client
to receive the hyperparameters set has to contain the string `'optuna-test'`:
```Python
import requests
HOST = 'http://127.0.0.1:8000'
read_hparams = requests.get (HOST + '/optunapi/hparams/optunapi-test')
hp_req = read_hparams.json()
TRIAL_ID = hp_req ['trial_id']
PARAMS = hp_req [ 'params' ]
```
What happens behind the scenes is that the above HTTP request calls an _ask_ instance to the Optuna
_study_, stored in [`optunapi/optunapi/db`](https://github.com/mbarbetti/optunapi/tree/main/optunapi/db)
once created and named `optunapi-test.db`. As already said, an _ask_ instance is a _trial_ equipped with
a set of hyperparameters and the client can recover those values decoding the corresponding HTTP response.
In the example above, `hp_req` is a dictionary containing, among others, the identifier number of the current
_trial_ (`TRIAL_ID`) and a dictionary for the hyperparameters values (`PARAMS`).
Having accessed to the hyperparameters values, we can perform whatever learning algorithm one prefers and
evaluate the associated training score, that will be used as _objective value_ to finish the _trial_ instance.
This is done with a new GET request referring to the same optimization session (again, `'optunapi-test'` in the path)
and passing `TRIAL_ID` and `SCORE` as query parameters:
```Python
import requests
HOST = 'http://127.0.0.1:8000'
send_score = requests.get (HOST + '/optunapi/score/optunapi-test?trial_id=TRIAL_ID&score=SCORE')
score_req = send_score.json()
BEST_TRIAL_ID = score_req ['best_score_id']
BEST_PARAMS = score_req [ 'best_params' ]
```
Each running client allows to refine the search for minima performed by the Optuna algorithms, focusing
on smaller and smaller space portion and enhancing the mapping of the hyperparameters space.
## Securing HTTP requests
OptunAPI is designed to be used within a VPN not directly opened to the public Internet. On the other hand,
opening the Optuna-server to Internet allows to exploit easily a wide variety of computing resources, from
on-premises machines to instances deriving from different cloud computing services (AWS, Azure, GCP, etc.).
Such design raises a security issue since anyone can submit a request to the server or catch its response,
opening the system to cyberattack.
A possible solution to this issue relies on the SSH protocol. The idea is to set up the Optuna-server
as a _private server_ (from the perspective of `REMOTE SERVER`) not directly visible from the outside
(`LOCAL CLIENT`’s perspective). This configuration, schematically represented in the sketch below,
allows a _local client_ to still access the _private server_ passing through the _remote server_
authenticating with SSH credentials.
```
----------------------------------------------------------------------
|
-------------+ | +----------+ +---------
LOCAL | | | REMOTE | | PRIVATE
CLIENT | <== SSH ========> | SERVER | <== local ==> | SERVER
-------------+ | +----------+ +---------
|
FIREWALL (only port 22 is open)
----------------------------------------------------------------------
```
OptunAPI provides a very simple implementation of this scheme:
[`optunapi/tests/secured_client.py`](https://github.com/mbarbetti/optunapi/blob/main/tests/secured_client.py).
It is based on [sshtunnel](https://github.com/pahaz/sshtunnel/) and allows to submit a HTTP request to the
_private server_ after having specifying our SSH credentials (`ssh_username`, `ssh_pkey`).
```Python
import sshtunnel
import requests
with sshtunnel.open_tunnel (
(REMOTE_SERVER_IP, 22),
ssh_username = 'mbarbetti',
ssh_pkey = '/home/mbarbetti/.ssh/id_rsa',
remote_bind_address = (PRIVATE_SERVER_IP, PRIVATE_SERVER_PORT),
local_bind_address = ('127.0.0.1', 10022)
) as tunnel:
ping_server = requests.get ('http://localhost:10022/optunapi/ping')
ping_msg = ping_server.json()
print (ping_msg)
```
How to run the server in this case?
In this configuration the Optuna-server acts as _private server_,
then its IP and port are the ones declared within the `with` statement:
```
$ uvicorn server:optunapi --host PRIVATE_SERVER_IP --port PRIVATE_SERVER_PORT
```
## License
This project is licensed under the terms of the MIT license.