# AutoMLpy

This package is an automatic machine learning (AutoML) module whose purpose is to optimize the hyper-parameters of a machine learning model.

This package provides a grid search method, a random search algorithm and a Gaussian process search method.
Everything is implemented to be compatible with the _TensorFlow_, _PyTorch_ and _sklearn_ libraries.
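
As a quick preview of the API used throughout the MNIST example below, a typical random-search run looks roughly like this (a minimal sketch; `MyHpOptimizer` and the training dataset are placeholders for your own code):

```python
import numpy as np

from AutoMLpy import HpOptimizer, RandomHpSearch

# 1. Subclass HpOptimizer to describe how to build, fit and score a model
#    from a set of hyper-parameters (see the MNIST example below):
# class MyHpOptimizer(HpOptimizer):
#     def build_model(self, **hp): ...
#     def fit_dataset_model_(self, model, dataset, **hp): ...
#     def score_on_dataset(self, model, dataset, **hp): ...

# 2. Define the hyper-parameter space and the search budget.
hp_space = dict(
    learning_rate=np.linspace(1e-4, 1e-1, 50),
    epochs=list(range(1, 16)),
)
param_gen = RandomHpSearch(hp_space, max_seconds=60 * 10, max_itr=100)

# 3. Run the optimization and retrieve the best hyper-parameters found.
# param_gen = MyHpOptimizer().optimize_on_dataset(param_gen, train_dataset)
# best_hp = param_gen.get_best_param()
```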

# Installation

## Latest stable version:
```
pip install AutoMLpy
```

## Latest unstable version:
1. Download the .whl file [here](https://github.com/JeremieGince/AutoMLpy/blob/main/dist/AutoMLpy-0.0.3-py3-none-any.whl);
2. Copy the path of this file on your computer;
3. Install it with `pip install [path].whl`, as shown below.
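
For example, assuming the wheel was downloaded into the current working directory:
```
pip install ./AutoMLpy-0.0.3-py3-none-any.whl
```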

## With pip+git:
```
pip install git+https://github.com/JeremieGince/AutoMLpy
```

---------------------------------------------------------------------------
# Example - MNIST optimization with Tensorflow & Keras

Here you can see an example of how to optimize a model built with TensorFlow and Keras on the popular MNIST dataset.

## Imports

We start by importing some useful packages.

```python
# Some useful packages
from typing import Union, Tuple
import time
import numpy as np
import pandas as pd
import pprint

# Tensorflow
import tensorflow as tf
import tensorflow_datasets as tfds

# Importing the HPOptimizer and the RandomHpSearch from the AutoMLpy package.
from AutoMLpy import HpOptimizer, RandomHpSearch

```

## Dataset

Now we load the MNIST dataset, the TensorFlow way.

```python
def normalize_img(image, label):
    """Normalizes images: `uint8` -> `float32`."""
    return tf.cast(image, tf.float32) / 255., label


def get_tf_mnist_dataset(**kwargs):
    # https://www.tensorflow.org/datasets/keras_example
    (ds_train, ds_test), ds_info = tfds.load(
        'mnist',
        split=['train', 'test'],
        shuffle_files=True,
        as_supervised=True,
        with_info=True,
    )

    # Build training pipeline
    ds_train = ds_train.map(normalize_img, num_parallel_calls=tf.data.experimental.AUTOTUNE)
    ds_train = ds_train.cache()
    ds_train = ds_train.shuffle(ds_info.splits['train'].num_examples)
    ds_train = ds_train.batch(128)
    ds_train = ds_train.prefetch(tf.data.experimental.AUTOTUNE)

    # Build evaluation pipeline
    ds_test = ds_test.map(normalize_img, num_parallel_calls=tf.data.experimental.AUTOTUNE)
    ds_test = ds_test.batch(128)
    ds_test = ds_test.cache()
    ds_test = ds_test.prefetch(tf.data.experimental.AUTOTUNE)

    return ds_train, ds_test
```

## Keras Model

Now we make a function that returns a Keras model given a set of hyper-parameters (hp).

```python
def get_tf_mnist_model(**hp):
    if hp.get("use_conv", False):
        model = tf.keras.models.Sequential([
            # Convolution layers
            tf.keras.layers.Conv2D(10, 3, padding="same", input_shape=(28, 28, 1)),
            tf.keras.layers.MaxPool2D((2, 2)),
            tf.keras.layers.Conv2D(50, 3, padding="same"),
            tf.keras.layers.MaxPool2D((2, 2)),

            # Dense layers
            tf.keras.layers.Flatten(),
            tf.keras.layers.Dense(120, activation='relu'),
            tf.keras.layers.Dense(84, activation='relu'),
            tf.keras.layers.Dense(10)
        ])
    else:
        model = tf.keras.models.Sequential([
            tf.keras.layers.Flatten(input_shape=(28, 28)),
            tf.keras.layers.Dense(120, activation='relu'),
            tf.keras.layers.Dense(84, activation='relu'),
            tf.keras.layers.Dense(10)
        ])

    return model
```
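
For example, you can quickly sanity-check both architectures before starting the optimization (a small usage sketch, not part of the original example):

```python
# `use_conv` is the hyper-parameter consumed by get_tf_mnist_model above.
dense_model = get_tf_mnist_model(use_conv=False)
conv_model = get_tf_mnist_model(use_conv=True)

dense_model.summary()
conv_model.summary()
```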

## The Optimizer Model

It's time to implement the optimizer model. You just have to implement the following methods: `build_model`,
`fit_dataset_model_` and `score_on_dataset`. Those methods must respect their signatures and output types. The objective
here is to make the building, training and scoring phases depend on some hyper-parameters, so the optimizer can
use those to find the best set of hp.

```python
class KerasMNISTHpOptimizer(HpOptimizer):
    def build_model(self, **hp) -> tf.keras.Model:
        model = get_tf_mnist_model(**hp)

        model.compile(
            optimizer=tf.keras.optimizers.SGD(
                learning_rate=hp.get("learning_rate", 1e-3),
                nesterov=hp.get("nesterov", True),
                momentum=hp.get("momentum", 0.99),
            ),
            loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
            metrics=[tf.keras.metrics.SparseCategoricalAccuracy()],
        )
        return model

    def fit_dataset_model_(
            self,
            model: tf.keras.Model,
            dataset,
            **hp
    ) -> tf.keras.Model:
        history = model.fit(
            dataset,
            epochs=hp.get("epochs", 1),
            verbose=False,
        )
        return model

    def score_on_dataset(
            self,
            model: tf.keras.Model,
            dataset,
            **hp
    ) -> float:
        test_loss, test_acc = model.evaluate(dataset, verbose=0)
        return test_acc
```

## Execution & Optimization

The first thing to do after creating our classes is to load the dataset into memory.

```python
mnist_train, mnist_test = get_tf_mnist_dataset()
mnist_hp_optimizer = KerasMNISTHpOptimizer()
```

Next, you define your hyper-parameter space with a dictionary like this. The keys must match the hyper-parameters read with `hp.get(...)` in the optimizer and model functions above.

```python
hp_space = dict(
    epochs=list(range(1, 16)),
    learning_rate=np.linspace(1e-4, 1e-1, 50),
    nesterov=[True, False],
    momentum=np.linspace(0.01, 0.99, 50),
    use_conv=[True, False],
)
```

It's time to define your hp search algorithm and give it your budget in time and iterations. Here we will search for
at most 10 minutes and 100 iterations.

```python
param_gen = RandomHpSearch(hp_space, max_seconds=60*10, max_itr=100)
```

Finally, you start the optimization by giving your parameter generator to the optimize method. Note that the
`stop_criterion` argument stops the optimization as soon as the given score is reached; it's really useful to save some
time.

```python
save_kwargs = dict(
    save_name=f"tf_mnist_hp_opt",
    title="Random search: MNIST",
)

param_gen = mnist_hp_optimizer.optimize_on_dataset(
    param_gen, mnist_train, save_kwargs=save_kwargs,
    stop_criterion=1.0,
)
```

## Testing

Now, you can test the optimized hyper-parameters by fitting again on the full training dataset. Yes, on the full
dataset, because during the optimization phase a cross-validation is performed, which splits your training dataset in
half. Then, it's time to test the fitted model on the test dataset.

```python
opt_hp = param_gen.get_best_param()

model = mnist_hp_optimizer.build_model(**opt_hp)
mnist_hp_optimizer.fit_dataset_model_(
    model, mnist_train, **opt_hp
)

test_acc = mnist_hp_optimizer.score_on_dataset(
    model, mnist_test, **opt_hp
)

print(f"test_acc: {test_acc*100:.3f}%")
```

The optimized hyper-parameters:

```python
pp = pprint.PrettyPrinter(indent=4)
pp.pprint(opt_hp)
```

## Visualization

You can visualize the optimization with an interactive HTML file.

```python
fig = param_gen.write_optimization_to_html(show=True, dark_mode=True, **save_kwargs)
```

## Optimization table
```python
opt_table = param_gen.get_optimization_table()
```

## Saving ParameterGenerator
```python
param_gen.save_history(**save_kwargs)
save_path = param_gen.save_obj(**save_kwargs)
```

## Loading ParameterGenerator
```python
param_gen = RandomHpSearch.load_obj(save_path)
```

## Re-launch optimization with loaded ParameterGenerator
```python
# Increase the budget to be able to optimize again
param_gen.max_itr = param_gen.max_itr + 100
param_gen.max_seconds = param_gen.max_seconds + 60

param_gen = mnist_hp_optimizer.optimize_on_dataset(
    param_gen, mnist_train, save_kwargs=save_kwargs,
    stop_criterion=1.0, reset_gen=False,
)

opt_hp = param_gen.get_best_param()

print(param_gen.get_optimization_table())
pp.pprint(param_gen.history)
pp.pprint(opt_hp)
```

---------------------------------------------------------------------------
# Other examples
Examples of how to use this package are in the folder [./examples](https://github.com/JeremieGince/AutoMLpy/blob/main/examples).
There you can find the previous example with [_TensorFlow_](https://github.com/JeremieGince/AutoMLpy/blob/main/examples/tensorflow_example.ipynb)
and an example with [_PyTorch_](https://github.com/JeremieGince/AutoMLpy/blob/main/examples/pytorch_example.ipynb).

# License
[Apache License 2.0](LICENSE.md)

# Citation
```
@article{Gince,
title={Implémentation du module AutoMLpy, un outil d’apprentissage machine automatique},
author={Jérémie Gince},
year={2021},
publisher={ULaval},
url={https://github.com/JeremieGince/AutoMLpy},
}
```