{"id":21026633,"url":"https://github.com/jeremiegince/automlpy","last_synced_at":"2025-05-15T10:31:45.124Z","repository":{"id":46208466,"uuid":"331018987","full_name":"JeremieGince/AutoMLpy","owner":"JeremieGince","description":"This package is an automatic machine learning module whose function is to optimize the hyper-parameters of an automatic learning model.","archived":false,"fork":false,"pushed_at":"2021-11-24T16:06:22.000Z","size":14037,"stargazers_count":3,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-04-03T07:42:55.375Z","etag":null,"topics":["automl","deep-learning","gaussian-processes","grid-search-hyperparameters","machine-learning","multiprocessing","python3","pytorch","random-search","sklearn","tensorflow"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/JeremieGince.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2021-01-19T15:09:28.000Z","updated_at":"2023-02-07T16:14:17.000Z","dependencies_parsed_at":"2022-07-23T10:34:35.132Z","dependency_job_id":null,"html_url":"https://github.com/JeremieGince/AutoMLpy","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JeremieGince%2FAutoMLpy","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JeremieGince%2FAutoMLpy/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JeremieGince%2FAutoMLpy/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositori
es/JeremieGince%2FAutoMLpy/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/JeremieGince","download_url":"https://codeload.github.com/JeremieGince/AutoMLpy/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254322939,"owners_count":22051688,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["automl","deep-learning","gaussian-processes","grid-search-hyperparameters","machine-learning","multiprocessing","python3","pytorch","random-search","sklearn","tensorflow"],"created_at":"2024-11-19T11:45:33.906Z","updated_at":"2025-05-15T10:31:42.683Z","avatar_url":"https://github.com/JeremieGince.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003c!--- # \u003cp align=\"center\"\u003e AutoMLpy \u003c/p\u003e) --\u003e\n\n\u003cp align=\"center\"\u003e \u003cimg width=\"900\" height=\"400\" src=\"https://github.com/JeremieGince/AutoMLpy/blob/main/images/logo_001.png?raw=true\"\u003e \u003c/p\u003e\n\n---------------------------------------------------------------------------\n\nThis package is an automatic machine learning module whose function is to optimize the hyper-parameters \nof an automatic learning model. \n\nIn this package you can find: a grid search method, a random search algorithm and a Gaussian process search method. \nEverything is implemented to be compatible with the _Tensorflow_, _pyTorch_ and _sklearn_ libraries. \n\n\n# Installation\n\n## Latest stable version:\n```\npip install AutoMLpy\n```\n\n## Latest unstable version:\n0. 
Download the .whl file [here](https://github.com/JeremieGince/AutoMLpy/blob/main/dist/AutoMLpy-0.0.3-py3-none-any.whl);\n1. Copy the path of this file on your computer;\n2. pip install it with ``` pip install [path].whl ```\n\n## With pip+git:\n```\npip install git+https://github.com/JeremieGince/AutoMLpy\n```\n \n ---------------------------------------------------------------------------\n# Example - MNIST optimization with Tensorflow \u0026 Keras\n\nHere you can see an example on how to optimize a model made with Tensorflow and Keras on the popular dataset MNIST.\n\n## Imports\n\nWe start by importing some useful stuff.\n\n\n```python\n# Some useful packages\nfrom typing import Union, Tuple\nimport time\nimport numpy as np\nimport pandas as pd\nimport pprint\n\n# Tensorflow\nimport tensorflow as tf\nimport tensorflow_datasets as tfds\n\n# Importing the HPOptimizer and the RandomHpSearch from the AutoMLpy package.\nfrom AutoMLpy import HpOptimizer, RandomHpSearch\n\n```\n\n## Dataset\n\nNow we load the MNIST dataset in the tensorflow way.\n\n\n```python\ndef normalize_img(image, label):\n    \"\"\"Normalizes images: `uint8` -\u003e `float32`.\"\"\"\n    return tf.cast(image, tf.float32) / 255., label\n\ndef get_tf_mnist_dataset(**kwargs):\n    # https://www.tensorflow.org/datasets/keras_example\n    (ds_train, ds_test), ds_info = tfds.load(\n        'mnist',\n        split=['train', 'test'],\n        shuffle_files=True,\n        as_supervised=True,\n        with_info=True,\n    )\n\n    # Build training pipeline\n    ds_train = ds_train.map(normalize_img, num_parallel_calls=tf.data.experimental.AUTOTUNE)\n    ds_train = ds_train.cache()\n    ds_train = ds_train.shuffle(ds_info.splits['train'].num_examples)\n    ds_train = ds_train.batch(128)\n    ds_train = ds_train.prefetch(tf.data.experimental.AUTOTUNE)\n\n    # Build evaluation pipeline\n    ds_test = ds_test.map(normalize_img, num_parallel_calls=tf.data.experimental.AUTOTUNE)\n    ds_test = 
ds_test.batch(128)\n    ds_test = ds_test.cache()\n    ds_test = ds_test.prefetch(tf.data.experimental.AUTOTUNE)\n\n    return ds_train, ds_test\n```\n\n## Keras Model\n\nNow we make a function that returns a Keras model given a set of hyper-parameters (hp).\n\n\n```python\ndef get_tf_mnist_model(**hp):\n\n    if hp.get(\"use_conv\", False):\n        model = tf.keras.models.Sequential([\n            # Convolution layers\n            tf.keras.layers.Conv2D(10, 3, padding=\"same\", input_shape=(28, 28, 1)),\n            tf.keras.layers.MaxPool2D((2, 2)),\n            tf.keras.layers.Conv2D(50, 3, padding=\"same\"),\n            tf.keras.layers.MaxPool2D((2, 2)),\n\n            # Dense layers\n            tf.keras.layers.Flatten(),\n            tf.keras.layers.Dense(120, activation='relu'),\n            tf.keras.layers.Dense(84, activation='relu'),\n            tf.keras.layers.Dense(10)\n        ])\n    else:\n        model = tf.keras.models.Sequential([\n            tf.keras.layers.Flatten(input_shape=(28, 28)),\n            tf.keras.layers.Dense(120, activation='relu'),\n            tf.keras.layers.Dense(84, activation='relu'),\n            tf.keras.layers.Dense(10)\n        ])\n\n    return model\n\n```\n\n## The Optimizer Model\n\nIt's time to implement the optimizer model. You just have to implement the following methods: \"build_model\",\n\"fit_dataset_model_\" and \"score_on_dataset\". Those methods must respect their signatures and output types. The objective \nhere is to make the building, training and scoring phases depend on some hyper-parameters, 
so that the optimizer can \nuse them to find the best set of hp.\n\n\n```python\nclass KerasMNISTHpOptimizer(HpOptimizer):\n    def build_model(self, **hp) -\u003e tf.keras.Model:\n        model = get_tf_mnist_model(**hp)\n\n        model.compile(\n            optimizer=tf.keras.optimizers.SGD(\n                learning_rate=hp.get(\"learning_rate\", 1e-3),\n                nesterov=hp.get(\"nesterov\", True),\n                momentum=hp.get(\"momentum\", 0.99),\n            ),\n            loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),\n            metrics=[tf.keras.metrics.SparseCategoricalAccuracy()],\n        )\n        return model\n\n    def fit_dataset_model_(\n            self,\n            model: tf.keras.Model,\n            dataset,\n            **hp\n    ) -\u003e tf.keras.Model:\n        model.fit(\n            dataset,\n            epochs=hp.get(\"epochs\", 1),\n            verbose=False,\n        )\n        return model\n\n    def score_on_dataset(\n            self,\n            model: tf.keras.Model,\n            dataset,\n            **hp\n    ) -\u003e float:\n        test_loss, test_acc = model.evaluate(dataset, verbose=0)\n        return test_acc\n\n```\n\n## Execution \u0026 Optimization\n\nThe first thing to do after creating our classes is to load the dataset in memory and instantiate the optimizer.\n\n\n```python\nmnist_train, mnist_test = get_tf_mnist_dataset()\nmnist_hp_optimizer = KerasMNISTHpOptimizer()\n```\n\nNext, define your hyper-parameter space with a dictionary like this.\n\n\n```python\nhp_space = dict(\n    epochs=list(range(1, 16)),\n    learning_rate=np.linspace(1e-4, 1e-1, 50),\n    nesterov=[True, False],\n    momentum=np.linspace(0.01, 0.99, 50),\n    use_conv=[True, False],\n)\n```\n\nIt's time to define your hp search algorithm and give it your budget in time and iterations. 
Here we allow a maximum of \n10 minutes and 100 iterations.\n\n\n```python\nparam_gen = RandomHpSearch(hp_space, max_seconds=60*10, max_itr=100)\n```\n\nFinally, you start the optimization by passing your parameter generator to the \"optimize_on_dataset\" method. Note that the \n\"stop_criterion\" argument stops the optimization as soon as the given score is reached, which can save some \ntime.\n\n\n```python\nsave_kwargs = dict(\n    save_name=\"tf_mnist_hp_opt\",\n    title=\"Random search: MNIST\",\n)\n\nparam_gen = mnist_hp_optimizer.optimize_on_dataset(\n    param_gen, mnist_train, save_kwargs=save_kwargs,\n    stop_criterion=1.0,\n)\n```\n    \n\n## Testing\n\nNow you can test the optimized hyper-parameters by fitting again on the full train dataset. The full \ndataset is used because the optimization phase performs a cross-validation that cuts your train dataset in half. \nThe fitted model is then evaluated on the test dataset.\n\n\n```python\nopt_hp = param_gen.get_best_param()\n\nmodel = mnist_hp_optimizer.build_model(**opt_hp)\nmnist_hp_optimizer.fit_dataset_model_(\n    model, mnist_train, **opt_hp\n)\n\ntest_acc = mnist_hp_optimizer.score_on_dataset(\n    model, mnist_test, **opt_hp\n)\n\nprint(f\"test_acc: {test_acc*100:.3f}%\")\n```\n    \n\nThe optimized hyper-parameters:\n\n\n```python\npp = pprint.PrettyPrinter(indent=4)\npp.pprint(opt_hp)\n```\n    \n\n## Visualization\n\nYou can visualize the optimization with an interactive HTML file.\n\n\n```python\nfig = param_gen.write_optimization_to_html(show=True, dark_mode=True, **save_kwargs)\n```\n\n## Optimization table\n```python\nopt_table = param_gen.get_optimization_table()\n```\n\n## Saving ParameterGenerator\n```python\nparam_gen.save_history(**save_kwargs)\nsave_path = param_gen.save_obj(**save_kwargs)\n```\n\n## Loading ParameterGenerator\n```python\nparam_gen = RandomHpSearch.load_obj(save_path)\n```\n\n## Re-launch optimization with loaded ParameterGenerator\n```python\n# Change 
the budget to be able to optimize again\nparam_gen.max_itr = param_gen.max_itr + 100\nparam_gen.max_seconds = param_gen.max_seconds + 60\n\nparam_gen = mnist_hp_optimizer.optimize_on_dataset(\n    param_gen, mnist_train, save_kwargs=save_kwargs,\n    stop_criterion=1.0, reset_gen=False,\n)\n\nopt_hp = param_gen.get_best_param()\n\nprint(param_gen.get_optimization_table())\npp.pprint(param_gen.history)\npp.pprint(opt_hp)\n```\n \n ---------------------------------------------------------------------------\n # Other examples\n Examples of how to use this package are in the folder [./examples](https://github.com/JeremieGince/AutoMLpy/blob/main/examples). \n There you can find the previous example with [_Tensorflow_](https://github.com/JeremieGince/AutoMLpy/blob/main/examples/tensorflow_example.ipynb) \n and an example with [_pyTorch_](https://github.com/JeremieGince/AutoMLpy/blob/main/examples/pytorch_example.ipynb).\n \n\n\n# License\n[Apache License 2.0](LICENSE.md)\n\n# Citation\n```\n@article{Gince,\n  title={Implémentation du module AutoMLpy, un outil d’apprentissage machine automatique},\n  author={Jérémie Gince},\n  year={2021},\n  publisher={ULaval},\n  url={https://github.com/JeremieGince/AutoMLpy},\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjeremiegince%2Fautomlpy","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjeremiegince%2Fautomlpy","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjeremiegince%2Fautomlpy/lists"}