{"id":13531884,"url":"https://github.com/automl/Auto-PyTorch","last_synced_at":"2025-04-01T20:30:39.314Z","repository":{"id":38237745,"uuid":"159791040","full_name":"automl/Auto-PyTorch","owner":"automl","description":"Automatic architecture search and hyperparameter optimization for PyTorch","archived":false,"fork":false,"pushed_at":"2024-04-09T06:17:08.000Z","size":20347,"stargazers_count":2436,"open_issues_count":75,"forks_count":297,"subscribers_count":46,"default_branch":"master","last_synced_at":"2025-03-25T20:06:41.229Z","etag":null,"topics":["automl","deep-learning","pytorch","tabular-data"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/automl.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":"CITATION.cff","codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-11-30T08:18:34.000Z","updated_at":"2025-03-21T18:33:25.000Z","dependencies_parsed_at":"2024-06-21T14:19:19.478Z","dependency_job_id":"05735f62-1cf0-434c-a73b-0f766a75eb6c","html_url":"https://github.com/automl/Auto-PyTorch","commit_stats":{"total_commits":224,"total_committers":16,"mean_commits":14.0,"dds":0.7633928571428572,"last_synced_commit":"56a2ac1d69c7c61a847c678879a67f5d3672b3e8"},"previous_names":[],"tags_count":5,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/automl%2FAuto-PyTorch","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/automl%2FAuto-PyTorch/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/automl%2FAuto-PyTorch/
releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/automl%2FAuto-PyTorch/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/automl","download_url":"https://codeload.github.com/automl/Auto-PyTorch/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246709922,"owners_count":20821296,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["automl","deep-learning","pytorch","tabular-data"],"created_at":"2024-08-01T07:01:06.557Z","updated_at":"2025-04-01T20:30:36.945Z","avatar_url":"https://github.com/automl.png","language":"Python","funding_links":[],"categories":["Uncategorized","Python","Automated Machine Learning","AutoML","Profiling","Deep Learning Framework","Scheduling","Tools and projects","AI Projects: Step by Steps","Libraries","Library","Auto-PyTorch"],"sub_categories":["Uncategorized","Others","Profiling","High-Level DL APIs","LLM","Library","3. Artificial Intelligence","Installing auto-sklearn"],"readme":"# Auto-PyTorch\n\nCopyright (C) 2021  [AutoML Groups Freiburg and Hannover](http://www.automl.org/)\n\nWhile early AutoML frameworks focused on optimizing traditional ML pipelines and their hyperparameters, another trend in AutoML is to focus on neural architecture search. 
To bring the best of these two worlds together, we developed **Auto-PyTorch**, which jointly and robustly optimizes the network architecture and the training hyperparameters to enable fully automated deep learning (AutoDL).\n\nAuto-PyTorch is mainly developed to support tabular data (classification, regression) and time series data (forecasting).\nThe newest features in Auto-PyTorch for tabular data are described in the paper [\"Auto-PyTorch Tabular: Multi-Fidelity MetaLearning for Efficient and Robust AutoDL\"](https://arxiv.org/abs/2006.13799) (see below for the BibTeX reference).\nDetails about Auto-PyTorch for multi-horizon time series forecasting tasks can be found in the paper [\"Efficient Automated Deep Learning for Time Series Forecasting\"](https://arxiv.org/abs/2205.05511) (also see below for the BibTeX reference).\n\nThe documentation is available [here](https://automl.github.io/Auto-PyTorch/master).\n\n\n***From v0.1.0, Auto-PyTorch has been updated to further improve usability, robustness and efficiency by using SMAC as the underlying optimization package as well as changing the code structure. Therefore, moving from v0.0.2 to v0.1.0 will break compatibility. 
\nIn case you would like to use the old API, you can find it at [`master-old`](https://github.com/automl/Auto-PyTorch/tree/master-old).***\n\n## Workflow\n\nThe following figure gives a rough overview of the Auto-PyTorch workflow.\n\n![AutoPyTorch Workflow](https://raw.githubusercontent.com/automl/Auto-PyTorch/master/figs/apt_workflow.png)\n\nIn the figure, **Data** is provided by the user and\n**Portfolio** is a set of neural network configurations that work well on diverse datasets.\nThe current version only supports the *greedy portfolio* as described in the paper *Auto-PyTorch Tabular: Multi-Fidelity MetaLearning for Efficient and Robust AutoDL*.\nThis portfolio is used to warm-start the optimization of SMAC.\nIn other words, the portfolio configurations are evaluated on the provided data as the initial configurations.\nThe API then runs the following steps:\n1. **Validate input data**: Process each data type, e.g. encoding categorical data, so that Auto-PyTorch can handle it.\n2. **Create dataset**: Create a dataset that can be handled by this API, with a choice of cross-validation or holdout splits.\n3. **Evaluate baselines**\n   * ***Tabular dataset*** *1: Train each algorithm in the predefined pool with a fixed hyperparameter configuration, plus a dummy model from `sklearn.dummy` that represents the worst possible performance.\n   * ***Time Series Forecasting dataset***: Train a dummy predictor that repeats the last observed value in each series.\n4. **Search by [SMAC](https://github.com/automl/SMAC3)**:\\\n    a. Determine budget and cut-off rules by [Hyperband](https://jmlr.org/papers/volume18/16-558/16-558.pdf)\\\n    b. Sample a pipeline hyperparameter configuration *2 by SMAC\\\n    c. Update the observations with the obtained results\\\n    d. Repeat a. -- c. until the budget runs out\n5. 
Build the best ensemble for the provided dataset from the observations via [ensemble selection](https://www.cs.cornell.edu/~caruana/ctp/ct.papers/caruana.icml04.icdm06long.pdf).\n\n*1: Baselines are a predefined pool of machine learning algorithms, e.g. LightGBM and support vector machines, that solve either a regression or a classification task on the provided dataset.\n\n*2: A pipeline hyperparameter configuration specifies the choice of components in each step, e.g. the target algorithm and the shape of the neural network, and their corresponding hyperparameters.\n\n## Installation\n\n### PyPI Installation\n\n```sh\n\npip install autoPyTorch\n\n```\n\nAuto-PyTorch for Time Series Forecasting requires additional dependencies:\n\n```sh\n\npip install autoPyTorch[forecasting]\n\n```\n\n### Manual Installation\n\nWe recommend using Anaconda for development, as follows:\n\n```sh\n# The following commands assume you are inside a cloned Auto-PyTorch directory\n\n# We also need to initialize the automl_common repository as follows\n# You can find more information about this here:\n# https://github.com/automl/automl_common/\ngit submodule update --init --recursive\n\n# Create the environment\nconda create -n auto-pytorch python=3.8\nconda activate auto-pytorch\nconda install swig\npython setup.py install\n\n```\n\nSimilarly, to install all the dependencies for Auto-PyTorch-TimeSeriesForecasting:\n\n\n```sh\n\ngit submodule update --init --recursive\n\nconda create -n auto-pytorch python=3.8\nconda activate auto-pytorch\nconda install swig\npip install -e .[forecasting]\n\n```\n\n## Examples\n\nIn a nutshell:\n\n```py\nfrom autoPyTorch.api.tabular_classification import TabularClassificationTask\n\n# data and metric imports\nimport sklearn.model_selection\nimport sklearn.datasets\nimport sklearn.metrics\nX, y = sklearn.datasets.load_digits(return_X_y=True)\nX_train, X_test, y_train, y_test = \\\n        
sklearn.model_selection.train_test_split(X, y, random_state=1)\n\n# initialise Auto-PyTorch api\napi = TabularClassificationTask()\n\n# Search for an ensemble of machine learning algorithms\napi.search(\n    X_train=X_train,\n    y_train=y_train,\n    X_test=X_test,\n    y_test=y_test,\n    optimize_metric='accuracy',\n    total_walltime_limit=300,\n    func_eval_time_limit_secs=50\n)\n\n# Calculate test accuracy\ny_pred = api.predict(X_test)\nscore = api.score(y_pred, y_test)\nprint(\"Accuracy score\", score)\n```\n\nFor time series forecasting tasks:\n```py\n\nfrom autoPyTorch.api.time_series_forecasting import TimeSeriesForecastingTask\n\n# data and metric imports\nfrom sktime.datasets import load_longley\ntargets, features = load_longley()\n\n# define the forecasting horizon\nforecasting_horizon = 3\n\n# A dataset optimized by APT-TS can be a list of np.ndarray / pd.DataFrame in which each element is one series,\n# or a single pd.DataFrame that also records the series\n# index information, i.e. which series each timestep belongs to. This id can be stored as the DataFrame's index or in a\n# separate column\n# Within each series, we take the last forecasting_horizon items as test targets and the items before them as training targets\n# Normally the values to be forecasted should directly follow the training set\ny_train = [targets[: -forecasting_horizon]]\ny_test = [targets[-forecasting_horizon:]]\n\n# same for features. For univariate models, X_train and X_test can be omitted or set to None\nX_train = [features[: -forecasting_horizon]]\n# Here X_test holds the 'known future features': features whose future values are known in advance. Features that are unknown\n# could be replaced with NaN or zeros (which will not be used by our networks). 
If no feature is known beforehand,\n# we could also omit X_test\nknown_future_features = list(features.columns)\nX_test = [features[-forecasting_horizon:]]\n\nstart_times = [targets.index.to_timestamp()[0]]\nfreq = '1Y'\n\n# initialise Auto-PyTorch api\napi = TimeSeriesForecastingTask()\n\n# Search for an ensemble of machine learning algorithms\napi.search(\n    X_train=X_train,\n    y_train=y_train,\n    X_test=X_test,\n    optimize_metric='mean_MAPE_forecasting',\n    n_prediction_steps=forecasting_horizon,\n    memory_limit=16 * 1024,  # Currently, forecasting models use much more memory\n    freq=freq,\n    start_times=start_times,\n    func_eval_time_limit_secs=50,\n    total_walltime_limit=60,\n    min_num_test_instances=1000,  # proxy validation sets. This only works for tasks with more than 1000 series\n    known_future_features=known_future_features,\n)\n\n# our dataset can directly generate the test sequences\ntest_sets = api.dataset.generate_test_seqs()\n\n# Calculate the test score\ny_pred = api.predict(test_sets)\nscore = api.score(y_pred, y_test)\nprint(\"Forecasting score\", score)\n```\n\nFor more examples, including customising the search space, parallelising the code, etc., check out the `examples` folder:\n\n```sh\n$ cd examples/\n```\n\n\nCode for the [paper](https://arxiv.org/abs/2006.13799) is available under `examples/ensemble` in the [TPAMI.2021.3067763](https://github.com/automl/Auto-PyTorch/tree/TPAMI.2021.3067763) branch.\n\n## Contributing\n\nIf you want to contribute to Auto-PyTorch, clone the repository and check out our current development branch:\n\n```sh\n$ git checkout development\n```\n\n## License\n\nThis program is free software: you can redistribute it and/or modify\nit under the terms of the Apache license 2.0 (please see the LICENSE file).\n\nThis program is distributed in the hope that it will be useful,\nbut WITHOUT ANY WARRANTY; without even the implied warranty of\nMERCHANTABILITY or FITNESS FOR A PARTICULAR 
PURPOSE.\n\nYou should have received a copy of the Apache license 2.0\nalong with this program (see LICENSE file).\n\n## Reference\n\nPlease refer to the branch `TPAMI.2021.3067763` to reproduce the paper *Auto-PyTorch Tabular: Multi-Fidelity MetaLearning for Efficient and Robust AutoDL*.\n\n```bibtex\n@article{zimmer-tpami21a,\n  author = {Lucas Zimmer and Marius Lindauer and Frank Hutter},\n  title = {Auto-PyTorch Tabular: Multi-Fidelity MetaLearning for Efficient and Robust AutoDL},\n  journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},\n  year = {2021},\n  note = {also available under https://arxiv.org/abs/2006.13799},\n  pages = {3079--3090}\n}\n```\n\n```bibtex\n@incollection{mendoza-automlbook18a,\n  author    = {Hector Mendoza and Aaron Klein and Matthias Feurer and Jost Tobias Springenberg and Matthias Urban and Michael Burkart and Max Dippel and Marius Lindauer and Frank Hutter},\n  title     = {Towards Automatically-Tuned Deep Neural Networks},\n  year      = {2018},\n  month     = dec,\n  editor    = {Hutter, Frank and Kotthoff, Lars and Vanschoren, Joaquin},\n  booktitle = {AutoML: Methods, Systems, Challenges},\n  publisher = {Springer},\n  chapter   = {7},\n  pages     = {141--156}\n}\n```\n\n```bibtex\n@article{deng-ecml22,\n  author    = {Difan Deng and Florian Karl and Frank Hutter and Bernd Bischl and Marius Lindauer},\n  title     = {Efficient Automated Deep Learning for Time Series Forecasting},\n  year      = {2022},\n  booktitle = {Machine Learning and Knowledge Discovery in Databases. 
Research Track\n               - European Conference, {ECML} {PKDD} 2022},\n  url       = {https://doi.org/10.48550/arXiv.2205.05511},\n}\n```\n\n## Contact\n\nAuto-PyTorch is developed by the [AutoML Groups of the University of Freiburg and Hannover](http://www.automl.org/).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fautoml%2FAuto-PyTorch","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fautoml%2FAuto-PyTorch","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fautoml%2FAuto-PyTorch/lists"}