{"id":25353569,"url":"https://github.com/lamalab-org/pyepal","last_synced_at":"2026-03-12T00:01:58.368Z","repository":{"id":37901289,"uuid":"253408969","full_name":"lamalab-org/pyepal","owner":"lamalab-org","description":"Multiobjective active learning with tunable accuracy/efficiency tradeoff and clear stopping criterion.","archived":false,"fork":false,"pushed_at":"2025-03-20T07:32:16.000Z","size":10983,"stargazers_count":41,"open_issues_count":25,"forks_count":6,"subscribers_count":2,"default_branch":"main","last_synced_at":"2026-03-02T13:38:16.678Z","etag":null,"topics":["active-learning","hacktoberfest","machine-learning","multiobjective","pareto","python"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/lamalab-org.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":"CITATION.cff","codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2020-04-06T06:02:07.000Z","updated_at":"2025-09-10T03:05:06.000Z","dependencies_parsed_at":"2023-11-13T22:23:05.210Z","dependency_job_id":"77b6b19c-db56-4c21-9429-2684fe6b05a5","html_url":"https://github.com/lamalab-org/pyepal","commit_stats":{"total_commits":329,"total_committers":8,"mean_commits":41.125,"dds":"0.31610942249240126","last_synced_commit":"c0b3ab02f5a8d3235023050b05182502f7a093de"},"previous_names":["kjappelbaum/pyepal"],"tags_count":17,"template":false,"template_full_name":null,"purl":"pkg:github/lamalab-org/pyepal","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lamalab-org%2Fpyepal","tags_url":"https://repos.eco
syste.ms/api/v1/hosts/GitHub/repositories/lamalab-org%2Fpyepal/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lamalab-org%2Fpyepal/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lamalab-org%2Fpyepal/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/lamalab-org","download_url":"https://codeload.github.com/lamalab-org/pyepal/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lamalab-org%2Fpyepal/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":30243779,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-08T00:58:18.660Z","status":"online","status_checked_at":"2026-03-08T02:00:06.215Z","response_time":56,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["active-learning","hacktoberfest","machine-learning","multiobjective","pareto","python"],"created_at":"2025-02-14T19:19:22.840Z","updated_at":"2026-03-12T00:01:58.345Z","avatar_url":"https://github.com/lamalab-org.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cp align=\"center\"\u003e\n \u003cimg src=\"pyepal_logo.png\" /\u003e\n\u003c/p\u003e\n\n|                            |                                                                                                                                                                            
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             |\n| -------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |\n| Continuous integration     | ![Python package](https://github.com/kjappelbaum/pyepal/workflows/Python%20package/badge.svg) ![pre-commit](https://github.com/kjappelbaum/pyepal/workflows/pre-commit/badge.svg)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      
                                 |\n| Code health                | [![Maintenance](https://img.shields.io/badge/Maintained%3F-yes-green.svg)](https://GitHub.com/Naereen/StrapDown.js/graphs/commit-activity) [![Maintainability](https://api.codeclimate.com/v1/badges/db9b3f21528574dfb141/maintainability)](https://codeclimate.com/github/kjappelbaum/pyepal/maintainability) [![Language grade: Python](https://img.shields.io/lgtm/grade/python/g/kjappelbaum/pyepal.svg?logo=lgtm\u0026logoWidth=18)](https://lgtm.com/projects/g/kjappelbaum/pyepal/context:python) ![GitHub last commit](https://img.shields.io/github/last-commit/kjappelbaum/pyepal) [![codecov](https://codecov.io/gh/kjappelbaum/pyepal/branch/master/graph/badge.svg?token=BL2CF4HQ06)](https://codecov.io/gh/kjappelbaum/pyepal) |\n| Documentation and tutorial | [![Documentation Status](https://readthedocs.org/projects/pyepal/badge/?version=latest)](https://pyepal.readthedocs.io/en/latest/?badge=latest) [![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/kjappelbaum/pyepal/HEAD?filepath=examples)                                                                                                                                                                                                                                                                                                                                                                                                                                                                  |\n| Social                     | [![Gitter](https://badges.gitter.im/kjappelbaum/pyepal.svg)](https://gitter.im/kjappelbaum/pyepal?utm_source=badge\u0026utm_medium=badge\u0026utm_campaign=pr-badge)                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                              |\n| Python                     | ![PyPI - Python Version](https://img.shields.io/pypi/pyversions/pyepal) [![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |\n| License                    | [![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    |\n| [Citation](#citation)      | [![Paper DOI](https://img.shields.io/badge/DOI-10.26434/chemrxiv.13200197.v1-blue.svg)](http://www.nature.com/articles/s41467-021-22437-0) [![Zenodo 
archive](https://zenodo.org/badge/253408969.svg)](https://zenodo.org/badge/latestdoi/253408969) |\n\nGeneralized Python implementation of a modified version of the ε-PAL algorithm [[1](#1), [2](#2)].\n\nFor more detailed docs [go here](https://pyepal.readthedocs.io/en/latest/?badge=latest).\n\n## Installation\n\nTo install the latest stable release use\n\n```bash\npip install pyepal\n```\n\nor the conda channel (recommended)\n\n```bash\nconda install pyepal -c conda-forge\n```\n\nTo install the latest development version from the head use\n\n```bash\npip install git+https://github.com/kjappelbaum/pyepal.git\n```\n\nDevelopers can install the extras `[testing, docs, pre-commit]`. Installation should take only a few minutes.\n\n### Additional Notes\n\n- On macOS you might need to install `libomp` (e.g., `brew install libomp`) for multithreading in some models.\n\n- We currently support Python 3.7 and 3.8.\n\n- If you want to [limit how many CPUs OpenBLAS uses](https://github.com/numpy/numpy/issues/8120), you can `export OPENBLAS_NUM_THREADS=1`.\n\n## Usage\n\nThe main logic is implemented in the `PALBase` class. 
There are some prebuilt classes for common use cases (`GPy`, `sklearn`) that inherit from this class.\nFor more details about how to use the code and notes about the tutorials [see the docs](https://kjappelbaum.github.io/pyepal/).\n\n### Pre-Built classes\n\n#### scikit-learn\n\nIf you want to use a list of [sklearn](https://scikit-learn.org/stable/index.html) models, you can use the `PALSklearn` class. To run a single step,\nfollow the code snippet below. The basic principle is the same for all the different `PAL` classes.\n\n```python\nimport numpy as np\nfrom pyepal import PALSklearn\nfrom sklearn.gaussian_process import GaussianProcessRegressor\nfrom sklearn.gaussian_process.kernels import RBF, Matern\n\n# For each objective, initialize a model\ngpr_objective_0 = GaussianProcessRegressor(RBF())\ngpr_objective_1 = GaussianProcessRegressor(RBF())\n\n# The minimal input to create a PAL instance is a list of models,\n# the design space (X, in ML terms \"feature matrix\") and the number of objectives\npalsklearn_instance = PALSklearn(X, [gpr_objective_0, gpr_objective_1], 2)\n\n# The next step is to provide some initial measurements.\n# You can do this with the update_train_set function, which you\n# can use throughout the active learning process to update the training set.\n# For this, provide a numpy array of indices in your design space\n# and the corresponding measurements\nsampled_indices = np.array([1,2,3])\nmeasurements = np.array([[1,2],\n                        [0.8, 1],\n                        [7,1]])\npalsklearn_instance.update_train_set(sampled_indices, measurements)\n\n# Now, you're ready to run the first iteration.\n# This will return the next index to sample and update all the attributes\n# If there are no unclassified samples left, it will return None and\n# print a statement saying that the classification is completed\nindex_to_sample = palsklearn_instance.run_one_step()\n```\n\n#### GPy\n\nIf you want to use a list of 
[GPy](https://sheffieldml.github.io/GPy/) models, you can use the `PALGPy` class.\n\n#### Coregionalized GPR\n\nCoregionalized GPR models can utilize correlations between the objectives and also work when some of the objectives are not measured for all samples.\n\n### Custom classes\n\nYou will need to implement the `_train()` and `_predict()` functions if you inherit from `PALBase`. If you want to tune the hyperparameters of your models while new training points are added, you can implement a schedule by implementing the `_should_optimize_hyperparameters()` function and the `_set_hyperparameters()` function, which sets the hyperparameters for the model(s).\n\nIf you need to train a model, use `self.design_space` as the feature matrix and `self.y` as the target vector. Note that in `self.y` all objectives are turned into maximization problems. That is, if one of your problems is a minimization problem, PyePAL will flip its sign in `self.y`.\n\nA basic example of how a custom class can be implemented is the `PALSklearn` class:\n\n```python\nclass PALSklearn(PALBase):\n    \"\"\"PAL class for a list of Sklearn (GPR) models, with one model per objective\"\"\"\n\n    def __init__(self, *args, **kwargs):\n        super().__init__(*args, **kwargs)\n\n        validate_number_models(self.models, self.ndim)\n\n    def _train(self):\n        for i, model in enumerate(self.models):\n            model.fit(self.design_space[self.sampled], self.y[self.sampled, i].reshape(-1,1))\n\n    def _predict(self):\n        means, stds = [], []\n        for model in self.models:\n            mean, std = model.predict(self.design_space, return_std=True)\n            means.append(mean.reshape(-1, 1))\n            stds.append(std.reshape(-1, 1))\n\n        # Stack the per-objective columns into (n_samples, n_objectives) arrays\n        self._means = np.hstack(means)\n        self.std = np.hstack(stds)\n```\n\nFor scheduling of the hyperparameter optimization, we have some predefined schedules in the `pyepal.pal.schedules` module.\n\n### Test the 
algorithms\n\nIf the full design space is known, you can use a while loop to fully explore the space with PyePAL.\nFor the theoretical guarantees of PyePAL to hold, you'll need to sample until all uncertainties are below epsilon. In practice, it is usually enough to stop once there are no unclassified samples left. For this, you can use the following snippet:\n\n```python\nimport numpy as np\nfrom pyepal import PALGPy\nfrom pyepal.utils import exhaust_loop\nfrom pyepal.models.gpr import build_model\n\n# X (design space) and y (objective values) are assumed to be defined\n\n# indices for initialization\nsample_idx = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 50, 60, 70])\n\n# build one model per objective\nmodel_0 = build_model(X[sample_idx], y[sample_idx], 0)\nmodel_1 = build_model(X[sample_idx], y[sample_idx], 1)\n\n# initialize the PAL instance\npalinstance = PALGPy(X, [model_0, model_1], 2, beta_scale=1)\npalinstance.update_train_set(sample_idx, y[sample_idx])\n\n# This will run the sampling and training as long as there\n# are unclassified samples\nexhaust_loop(palinstance, y)\n```\n\nTo measure the performance, you can use the `get_hypervolume` function from `pyepal.pal.utils`. More indicators are implemented in packages like [deap](https://github.com/DEAP/deap), [pagmo](https://github.com/esa/pagmo), or [pymoo](https://github.com/msu-coinlab/pymoo/tree/master).\n\n## References\n\n1. \u003ca name=\"1\"\u003e\u003c/a\u003e Zuluaga, M.; Krause, A.; Püschel, M. E-PAL: An Active Learning Approach to the Multi-Objective Optimization Problem. Journal of Machine Learning Research 2016, 17 (104), 1–32.\n2. \u003ca name=\"2\"\u003e\u003c/a\u003e Zuluaga, M.; Sergent, G.; Krause, A.; Püschel, M. Active Learning for Multi-Objective Optimization; Dasgupta, S., McAllester, D., Eds.; Proceedings of machine learning research; PMLR: Atlanta, Georgia, USA, 2013; Vol. 
28, pp 462–470.\n\n## Citation\n\n\u003ca name=\"citation\"\u003e\u003c/a\u003e\n\nIf you find this code useful for your work, please cite:\n\n- Our paper that describes the implementation and an application to materials discovery: [Jablonka, K. M.; Jothiappan, G. M.; Wang, S.; Smit, B.; Yoo, B. Bias Free Multiobjective Active Learning for Materials Design and Discovery. Nat Commun 2021, 12 (1), 2312.](http://www.nature.com/articles/s41467-021-22437-0)\n\n- The original paper that describes the ε-PAL algorithm: [Zuluaga, M.; Krause, A.; Püschel, M. E-PAL: An Active Learning Approach to the Multi-Objective Optimization Problem. Journal of Machine Learning Research 2016, 17 (104), 1–32.](https://jmlr.csail.mit.edu/papers/volume17/15-047/15-047.pdf)\n\n## Acknowledgments\n\nThe research was supported by the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme ([grant agreement 666983, MaGic](https://cordis.europa.eu/project/id/666983)), by the [NCCR-MARVEL](https://www.nccr-marvel.ch/), funded by the Swiss National Science Foundation, and by the Swiss National Science Foundation (SNSF) under Grant 200021_172759. Part of the work was performed as part of the [Explore Together internship program at BASF](https://www.basf.com/global/en/careers/students/explore-together.html).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flamalab-org%2Fpyepal","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flamalab-org%2Fpyepal","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flamalab-org%2Fpyepal/lists"}