{"id":28386952,"url":"https://github.com/thieu1995/metasklearn","last_synced_at":"2025-10-20T02:51:38.171Z","repository":{"id":292167124,"uuid":"979534427","full_name":"thieu1995/MetaSklearn","owner":"thieu1995","description":"MetaSklearn: A Metaheuristic-Powered Hyperparameter Optimization Framework for Scikit-Learn Models.","archived":false,"fork":false,"pushed_at":"2025-06-04T16:31:52.000Z","size":140,"stargazers_count":3,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-06-25T09:09:18.815Z","etag":null,"topics":["adaboost","bayesian-optimization","decision-tree","gridsearchcv","hyperparameter-optimization","hyperparameter-tuning","knn","metaheuristic-search","nature-inspired-algorithms","random-forest","randomized-search","scikit-compatible","scikit-learn","svm","xgboost"],"latest_commit_sha":null,"homepage":"https://metasklearn.readthedocs.io/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/thieu1995.png","metadata":{"files":{"readme":"README.md","changelog":"ChangeLog.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":"CITATION.cff","codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-05-07T16:59:15.000Z","updated_at":"2025-06-04T16:28:07.000Z","dependencies_parsed_at":null,"dependency_job_id":"e6831349-cd01-4eb0-9b48-4ae99446cdc6","html_url":"https://github.com/thieu1995/MetaSklearn","commit_stats":null,"previous_names":["thieu1995/metasklearn"],"tags_count":2,"template":false,"template_full_name":null,"purl":"pkg:github/thieu1995/MetaSklearn","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/thieu1995
%2FMetaSklearn","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/thieu1995%2FMetaSklearn/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/thieu1995%2FMetaSklearn/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/thieu1995%2FMetaSklearn/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/thieu1995","download_url":"https://codeload.github.com/thieu1995/MetaSklearn/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/thieu1995%2FMetaSklearn/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":262132589,"owners_count":23264025,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["adaboost","bayesian-optimization","decision-tree","gridsearchcv","hyperparameter-optimization","hyperparameter-tuning","knn","metaheuristic-search","nature-inspired-algorithms","random-forest","randomized-search","scikit-compatible","scikit-learn","svm","xgboost"],"created_at":"2025-05-30T16:08:23.809Z","updated_at":"2025-10-20T02:51:33.122Z","avatar_url":"https://github.com/thieu1995.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# MetaSklearn: A Metaheuristic-Powered Hyperparameter Optimization Framework for Scikit-Learn Models.\n\n[![GitHub release](https://img.shields.io/badge/release-0.3.0-yellow.svg)](https://github.com/thieu1995/MetaSklearn/releases)\n[![PyPI version](https://badge.fury.io/py/metasklearn.svg)](https://badge.fury.io/py/metasklearn)\n![PyPI - Python 
Version](https://img.shields.io/pypi/pyversions/metasklearn.svg)\n![PyPI - Downloads](https://img.shields.io/pypi/dm/metasklearn.svg)\n[![Downloads](https://pepy.tech/badge/metasklearn)](https://pepy.tech/project/metasklearn)\n[![Tests \u0026 Publishes to PyPI](https://github.com/thieu1995/MetaSklearn/actions/workflows/publish-package.yml/badge.svg)](https://github.com/thieu1995/MetaSklearn/actions/workflows/publish-package.yml)\n[![Documentation Status](https://readthedocs.org/projects/metasklearn/badge/?version=latest)](https://metasklearn.readthedocs.io/en/latest/?badge=latest)\n[![Chat](https://img.shields.io/badge/Chat-on%20Telegram-blue)](https://t.me/+fRVCJGuGJg1mNDg1)\n[![DOI](https://img.shields.io/badge/DOI-10.6084%2Fm9.figshare.28978805-blue)](https://doi.org/10.6084/m9.figshare.28978805)\n[![License: GPL v3](https://img.shields.io/badge/License-GPLv3-blue.svg)](https://www.gnu.org/licenses/gpl-3.0)\n\n\n---\n\n## 🌟 Overview\n\n**MetaSklearn** is a flexible and extensible Python library that brings metaheuristic optimization to \nhyperparameter tuning of `scikit-learn` models. 
It provides a seamless interface to optimize hyperparameters \nusing nature-inspired algorithms from the [Mealpy](https://github.com/thieu1995/mealpy) library.\nIt is designed to be user-friendly and efficient, making it easy to integrate into your machine learning workflow.\n\n\n## 🚀 Features\n\n- ✅ Hyperparameter optimization by **metaheuristic algorithms** with [`mealpy`](https://github.com/thieu1995/mealpy).\n- ✅ Compatible with any **scikit-learn** model (SVM, RandomForest, XGBoost, etc.)\n- ✅ Supports **classification** and **regression** tasks\n- ✅ Custom and scikit-learn scoring support\n- ✅ Integration with [`PerMetrics`](https://github.com/thieu1995/permetrics) for rich evaluation metrics\n- ✅ Scikit-learn compatible API: `.fit()`, `.predict()`, `.score()`\n\n\n## 📦 Installation\n\nInstall the latest version using pip:\n\n```bash\npip install metasklearn\n```\n\nAfter that, check the version to ensure successful installation:\n\n```sh\n$ python\n\u003e\u003e\u003e import metasklearn\n\u003e\u003e\u003e metasklearn.__version__\n```\n\n## 🧠 How It Works\n\n`MetaSklearn` defines a custom `MetaSearchCV` class that wraps your model and performs hyperparameter tuning using \nany optimizer supported by Mealpy. 
The framework evaluates model performance using either \nscikit-learn’s metrics or additional ones from the `PerMetrics` library.\n\n\n## 🚀 Quick Start\n\n#### 📘 Example with SVM model for regression task\n\n\n```python\nfrom sklearn.svm import SVR\nfrom sklearn.datasets import load_diabetes\nfrom metasklearn import MetaSearchCV, FloatVar, StringVar, Data\n\n## Load data object\nX, y = load_diabetes(return_X_y=True)\ndata = Data(X, y)\n\n## Split train and test\ndata.split_train_test(test_size=0.2, random_state=42, inplace=True)\nprint(data.X_train.shape, data.X_test.shape)\n\n## Scaling dataset\ndata.X_train, scaler_X = data.scale(data.X_train, scaling_methods=(\"standard\", \"minmax\"))\ndata.X_test = scaler_X.transform(data.X_test)\n\ndata.y_train, scaler_y = data.scale(data.y_train, scaling_methods=(\"standard\", \"minmax\"))\ndata.y_train = data.y_train.ravel()\ndata.y_test = scaler_y.transform(data.y_test.reshape(-1, 1)).ravel()\n\n# Define param bounds for SVR\n\n# param_bounds = {          ==\u003e GridSearchCV-style grid, shown for comparison with the MetaSearchCV bounds below\n#     \"C\": [0.1, 100],\n#     \"gamma\": [1e-4, 1],\n#     \"kernel\": [\"linear\", \"rbf\", \"poly\"]\n# }\n\nparam_bounds = [\n    FloatVar(lb=0.1, ub=100., name=\"C\"),  # C must be strictly positive\n    FloatVar(lb=1e-4, ub=1., name=\"gamma\"),\n    StringVar(valid_sets=(\"linear\", \"rbf\", \"poly\"), name=\"kernel\")\n]\n\n# Initialize and fit MetaSearchCV\nsearcher = MetaSearchCV(\n    estimator=SVR(),\n    param_bounds=param_bounds,\n    task_type=\"regression\",\n    optim=\"BaseGA\",\n    optim_params={\"epoch\": 20, \"pop_size\": 30, \"name\": \"GA\"},\n    cv=3,\n    scoring=\"MSE\",  # or another regression metric such as \"RMSE\"\n    seed=42,\n    n_jobs=2,\n    verbose=True,\n    mode='single', n_workers=None, termination=None\n)\n\nsearcher.fit(data.X_train, data.y_train)\nprint(\"Best parameters (Regression):\", searcher.best_params)\nprint(\"Best model: \", searcher.best_estimator)\nprint(\"Best score during 
searching: \", searcher.best_score)\n\n# Make prediction after re-fit\ny_pred = searcher.predict(data.X_test)\nprint(\"Test score:\", searcher.score(data.X_test, data.y_test))\nprint(\"Test metrics: \", searcher.scores(data.X_test, data.y_test, list_metrics=(\"RMSE\", \"R\", \"KGE\", \"NNSE\")))\n```\n\n#### 📘 Example with SVM model for classification task\n\n```python\nfrom sklearn.svm import SVC\nfrom sklearn.datasets import load_iris\nfrom sklearn.model_selection import train_test_split\nfrom metasklearn import MetaSearchCV, FloatVar, StringVar\n\n# Load dataset\nX, y = load_iris(return_X_y=True)\nX_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)\n\n# Define param bounds for SVC\n\n# param_bounds = {          ==\u003e GridSearchCV-style grid, shown for comparison with the MetaSearchCV bounds below\n#     \"C\": [0.1, 100],\n#     \"gamma\": [1e-4, 1],\n#     \"kernel\": [\"linear\", \"rbf\", \"poly\"]\n# }\n\nparam_bounds = [\n    FloatVar(lb=0.1, ub=100., name=\"C\"),  # C must be strictly positive\n    FloatVar(lb=1e-4, ub=1., name=\"gamma\"),\n    StringVar(valid_sets=(\"linear\", \"rbf\", \"poly\"), name=\"kernel\")\n]\n\n# Initialize and fit MetaSearchCV\nsearcher = MetaSearchCV(\n    estimator=SVC(),\n    param_bounds=param_bounds,\n    task_type=\"classification\",\n    optim=\"BaseGA\",\n    optim_params={\"epoch\": 20, \"pop_size\": 30, \"name\": \"GA\"},\n    cv=3,\n    scoring=\"AS\",  # or any custom scoring like \"F1_macro\"\n    seed=42,\n    n_jobs=2,\n    verbose=True,\n    mode='single', n_workers=None, termination=None\n)\n\nsearcher.fit(X_train, y_train)\nprint(\"Best parameters (Classification):\", searcher.best_params)\nprint(\"Best model: \", searcher.best_estimator)\nprint(\"Best score during searching: \", searcher.best_score)\n\n# Make prediction after re-fit\ny_pred = searcher.predict(X_test)\nprint(\"Test Accuracy:\", searcher.score(X_test, y_test))\nprint(\"Test Score: \", searcher.scores(X_test, y_test, list_metrics=(\"AS\", \"RS\", 
\"PS\", \"F1S\")))\n```\n\nAs shown above, `MetaSearchCV` is used like any other scikit-learn estimator, and the same pattern applies to other models such as Random Forest, Decision Tree, or XGBoost.\n\n## 📋 Parameters - Variable Types in MetaSearchCV. How to choose them?\n\nThis section explains how to use the different variable types provided by the `MetaSklearn` library when defining hyperparameter \nsearch spaces for `MetaSearchCV`. Each variable type suits a different kind of hyperparameter.\n\n#### 1. `IntegerVar` – Integer Variable\n```python\nfrom metasklearn import IntegerVar\n\nvar = IntegerVar(lb=1, ub=100, name=\"n_estimators\")\n```\nUsed for discrete numerical parameters such as the number of neighbors in KNN or the number of estimators in an ensemble.\n\n#### 2. `FloatVar` – Float/Continuous Variable\n```python\nfrom metasklearn import FloatVar\n\nvar = FloatVar(lb=0.001, ub=1.0, name=\"learning_rate\")\n```\nUsed for continuous numerical parameters such as `learning_rate`, `C`, `gamma`, etc.\n\n#### 3. `StringVar` – Categorical/String Variable\n```python\nfrom metasklearn import StringVar\n\nvar = StringVar(valid_sets=(\"linear\", \"poly\", \"rbf\"), name=\"kernel\")\n```\nUsed for string parameters with a limited number of choices, e.g., `kernel` in SVM. `None` can also be included as a valid choice.\n\n#### 4. `BinaryVar` – Binary Variable (0 or 1)\n```python\nfrom metasklearn import BinaryVar\n\nvar = BinaryVar(n_vars=1, name=\"feature_selected\")\n```\nUsed in binary feature selection problems or any 0/1-based decision.\n\n#### 5. `BoolVar` – Boolean Variable (True or False)\n```python\nfrom metasklearn import BoolVar\n\nvar = BoolVar(n_vars=1, name=\"use_bias\")\n```\nUsed for Boolean-type arguments such as `fit_intercept`, `use_bias`, etc.\n\n#### 6. 
`CategoricalVar` – A set of mixed discrete variables such as int, float, string, None\n```python\nfrom metasklearn import CategoricalVar\n\nvar = CategoricalVar(valid_sets=((3., None, \"alpha\"), (5, 12, 32), (\"auto\", \"exp\", \"sin\")), name=\"categorical\")\n```\n\nThis type of variable is useful when a hyperparameter takes its value from a predefined set \nthat mixes several types (int, string, bool, float, ...).\n\n#### 7. `SequenceVar` – Variables as tuple, list, or set\n```python\nfrom metasklearn import SequenceVar\n\nvar = SequenceVar(valid_sets=((10, ), (20, 15), (30, 10, 5)), return_type=list, name=\"hidden_layer_sizes\")\n```\n\nThis type of variable is useful for defining hyperparameters that represent sequences, such as the sizes of hidden layers in a neural network.\n\n#### 8. `PermutationVar` – Permutation Variable\n```python\nfrom metasklearn import PermutationVar\n\nvar = PermutationVar(valid_set=(1, 2, 5, 10), name=\"job_order\")\n```\nUsed for optimization problems involving permutations, like scheduling or routing.\n\n#### 9. `TransferBinaryVar` – Transfer Binary Variable\n```python\nfrom metasklearn import TransferBinaryVar\n\nvar = TransferBinaryVar(n_vars=1, tf_func=\"vstf_01\", lb=-8., ub=8., all_zeros=True, name=\"transfer_binary\")\n```\nUsed in binary search spaces that support transformation-based metaheuristics.\n\n#### 10. 
`TransferBoolVar` – Transfer Boolean Variable\n```python\nfrom metasklearn import TransferBoolVar\n\nvar = TransferBoolVar(n_vars=1, tf_func=\"vstf_01\", lb=-8., ub=8., name=\"transfer_bool\")\n```\nUsed in Boolean search spaces with transferable logic between states.\n\n#### 🔧 Example: Define a Mixed Search Space\n\n```python\nfrom metasklearn import (IntegerVar, FloatVar, StringVar, BinaryVar, BoolVar, \n        PermutationVar, CategoricalVar, SequenceVar, TransferBinaryVar, TransferBoolVar)\n\nparam_bounds = [\n    IntegerVar(lb=1, ub=20, name=\"n_neighbors\"),\n    FloatVar(lb=0.001, ub=1.0, name=\"alpha\"),\n    StringVar(valid_sets=[\"uniform\", \"distance\"], name=\"weights\"),\n    BinaryVar(name=\"use_feature\"),\n    BoolVar(name=\"fit_bias\"),\n    PermutationVar(valid_set=(1, 2, 5, 10), name=\"job_order\"),\n    CategoricalVar(valid_sets=[0.1, \"relu\", False, None, 3], name=\"activation_choice\"),\n    SequenceVar(valid_sets=((10,), (20, 10), (30, 50, 5)), name=\"mixed_choice\"),\n    TransferBinaryVar(name=\"bin_transfer\"),\n    TransferBoolVar(name=\"bool_transfer\")\n]\n```\nUse this format when designing hyperparameter spaces for advanced models in `MetaSearchCV`.\n\n\n## ⚙ Supported Optimizers\n\n`MetaSklearn` integrates all metaheuristic algorithms from Mealpy, including:\n\n+ AOA (Arithmetic Optimization Algorithm)\n+ GWO (Grey Wolf Optimizer)\n+ PSO (Particle Swarm Optimization)\n+ DE (Differential Evolution)\n+ WOA, SSA, MVO, and many more...\n\nYou can pass any optimizer name or an instantiated optimizer object to MetaSearchCV. 
For more details, please refer to the [link](https://mealpy.readthedocs.io/en/latest/pages/support.html#classification-table)\n\n\n## 📊 Custom Metrics\nYou can use custom scoring functions from:\n\n+ sklearn.metrics.get_scorer_names()\n\n+ permetrics.RegressionMetric and ClassificationMetric\n\nFor details on `PerMetrics` library, please refer to the [link](https://permetrics.readthedocs.io/en/latest/pages/support.html#all-performance-metrics)\n\n\n## 📚 Documentation\n\nDocumentation is available at: 👉 https://metasklearn.readthedocs.io\n\nYou can build the documentation locally:\n\n```shell\ncd docs\nmake html\n```\n\n## 🧪 Testing\nYou can run unit tests using:\n\n```shell\npytest tests/\n```\n\n## 🤝 Contributing\nWe welcome contributions to `MetaSklearn`! If you have suggestions, improvements, or bug fixes, feel free to fork \nthe repository, create a pull request, or open an issue.\n\n\n## 📄 License\nThis project is licensed under the GPLv3 License. See the LICENSE file for more details.\n\n\n## Citation Request\nPlease include these citations if you plan to use this library:\n\n```bibtex\n@software{thieu20250510MetaSklearn,\n  author       = {Nguyen Van Thieu},\n  title        = {MetaSklearn: A Metaheuristic-Powered Hyperparameter Optimization Framework for Scikit-Learn Models},\n  month        = June,\n  year         = 2025,\n  doi         = {10.6084/m9.figshare.28978805},\n  url          = {https://github.com/thieu1995/MetaSklearn}\n}\n```\n\n## Official Links \n\n* Official source code repo: https://github.com/thieu1995/MetaSklearn\n* Official document: https://metasklearn.readthedocs.io/\n* Download releases: https://pypi.org/project/metasklearn/\n* Issue tracker: https://github.com/thieu1995/MetaSklearn/issues\n* Notable changes log: https://github.com/thieu1995/MetaSklearn/blob/master/ChangeLog.md\n* Official chat group: https://t.me/+fRVCJGuGJg1mNDg1\n\n---\n\nDeveloped by: [Thieu](mailto:nguyenthieu2102@gmail.com?Subject=MetaSklearn_QUESTIONS) @ 
2025\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fthieu1995%2Fmetasklearn","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fthieu1995%2Fmetasklearn","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fthieu1995%2Fmetasklearn/lists"}