https://github.com/thieu1995/metasklearn
MetaSklearn: A Metaheuristic-Powered Hyperparameter Optimization Framework for Scikit-Learn Models.
https://github.com/thieu1995/metasklearn
adaboost bayesian-optimization decision-tree gridsearchcv hyperparameter-optimization hyperparameter-tuning knn metaheuristic-search nature-inspired-algorithms random-forest randomized-search scikit-compatible scikit-learn svm xgboost
Last synced: 4 months ago
JSON representation
MetaSklearn: A Metaheuristic-Powered Hyperparameter Optimization Framework for Scikit-Learn Models.
- Host: GitHub
- URL: https://github.com/thieu1995/metasklearn
- Owner: thieu1995
- License: gpl-3.0
- Created: 2025-05-07T16:59:15.000Z (5 months ago)
- Default Branch: main
- Last Pushed: 2025-06-04T16:31:52.000Z (4 months ago)
- Last Synced: 2025-06-25T09:09:18.815Z (4 months ago)
- Topics: adaboost, bayesian-optimization, decision-tree, gridsearchcv, hyperparameter-optimization, hyperparameter-tuning, knn, metaheuristic-search, nature-inspired-algorithms, random-forest, randomized-search, scikit-compatible, scikit-learn, svm, xgboost
- Language: Python
- Homepage: https://metasklearn.readthedocs.io/
- Size: 137 KB
- Stars: 3
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- Changelog: ChangeLog.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
- Citation: CITATION.cff
Awesome Lists containing this project
README
# MetaSklearn: A Metaheuristic-Powered Hyperparameter Optimization Framework for Scikit-Learn Models.
[](https://github.com/thieu1995/MetaSklearn/releases)
[](https://badge.fury.io/py/metasklearn)


[](https://pepy.tech/project/metasklearn)
[](https://github.com/thieu1995/MetaSklearn/actions/workflows/publish-package.yml)
[](https://metasklearn.readthedocs.io/en/latest/?badge=latest)
[](https://t.me/+fRVCJGuGJg1mNDg1)
[](https://doi.org/10.6084/m9.figshare.28978805)
[](https://www.gnu.org/licenses/gpl-3.0)---
## π Overview
**MetaSklearn** is a flexible and extensible Python library that brings metaheuristic optimization to
hyperparameter tuning of `scikit-learn` models. It provides a seamless interface to optimize hyperparameters
using nature-inspired algorithms from the [Mealpy](https://github.com/thieu1995/mealpy) library.
It is designed to be user-friendly and efficient, making it easy to integrate into your machine learning workflow.## π Features
- β Hyperparameter optimization by **metaheuristic algorithms** with [`mealpy`](https://github.com/thieu1995/mealpy).
- β Compatible with any **scikit-learn** model (SVM, RandomForest, XGBoost, etc.)
- β Supports **classification** and **regression** tasks
- β Custom and scikit-learn scoring support
- β Integration with [`PerMetrics`](https://github.com/thieu1995/permetrics) for rich evaluation metrics
- β Scikit-learn compatible API: `.fit()`, `.predict()`, `.score()`## π¦ Installation
Install the latest version using pip:
```bash
pip install metasklearn
```After that, check the version to ensure successful installation:
```sh
$ python
>>> import metasklearn
>>> metasklearn.__version__
```## π§ How It Works
`MetaSklearn` defines a custom `MetaSearchCV` class that wraps your model and performs hyperparameter tuning using
any optimizer supported by Mealpy. The framework evaluates model performance using either
scikit-learnβs metrics or additional ones from `PerMetrics` library.## π Quick Start
#### π Example with SVM model for regression task
```python
from sklearn.svm import SVR
from sklearn.datasets import load_diabetes
from metasklearn import MetaSearchCV, FloatVar, StringVar, Data## Load data object
X, y = load_diabetes(return_X_y=True)
data = Data(X, y)## Split train and test
data.split_train_test(test_size=0.2, random_state=42, inplace=True)
print(data.X_train.shape, data.X_test.shape)## Scaling dataset
data.X_train, scaler_X = data.scale(data.X_train, scaling_methods=("standard", "minmax"))
data.X_test = scaler_X.transform(data.X_test)data.y_train, scaler_y = data.scale(data.y_train, scaling_methods=("standard", "minmax"))
data.y_train = data.y_train.ravel()
data.y_test = scaler_y.transform(data.y_test.reshape(-1, 1)).ravel()# Define param bounds for SVC
# param_bounds = { ==> This is for GridSearchCV, show you how to convert to our MetaSearchCV
# "C": [0.1, 100],
# "gamma": [1e-4, 1],
# "kernel": ["linear", "rbf", "poly"]
# }param_bounds = [
FloatVar(lb=0., ub=100., name="C"),
FloatVar(lb=1e-4, ub=1., name="gamma"),
StringVar(valid_sets=("linear", "rbf", "poly"), name="kernel")
]# Initialize and fit MetaSearchCV
searcher = MetaSearchCV(
estimator=SVR(),
param_bounds=param_bounds,
task_type="regression",
optim="BaseGA",
optim_params={"epoch": 20, "pop_size": 30, "name": "GA"},
cv=3,
scoring="MSE", # or any custom scoring like "F1_macro"
seed=42,
n_jobs=2,
verbose=True,
mode='single', n_workers=None, termination=None
)searcher.fit(data.X_train, data.y_train)
print("Best parameters (Classification):", searcher.best_params)
print("Best model: ", searcher.best_estimator)
print("Best score during searching: ", searcher.best_score)# Make prediction after re-fit
y_pred = searcher.predict(data.X_test)
print("Test Accuracy:", searcher.score(data.X_test, data.y_test))
print("Test Score: ", searcher.scores(data.X_test, data.y_test, list_metrics=("RMSE", "R", "KGE", "NNSE")))
```#### π Example with SVM model for classification task
```python
from sklearn.svm import SVC
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from metasklearn import MetaSearchCV, FloatVar, StringVar# Load dataset
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)# Define param bounds for SVC
# param_bounds = { ==> This is for GridSearchCV, show you how to convert to our MetaSearchCV
# "C": [0.1, 100],
# "gamma": [1e-4, 1],
# "kernel": ["linear", "rbf", "poly"]
# }param_bounds = [
FloatVar(lb=0., ub=100., name="C"),
FloatVar(lb=1e-4, ub=1., name="gamma"),
StringVar(valid_sets=("linear", "rbf", "poly"), name="kernel")
]# Initialize and fit MetaSearchCV
searcher = MetaSearchCV(
estimator=SVC(),
param_bounds=param_bounds,
task_type="classification",
optim="BaseGA",
optim_params={"epoch": 20, "pop_size": 30, "name": "GA"},
cv=3,
scoring="AS", # or any custom scoring like "F1_macro"
seed=42,
n_jobs=2,
verbose=True,
mode='single', n_workers=None, termination=None
)searcher.fit(X_train, y_train)
print("Best parameters (Classification):", searcher.best_params)
print("Best model: ", searcher.best_estimator)
print("Best score during searching: ", searcher.best_score)# Make prediction after re-fit
y_pred = searcher.predict(X_test)
print("Test Accuracy:", searcher.score(X_test, y_test))
print("Test Score: ", searcher.scores(X_test, y_test, list_metrics=("AS", "RS", "PS", "F1S")))
```As can be seen, you do it like any other model from Scikit-Learn library such as Random Forest, Decision Tree, XGBoost,...
## π Parameters - Variable Types in MetaSearchCV. How to choose them?
This section explains how to use different types of variables from the `MetaSearchCV` library when defining hyperparameter
search spaces. Each variable type is suitable for different kinds of optimization parameters.#### 1. `IntegerVar` β Integer Variable
```python
from metasklearn import IntegerVarvar = IntegerVar(lb=1, ub=100, name="n_estimators")
```
Used for discrete numerical parameters like number of neighbors in KNN, number of estimators in ensembles, etc.#### 2. `FloatVar` β Float/Continuous Variable
```python
from metasklearn import FloatVarvar = FloatVar(lb=0.001, ub=1.0, name="learning_rate")
```
Used for continuous numerical parameters such as `learning_rate`, `C`, `gamma`, etc.#### 3. `StringVar` β Categorical/String Variable
```python
from metasklearn import StringVarvar = StringVar(valid_sets=("linear", "poly", "rbf"), name="kernel")
```
Used for string parameters with a limited number of choices, e.g., `kernel` in SVM. Value None can be set also.#### 4. `BinaryVar` β Binary Variable (0 or 1)
```python
from metasklearn import BinaryVarvar = BinaryVar(n_vars=1, name="feature_selected")
```
Used in binary feature selection problems or any 0/1-based decision.#### 5. `BoolVar` β Boolean Variable (True or False)
```python
from metasklearn import BoolVarvar = BoolVar(n_vars=1, name="use_bias")
```
Used for Boolean-type arguments such as `fit_intercept`, `use_bias`, etc.#### 6. `CategoricalVar` - A set of mixed discrete variables such as int, float, string, None
```python
from metasklearn import CategoricalVarvar = CategoricalVar(valid_sets=((3., None, "alpha"), (5, 12, 32), ("auto", "exp", "sin")), name="categorical")
```This type of variable is useful when a hyperparameter can take on a predefined set of mixed values,
such as: Mixed types of parameters in optimization tasks (int, string, bool, float,...).#### 7. `SequenceVar` - Variables as tuple, list, or set
```python
from metasklearn import SequenceVarvar = SequenceVar(valid_sets=((10, ), (20, 15), (30, 10, 5)), return_type=list, name="hidden_layer_sizes")
```This type of variable is useful for defining hyperparameters that represent sequences, such as the sizes of hidden layers in a neural network.
#### 8. `PermutationVar` β Permutation Variable
```python
from metasklearn import PermutationVarvar = PermutationVar(valid_set=(1, 2, 5, 10), name="job_order")
```
Used for optimization problems involving permutations, like scheduling or routing.#### 9. `TransferBinaryVar` β Transfer Binary Variable
```python
from metasklearn import TransferBinaryVarvar = TransferBinaryVar(n_vars=1, tf_func="vstf_01", lb=-8., ub=8., all_zeros=True, name="transfer_binary")
```
Used in binary search spaces that support transformation-based metaheuristics.#### 10. `TransferBoolVar` β Transfer Boolean Variable
```python
from metasklearn import TransferBoolVarvar = TransferBoolVar(n_vars=1, tf_func="vstf_01", lb=-8., ub=8., name="transfer_bool")
```
Used in Boolean search spaces with transferable logic between states.#### π§ Example: Define a Mixed Search Space
```python
from metasklearn import (IntegerVar, FloatVar, StringVar, BinaryVar, BoolVar,
PermutationVar, CategoricalVar, SequenceVar, TransferBinaryVar, TransferBoolVar)param_bounds = [
IntegerVar(lb=1, ub=20, name="n_neighbors"),
FloatVar(lb=0.001, ub=1.0, name="alpha"),
StringVar(valid_sets=["uniform", "distance"], name="weights"),
BinaryVar(name="use_feature"),
BoolVar(name="fit_bias"),
PermutationVar(valid_set=(1, 2, 5, 10), name="job_order"),
CategoricalVar(valid_sets=[0.1, "relu", False, None, 3], name="activation_choice"),
SequenceVar(valid_sets=((10,), (20, 10), (30, 50, 5)), name="mixed_choice"),
TransferBinaryVar(name="bin_transfer"),
TransferBoolVar(name="bool_transfer")
]
```
Use this format when designing hyperparameter spaces for advanced models in `MetaSearchCV`.## β Supported Optimizers
`MetaSklearn` integrates all metaheuristic algorithms from Mealpy, including:
+ AOA (Arithmetic Optimization Algorithm)
+ GWO (Grey Wolf Optimizer)
+ PSO (Particle Swarm Optimization)
+ DE (Differential Evolution)
+ WOA, SSA, MVO, and many more...You can pass any optimizer name or an instantiated optimizer object to MetaSearchCV. For more details, please refer to the [link](https://mealpy.readthedocs.io/en/latest/pages/support.html#classification-table)
## π Custom Metrics
You can use custom scoring functions from:+ sklearn.metrics.get_scorer_names()
+ permetrics.RegressionMetric and ClassificationMetric
For details on `PerMetrics` library, please refer to the [link](https://permetrics.readthedocs.io/en/latest/pages/support.html#all-performance-metrics)
## π Documentation
Documentation is available at: π https://metasklearn.readthedocs.io
You can build the documentation locally:
```shell
cd docs
make html
```## π§ͺ Testing
You can run unit tests using:```shell
pytest tests/
```## π€ Contributing
We welcome contributions to `MetaSklearn`! If you have suggestions, improvements, or bug fixes, feel free to fork
the repository, create a pull request, or open an issue.## π License
This project is licensed under the GPLv3 License. See the LICENSE file for more details.## Citation Request
Please include these citations if you plan to use this library:```bibtex
@software{thieu20250510MetaSklearn,
author = {Nguyen Van Thieu},
title = {MetaSklearn: A Metaheuristic-Powered Hyperparameter Optimization Framework for Scikit-Learn Models},
month = June,
year = 2025,
doi = {10.6084/m9.figshare.28978805},
url = {https://github.com/thieu1995/MetaSklearn}
}
```## Official Links
* Official source code repo: https://github.com/thieu1995/MetaSklearn
* Official document: https://metasklearn.readthedocs.io/
* Download releases: https://pypi.org/project/metasklearn/
* Issue tracker: https://github.com/thieu1995/MetaSklearn/issues
* Notable changes log: https://github.com/thieu1995/MetaSklearn/blob/master/ChangeLog.md
* Official chat group: https://t.me/+fRVCJGuGJg1mNDg1---
Developed by: [Thieu](mailto:nguyenthieu2102@gmail.com?Subject=MetaSklearn_QUESTIONS) @ 2025