Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/eurobios-mews-labs/active-bagging-learning
This library proposes a plug-in approach to active learning utilizing bagging techniques. Bagging, or bootstrap aggregating, is an ensemble learning method designed to improve the stability and accuracy of machine learning algorithms.
active-learning bagging latin-square machine-learning ml surrogate-models uncertainty-sampling
- Host: GitHub
- URL: https://github.com/eurobios-mews-labs/active-bagging-learning
- Owner: eurobios-mews-labs
- License: apache-2.0
- Created: 2024-07-01T13:34:27.000Z (6 months ago)
- Default Branch: main
- Last Pushed: 2024-07-30T09:41:33.000Z (5 months ago)
- Last Synced: 2024-07-31T11:17:56.054Z (5 months ago)
- Topics: active-learning, bagging, latin-square, machine-learning, ml, surrogate-models, uncertainty-sampling
- Language: Python
- Homepage:
- Size: 3.08 MB
- Stars: 0
- Watchers: 1
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
## Active Strategy for surface response estimation
[![License](https://img.shields.io/badge/license-apache_2.0-blue.svg)]( https://github.com/eurobios-mews-labs/active-bagging-learning/blob/master/LICENSE)
![cov](https://github.com/eurobios-mews-labs/active-bagging-learning/blob/coverage-badge/coverage.svg)
[![Maintenance](https://img.shields.io/badge/maintained%3F-yes-green.svg)](https://GitHub.com/eurobios-mews-labs/active-bagging-learning/graphs/commit-activity)

This library proposes a plug-in approach to active learning utilizing bagging techniques.
Bagging, or bootstrap aggregating, is an ensemble learning method designed to improve
the stability and accuracy of machine learning algorithms. By leveraging bagging,
we aim to enhance the efficiency of active learning strategies in approximating the target function $`f`$.
* The goal is to approximate a function $`f : \mathcal{X} \rightarrow \mathbb{R}^n`$.
* **Objective:** find an estimate $`\hat{f}`$ of $`f`$ in a family of measurable functions $`\mathcal{F}`$ that minimizes the approximation error: $` f^* = \underset{\hat{f} \in \mathcal{F}}{\text{argmin}} \|f - \hat{f} \| `$
* At time $`t`$, a set of $`n`$ evaluations $`(x_i, f(x_i))_{i\leqslant n}`$ is available.
* Any feasible point in the domain $`\mathcal{X}`$ can be sampled.
* This tool enables users to query new points based on an uncertainty measure.

### Installation
```shell
python -m pip install git+https://github.com/eurobios-mews-labs/active-bagging-learning.git
```

### Basic usage
```python
import numpy as np
import pandas as pd
from sklearn.ensemble import ExtraTreesRegressor

from active_learning import ActiveSurfaceLearner
from active_learning.components.active_criterion import VarianceEnsembleMethod
from active_learning.components.query_strategies import ServiceQueryVariancePDF
from active_learning.benchmark import functions

fun = functions.grammacy_lee_2009  # The function we want to learn
bounds = np.array(functions.bounds[fun])  # [x1 bounds, x2 bounds]
n = 50
X_train = pd.DataFrame(
    {'x1': (bounds[0, 0] - bounds[0, 1]) * np.random.rand(n) + bounds[0, 1],
     'x2': (bounds[1, 0] - bounds[1, 1]) * np.random.rand(n) + bounds[1, 1],
     })  # Initial sample, drawn uniformly within the bounds
y_train = -fun(X_train)

active_criterion = VarianceEnsembleMethod(  # Parameters used to estimate the response surface
    estimator=ExtraTreesRegressor(  # Base estimator for the surface
        max_features=0.8, bootstrap=True)
)
query_strategy = ServiceQueryVariancePDF(bounds, num_eval=int(20000))

# QUERY NEW POINTS
active_learner = ActiveSurfaceLearner(
    active_criterion,  # Active criterion yields a surface
    query_strategy,  # Given the active criterion surface, execute the query
    bounds=bounds)

active_learner.fit(
    X_train,  # Input data X
    y_train)  # Input data y (target)

X_new = active_learner.query(3)  # Request 3 points
```
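The class names `VarianceEnsembleMethod` and `ServiceQueryVariancePDF` suggest a criterion based on the variance across ensemble members, used as a sampling density. One plausible formalization is given below as an illustration only: it is not taken from the library's source, the symbols $`B`$, $`\hat{f}_b`$, $`\bar{f}_B`$, $`\hat{\sigma}^2`$ and $`p`$ are assumptions introduced here, and a scalar output is assumed for simplicity.

```math
\bar{f}_B(x) = \frac{1}{B} \sum_{b=1}^{B} \hat{f}_b(x), \qquad
\hat{\sigma}^2(x) = \frac{1}{B} \sum_{b=1}^{B} \left( \hat{f}_b(x) - \bar{f}_B(x) \right)^2, \qquad
p(x) = \frac{\hat{\sigma}^2(x)}{\int_{\mathcal{X}} \hat{\sigma}^2(u) \, du}
```

Here $`\hat{f}_1, \dots, \hat{f}_B`$ are the estimators fitted on bootstrap resamples of the training set, $`\bar{f}_B`$ is the bagged prediction, and new points would be drawn from $`p`$, so that regions where the ensemble members disagree are sampled more often.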
To use this approach, you need:

1. An estimator (a set of functions) used to fit the surface (linear model, Gaussian vectors, etc.), exposed through sklearn's API (`base_estimator` parameter).
2. A surface describing an active learning criterion, which fits the estimator and estimates its variance in some way (`active_criterion` component).
3. A resampling strategy that takes a function (the active criterion surface) and turns it into a query (`query_strategy` component).

### Examples
* 1D example :
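The following is a minimal 1D sketch of the three components described above (estimator, variance-based criterion, query strategy), written with plain scikit-learn and NumPy rather than this library's API. The `target` function and the helpers `ensemble_variance` and `query` are hypothetical names chosen for illustration, and sampling candidates with probability proportional to the tree variance is an assumption about how such a strategy could work, not a description of the library's internals.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor


def target(x):
    """Hypothetical 1D function to learn (not part of the library's benchmark)."""
    return np.sin(3 * x) + 0.5 * x


def ensemble_variance(model, x):
    """Variance of the individual trees' predictions at each row of x."""
    per_tree = np.stack([tree.predict(x) for tree in model.estimators_])
    return per_tree.var(axis=0)


def query(model, bounds, n_points, rng, num_eval=2000):
    """Draw n_points candidates with probability proportional to the variance."""
    candidates = rng.uniform(bounds[0], bounds[1], size=(num_eval, 1))
    # A tiny offset keeps every weight positive, even where all trees agree.
    weights = ensemble_variance(model, candidates) + 1e-12
    weights /= weights.sum()
    chosen = rng.choice(num_eval, size=n_points, replace=False, p=weights)
    return candidates[chosen]


rng = np.random.default_rng(0)
bounds = (0.0, 2.0)

# Initial design: 10 points drawn uniformly in the domain.
x_train = rng.uniform(bounds[0], bounds[1], size=(10, 1))
y_train = target(x_train).ravel()

model = ExtraTreesRegressor(n_estimators=50, bootstrap=True, random_state=0)
for _ in range(5):
    model.fit(x_train, y_train)                        # 1. fit the estimator
    x_new = query(model, bounds, n_points=3, rng=rng)  # 2-3. criterion + query
    x_train = np.vstack([x_train, x_new])              # evaluate and append
    y_train = np.concatenate([y_train, target(x_new).ravel()])

model.fit(x_train, y_train)  # final fit on all collected evaluations
```

Each iteration adds three evaluations where the trees disagree most often, so the training set grows from 10 to 25 points. Replacing `query` with uniform sampling gives the passive baseline to compare against.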
### Benchmark
![benchmark](.public/active_vs_passive.png)