https://github.com/mlr-org/mlr3fselect

Feature selection package of the mlr3 ecosystem.
https://github.com/mlr-org/mlr3fselect

evolutionary-algorithms exhaustive-search feature-selection machine-learning mlr3 optimization r r-package random-search recursive-feature-elimination sequential-feature-selection

Last synced: 3 months ago
JSON representation

Feature selection package of the mlr3 ecosystem.

Host: GitHub
URL: https://github.com/mlr-org/mlr3fselect
Owner: mlr-org
License: lgpl-3.0
Created: 2019-07-24T11:55:21.000Z (almost 6 years ago)
Default Branch: main
Last Pushed: 2024-05-21T15:14:13.000Z (about 1 year ago)
Last Synced: 2024-05-22T11:35:37.359Z (about 1 year ago)
Topics: evolutionary-algorithms, exhaustive-search, feature-selection, machine-learning, mlr3, optimization, r, r-package, random-search, recursive-feature-elimination, sequential-feature-selection
Language: R
Homepage: https://mlr3fselect.mlr-org.com/
Size: 11.7 MB
Stars: 19
Watchers: 7
Forks: 4
Open Issues: 2
Metadata Files:
- Readme: README.Rmd
- License: LICENSE

Awesome Lists containing this project

README

        ---

output: github_document

---

```{r, include = FALSE}

lgr::get_logger("mlr3")$set_threshold("warn")

lgr::get_logger("bbotk")$set_threshold("warn")

set.seed(0)

options(

  datatable.print.nrows = 10,

  datatable.print.class = FALSE,

  datatable.print.keys = FALSE,

  datatable.print.trunc.cols = TRUE,

  width = 100)

# mute load messages

library("mlr3fselect")

```

# mlr3fselect 

Package website: [release](https://mlr3fselect.mlr-org.com/) | [dev](https://mlr3fselect.mlr-org.com/dev/)

[![r-cmd-check](https://github.com/mlr-org/mlr3fselect/actions/workflows/r-cmd-check.yml/badge.svg)](https://github.com/mlr-org/mlr3fselect/actions/workflows/r-cmd-check.yml)

[![CRAN Status](https://www.r-pkg.org/badges/version/mlr3fselect)](https://cran.r-project.org/package=mlr3fselect)

[![StackOverflow](https://img.shields.io/badge/stackoverflow-mlr3-orange.svg)](https://stackoverflow.com/questions/tagged/mlr3)

[![Mattermost](https://img.shields.io/badge/chat-mattermost-orange.svg)](https://lmmisld-lmu-stats-slds.srv.mwn.de/mlr_invite/)

*mlr3fselect* is the feature selection package of the [mlr3](https://mlr-org.com/) ecosystem.

It selects the optimal feature set for any mlr3 [learner](https://github.com/mlr-org/mlr3learners).

The package works with several optimization algorithms e.g. Random Search, Recursive Feature Elimination, and Genetic Search.

Moreover, it can automatically optimize learners and estimate the performance of optimized feature sets with [nested resampling](https://mlr3book.mlr-org.com/chapters/chapter6/feature_selection.html#sec-autofselect).

The package is built on the optimization framework [bbotk](https://github.com/mlr-org/bbotk).

## Resources

There are several section about feature selection in the [mlr3book](https://mlr3book.mlr-org.com).

* Getting started with [wrapper feature selection](https://mlr3book.mlr-org.com/chapters/chapter6/feature_selection.html#sec-fs-wrapper).

* Do a [sequential forward selection](https://mlr3book.mlr-org.com/chapters/chapter6/feature_selection.html#sec-fs-wrapper-example) Palmer Penguins data set.

* Optimize [multiple performance measures](https://mlr3book.mlr-org.com/chapters/chapter6/feature_selection.html#sec-multicrit-featsel).

* Estimate Model Performance with [nested resampling](https://mlr3book.mlr-org.com/chapters/chapter6/feature_selection.html#sec-autofselect).

The [gallery](https://mlr-org.com/gallery.html) features a collection of case studies and demos about optimization.

* Utilize the built-in feature importance of models with [Recursive Feature Elimination](https://mlr-org.com/gallery/optimization/2023-02-07-recursive-feature-elimination/).

* Run a feature selection with [Shadow Variable Search](https://mlr-org.com/gallery/optimization/2023-02-01-shadow-variable-search/).

The [cheatsheet](https://cheatsheets.mlr-org.com/mlr3fselect.pdf) summarizes the most important functions of mlr3fselect.

## Installation

Install the last release from CRAN:

```{r eval = FALSE}

install.packages("mlr3fselect")

```

Install the development version from GitHub:

```{r eval = FALSE}

remotes::install_github("mlr-org/mlr3fselect")

```

## Example

We run a feature selection for a support vector machine on the [Spam](https://mlr3.mlr-org.com/reference/mlr_tasks_spam.html) data set.

```{r}

library("mlr3verse")

tsk("spam")

```

We construct an instance with the `fsi()` function.

The instance describes the optimization problem.

```{r}

instance = fsi(

  task = tsk("spam"),

  learner = lrn("classif.svm", type = "C-classification"),

  resampling = rsmp("cv", folds = 3),

  measures = msr("classif.ce"),

  terminator = trm("evals", n_evals = 20)

)

instance

```

We select a simple random search as the optimization algorithm.

```{r}

fselector = fs("random_search", batch_size = 5)

fselector

```

To start the feature selection, we simply pass the instance to the fselector.

```{r, results='hide'}

fselector$optimize(instance)

```

The fselector writes the best hyperparameter configuration to the instance.

```{r}

instance$result_feature_set

```

And the corresponding measured performance.

```{r}

instance$result_y

```

The archive contains all evaluated hyperparameter configurations.

```{r}

as.data.table(instance$archive)

```

We fit a final model with the optimized feature set to make predictions on new data.

```{r}

task = tsk("spam")

learner = lrn("classif.svm", type = "C-classification")

task$select(instance$result_feature_set)

learner$train(task)

```

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/mlr-org/mlr3fselect

Awesome Lists containing this project

README