Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.


https://github.com/jemus42/xplainfi

A very early and experimental bit of trying something out.


Host: GitHub
URL: https://github.com/jemus42/xplainfi
Owner: jemus42
License: other
Created: 2024-04-19T14:03:30.000Z (7 months ago)
Default Branch: main
Last Pushed: 2024-10-25T14:10:40.000Z (26 days ago)
Last Synced: 2024-10-25T16:45:39.826Z (26 days ago)
Language: R
Homepage: https://jemus42.github.io/xplainfi/
Size: 1.63 MB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 7
Metadata Files:
- Readme: README.Rmd
- Changelog: NEWS.md
- License: LICENSE

README

---
output: github_document
editor_options: 
  chunk_output_type: console
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)

# Quiet down
lgr::get_logger("mlr3")$set_threshold("warn")
set.seed(123)
```

# `xplainfi`

[![Lifecycle: experimental](https://img.shields.io/badge/lifecycle-experimental-orange.svg)](https://lifecycle.r-lib.org/articles/stages.html#experimental)

[![R-CMD-check](https://github.com/jemus42/xplainfi/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/jemus42/xplainfi/actions/workflows/R-CMD-check.yaml)

The goal of `xplainfi` is to collect common feature importance methods under a unified and extensible interface.  

For now, it is built specifically around [mlr3](https://mlr-org.com/), because its existing abstractions for learners, tasks, measures, etc. greatly simplify the implementation of importance measures.

## Installation

You can install the development version of `xplainfi` like so:

``` r
# install.packages("pak")
pak::pak("jemus42/xplainfi")
```

## Example: PFI

Here is a basic example of how to calculate permutation feature importance (PFI) for a given learner and task, using repeated cross-validation as the resampling strategy and repeating the permutation 5 times within each resampling iteration:

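```{r}
library(xplainfi)
library(mlr3)
library(mlr3learners)

# Task, learner, and performance measure as usual in mlr3
task = tsk("german_credit")
learner = lrn("classif.ranger", num.trees = 100)
measure = msr("classif.ce")

pfi = PFI$new(
  task = task,
  learner = learner,
  measure = measure,
  # 2 repeats of 3-fold cross-validation, i.e., 6 resampling iterations
  resampling = rsmp("repeated_cv", folds = 3, repeats = 2),
  # Number of permutations per resampling iteration
  iters_perm = 5
)
```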

Compute and print PFI scores:

```{r}
pfi$compute()
```

The aggregated scores remain available afterwards in `pfi$importance`.

When PFI is computed based on resampling with multiple iterations and/or multiple permutation iterations, the individual scores can be retrieved as a `data.table`:

```{r}
pfi$scores
```

Here, `iter_rsmp` identifies the resampling iteration (3 * 2 = 6 iterations for 2 repeats of 3-fold cross-validation) and `iter_perm` identifies the permutation iteration (5 per resampling iteration in this case).

While `pfi$importance` contains the means across all iterations, `pfi$scores` allows you to manually aggregate them in any way you see fit.
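As a rough illustration of such a manual aggregation, the sketch below builds a simulated stand-in for `pfi$scores` and summarises it with the median and standard deviation per feature. Only `iter_rsmp` and `iter_perm` are taken from the text above; the column names `feature` and `importance`, the three feature names, and all values are assumptions made for this example, so check the actual column names of `pfi$scores` before adapting it. Base R's `aggregate()` is used so that the example runs without extra packages.

```r
# Simulated stand-in for pfi$scores. The columns `feature` and `importance`
# and the random values are assumptions for illustration, not real output.
set.seed(123)
scores <- expand.grid(
  feature = c("age", "amount", "duration"),
  iter_rsmp = 1:6, # 2 repeats x 3 folds
  iter_perm = 1:5, # 5 permutations per resampling iteration
  stringsAsFactors = FALSE
)
scores$importance <- rnorm(nrow(scores), mean = 0.01, sd = 0.005)

# One row per feature, resampling iteration, and permutation iteration
nrow(scores) # 3 * 6 * 5 = 90

# Aggregate per feature with the median and standard deviation
# instead of the mean
agg <- aggregate(
  importance ~ feature,
  data = scores,
  FUN = function(x) c(median = median(x), sd = sd(x))
)
agg
```

Because a `data.table` is also a `data.frame`, the same `aggregate()` call should work on the real scores once the column names are adjusted.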

In the simplest case, you run PFI with a single resampling iteration (holdout) and a single permutation iteration, and `pfi$importance` will contain the same values as `pfi$scores`.

```{r}
pfi_single = PFI$new(
  task = task,
  learner = learner,
  measure = measure
)

pfi_single$compute()
pfi_single$scores
```
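For intuition about what these scores measure, the following is a minimal base R sketch of the general PFI idea: the importance of a feature is the loss after permuting that feature minus the baseline loss, averaged over several permutations. This is not `xplainfi`'s implementation. The helper `pfi_sketch()`, the use of `lm()` on `mtcars`, and the mean squared error loss are all illustrative assumptions.

```r
# Sketch of permutation feature importance (PFI) in base R.
# Assumption: `pfi_sketch` is a hypothetical helper, not part of xplainfi.
pfi_sketch <- function(model, data, target, loss, iters_perm = 5) {
  features <- setdiff(names(data), target)
  # Loss on the unmodified data serves as the reference point
  baseline <- loss(data[[target]], predict(model, newdata = data))

  vapply(features, function(feature) {
    perm_losses <- replicate(iters_perm, {
      permuted <- data
      # Shuffling one column breaks its association with the target
      permuted[[feature]] <- sample(permuted[[feature]])
      loss(data[[target]], predict(model, newdata = permuted))
    })
    # Importance: average increase in loss caused by the permutation
    mean(perm_losses) - baseline
  }, numeric(1))
}

mse <- function(truth, response) mean((truth - response)^2)

# Simple holdout split on a built-in dataset
set.seed(123)
cols <- c("mpg", "wt", "hp", "qsec")
train_idx <- sample(nrow(mtcars), 22)
train <- mtcars[train_idx, cols]
test <- mtcars[-train_idx, cols]

fit <- lm(mpg ~ ., data = train)
importance <- pfi_sketch(fit, test, target = "mpg", loss = mse)
sort(importance, decreasing = TRUE)
```

The single holdout split and single model fit here mirror the simplest case described above; repeating the procedure across resampling iterations would yield one set of scores per iteration.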