Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/modeloriented/modelstudio
Interactive Studio for Explanatory Model Analysis
ai explainable explainable-ai explainable-machine-learning explanatory-model-analysis human iml interactive interactivity interpretability interpretable interpretable-machine-learning learning machine model model-visualization r visualization xai
Last synced: 4 days ago
JSON representation
Interactive Studio for Explanatory Model Analysis
- Host: GitHub
- URL: https://github.com/modeloriented/modelstudio
- Owner: ModelOriented
- License: gpl-3.0
- Created: 2019-07-10T07:59:07.000Z (over 5 years ago)
- Default Branch: master
- Last Pushed: 2023-08-31T11:00:20.000Z (over 1 year ago)
- Last Synced: 2024-10-30T06:54:57.270Z (3 months ago)
- Topics: ai, explainable, explainable-ai, explainable-machine-learning, explanatory-model-analysis, human, iml, interactive, interactivity, interpretability, interpretable, interpretable-machine-learning, learning, machine, model, model-visualization, r, visualization, xai
- Language: R
- Homepage: https://doi.org/10.1007/s10618-023-00924-w
- Size: 36.2 MB
- Stars: 325
- Watchers: 21
- Forks: 32
- Open Issues: 4
Metadata Files:
- Readme: README.md
- Contributing: .github/CONTRIBUTING.md
- Funding: .github/FUNDING.yml
- License: LICENSE
Awesome Lists containing this project
README
# Interactive Studio for Explanatory Model Analysis
[![CRAN_Status_Badge](http://www.r-pkg.org/badges/version/modelStudio)](https://cran.r-project.org/package=modelStudio)
[![R build status](https://github.com/ModelOriented/modelStudio/workflows/R-CMD-check/badge.svg)](https://github.com/ModelOriented/modelStudio/actions?query=workflow%3AR-CMD-check)
[![Codecov test coverage](https://codecov.io/gh/ModelOriented/modelStudio/branch/master/graph/badge.svg)](https://app.codecov.io/gh/ModelOriented/modelStudio?branch=master)
[![](https://cranlogs.r-pkg.org/badges/grand-total/modelStudio)](https://cranlogs.r-pkg.org/badges/grand-total/modelStudio)
[![JOSS-status](https://joss.theoj.org/papers/9eec8c9d1969fbd44b3ea438a74af911/status.svg)](https://joss.theoj.org/papers/9eec8c9d1969fbd44b3ea438a74af911)

## Overview
The `modelStudio` package **automates the explanatory analysis of machine learning predictive models**. It generates advanced interactive model explanations in the form of a **serverless HTML site** with only one line of code. The tool is model-agnostic and therefore compatible with most black-box predictive models and frameworks (e.g., `mlr/mlr3`, `xgboost`, `caret`, `h2o`, `parsnip`, `tidymodels`, `scikit-learn`, `lightgbm`, `keras/tensorflow`).
The main `modelStudio()` function computes various (instance and model-level) explanations and produces a **customisable dashboard**, which consists of multiple panels for plots with their short descriptions. It is possible to easily **save** the dashboard and **share** it with others. Tools for [Explanatory Model Analysis](https://ema.drwhy.ai) unite with tools for Exploratory Data Analysis to give a broad overview of the model behavior.
explain COVID-19 →
R & Python examples →
More resources →
Interactive EMA →

[![](man/figures/demo_small.gif)](https://modelstudio.drwhy.ai/demo.html)
The `modelStudio` package is a part of the [**DrWhy.AI**](http://drwhy.ai) universe.
## Installation
```r
# Install from CRAN:
install.packages("modelStudio")

# Install the development version from GitHub:
devtools::install_github("ModelOriented/modelStudio")
```

## Simple demo
```r
library("DALEX")
library("ranger")
library("modelStudio")

# fit a model
model <- ranger(score ~., data = happiness_train)

# create an explainer for the model
explainer <- explain(model,
                     data = happiness_test,
                     y = happiness_test$score,
                     label = "Random Forest")

# make a studio for the model
modelStudio(explainer)
```

[Save the output](https://modelstudio.drwhy.ai/#save--share) as an HTML file - [**Demo Dashboard**](https://modelstudio.drwhy.ai/demo.html).
[![](man/figures/demo_big.gif)](https://modelstudio.drwhy.ai/demo.html)
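The dashboard produced above can be tuned directly from the `modelStudio()` call. A minimal sketch continuing the simple demo, assuming the `new_observation`, `facet_dim`, `N`, `B`, and `options` arguments documented in `?modelStudio` (check the documentation of your installed version):

```r
# continue the simple demo: a few knobs that shape the dashboard
library("modelStudio")

modelStudio(explainer,
            new_observation = happiness_test[1:3, ], # observations to explain
            facet_dim = c(2, 3),     # grid of panels shown at once
            N = 200, B = 10,         # sample size and permutation rounds (speed vs. precision)
            options = ms_options())  # appearance settings, see ?ms_options
```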
## R & Python examples [more](https://modelstudio.drwhy.ai/articles/ms-r-python-examples.html)
-------------------------------
The `modelStudio()` function uses `DALEX` explainers created with `DALEX::explain()` or `DALEXtra::explain_*()`.
```r
# packages for the explainer objects
install.packages("DALEX")
install.packages("DALEXtra")
```

### mlr [dashboard](https://modelstudio.drwhy.ai/mlr.html)
Make a studio for the regression `ranger` model on the `apartments` data.
```r
# load packages and data
library(mlr)
library(DALEXtra)
library(modelStudio)

data <- DALEX::apartments

# split the data
index <- sample(1:nrow(data), 0.7*nrow(data))
train <- data[index,]
test <- data[-index,]

# fit a model
task <- makeRegrTask(id = "apartments", data = train, target = "m2.price")
learner <- makeLearner("regr.ranger", predict.type = "response")
model <- train(learner, task)

# create an explainer for the model
explainer <- explain_mlr(model,
                         data = test,
                         y = test$m2.price,
                         label = "mlr")

# pick observations
new_observation <- test[1:2,]
rownames(new_observation) <- c("id1", "id2")

# make a studio for the model
modelStudio(explainer, new_observation)
```

### xgboost [dashboard](https://modelstudio.drwhy.ai/xgboost.html)
Make a studio for the classification `xgboost` model on the `titanic` data.
```r
# load packages and data
library(xgboost)
library(DALEX)
library(modelStudio)

data <- DALEX::titanic_imputed

# split the data
index <- sample(1:nrow(data), 0.7*nrow(data))
train <- data[index,]
test <- data[-index,]

train_matrix <- model.matrix(survived ~.-1, train)
test_matrix <- model.matrix(survived ~.-1, test)

# fit a model
xgb_matrix <- xgb.DMatrix(train_matrix, label = train$survived)
params <- list(max_depth = 3, objective = "binary:logistic", eval_metric = "auc")
model <- xgb.train(params, xgb_matrix, nrounds = 500)

# create an explainer for the model
explainer <- explain(model,
                     data = test_matrix,
                     y = test$survived,
                     type = "classification",
                     label = "xgboost")

# pick observations
new_observation <- test_matrix[1:2, , drop=FALSE]
rownames(new_observation) <- c("id1", "id2")

# make a studio for the model
modelStudio(explainer, new_observation)
```

-------------------------
The `modelStudio()` function uses `dalex` explainers created with `dalex.Explainer()`.
```console
:: package for the Explainer object
pip install dalex -U
```

Use the `pickle` Python module and the `reticulate` R package to easily make a studio for a model.
```r
# package for pickle load
install.packages("reticulate")
```

### scikit-learn [dashboard](https://modelstudio.drwhy.ai/scikitlearn.html)
Make a studio for the regression `Pipeline SVR` model on the `fifa` data.
First, use `dalex` in Python:
```python
# load packages and data
import dalex as dx
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR
from numpy import log

data = dx.datasets.load_fifa()
X = data.drop(columns=['overall', 'potential', 'value_eur', 'wage_eur', 'nationality'], axis=1)
y = log(data.value_eur)

# split the data
X_train, X_test, y_train, y_test = train_test_split(X, y)

# fit a pipeline model
model = Pipeline([('scale', StandardScaler()), ('svm', SVR())])
model.fit(X_train, y_train)

# create an explainer for the model
explainer = dx.Explainer(model, data=X_test, y=y_test, label='scikit-learn')

# pack the explainer into a pickle file
explainer.dump(open('explainer_scikitlearn.pickle', 'wb'))
```

Then, use `modelStudio` in R:
```r
# load the explainer from the pickle file
library(reticulate)
explainer <- py_load_object("explainer_scikitlearn.pickle", pickle = "pickle")

# make a studio for the model
library(modelStudio)
modelStudio(explainer, B = 5)  # B: number of permutation rounds (lower is faster)
```

### lightgbm [dashboard](https://modelstudio.drwhy.ai/lightgbm.html)
Make a studio for the classification `Pipeline LGBMClassifier` model on the `titanic` data.
First, use `dalex` in Python:
```python
# load packages and data
import dalex as dx
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.impute import SimpleImputer
from sklearn.compose import ColumnTransformer
from lightgbm import LGBMClassifier

data = dx.datasets.load_titanic()
X = data.drop(columns='survived')
y = data.survived

# split the data
X_train, X_test, y_train, y_test = train_test_split(X, y)

# fit a pipeline model
numerical_features = ['age', 'fare', 'sibsp', 'parch']
numerical_transformer = Pipeline(
    steps=[
        ('imputer', SimpleImputer(strategy='median')),
        ('scaler', StandardScaler())
    ]
)
categorical_features = ['gender', 'class', 'embarked']
categorical_transformer = Pipeline(
    steps=[
        ('imputer', SimpleImputer(strategy='constant', fill_value='missing')),
        ('onehot', OneHotEncoder(handle_unknown='ignore'))
    ]
)

preprocessor = ColumnTransformer(
    transformers=[
        ('num', numerical_transformer, numerical_features),
        ('cat', categorical_transformer, categorical_features)
    ]
)

classifier = LGBMClassifier(n_estimators=300)
model = Pipeline(
    steps=[
        ('preprocessor', preprocessor),
        ('classifier', classifier)
    ]
)
model.fit(X_train, y_train)

# create an explainer for the model
explainer = dx.Explainer(model, data=X_test, y=y_test, label='lightgbm')

# pack the explainer into a pickle file
explainer.dump(open('explainer_lightgbm.pickle', 'wb'))
```

Then, use `modelStudio` in R:
```r
# load the explainer from the pickle file
library(reticulate)
explainer <- py_load_object("explainer_lightgbm.pickle", pickle = "pickle")

# make a studio for the model
library(modelStudio)
modelStudio(explainer)
```

-------------------------------
## Save & share
Save the `modelStudio` dashboard as an HTML file using the buttons at the top of the RStudio Viewer
or with [`r2d3::save_d3_html()`](https://rstudio.github.io/r2d3/articles/publishing.html#save-as-html).
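For a scripted workflow, a minimal sketch, assuming the object returned by `modelStudio()` can be passed straight to `r2d3::save_d3_html()` as described in the linked r2d3 publishing guide (the file name is illustrative):

```r
library(modelStudio)
library(r2d3)

# keep the dashboard object instead of printing it,
# then write it to a standalone HTML file
ms <- modelStudio(explainer)
save_d3_html(ms, file = "modelstudio_dashboard.html")
```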
## Citations
If you use `modelStudio`, please cite our [JOSS article](https://joss.theoj.org/papers/10.21105/joss.01798):
```
@article{baniecki2019modelstudio,
title = {{modelStudio: Interactive Studio with Explanations for ML Predictive Models}},
author = {Hubert Baniecki and Przemyslaw Biecek},
journal = {Journal of Open Source Software},
year = {2019},
volume = {4},
number = {43},
pages = {1798},
url = {https://doi.org/10.21105/joss.01798}
}
```

For a description and evaluation of the Interactive EMA process, refer to our [DAMI article](https://doi.org/10.1007/s10618-023-00924-w):
```
@article{baniecki2023grammar,
title = {The grammar of interactive explanatory model analysis},
author = {Hubert Baniecki and Dariusz Parzych and Przemyslaw Biecek},
journal = {Data Mining and Knowledge Discovery},
year = {2023},
pages = {1--37},
url = {https://doi.org/10.1007/s10618-023-00924-w}
}
```

## More resources
- Introduction to the plots: [Explanatory Model Analysis: Explore, Explain, and Examine Predictive Models](https://ema.drwhy.ai)
- Vignettes: [perks and features](https://modelstudio.drwhy.ai/articles/ms-perks-features.html), [R & Python examples](https://modelstudio.drwhy.ai/articles/ms-r-python-examples.html), [modelStudio in R Markdown HTML](https://modelstudio.drwhy.ai/articles/ms-rmarkdown.html)
- Changelog: [NEWS](https://modelstudio.drwhy.ai/news/index.html)
- Conference poster: [ML in PL 2019](https://github.com/ModelOriented/modelStudio/blob/master/misc/MLinPL2019_modelStudio_poster.pdf)
## Acknowledgments

Work on this package was financially supported by the National Science Centre (Poland) grant `2016/21/B/ST6/02176` and National Centre for Research and Development grant `POIR.01.01.01-00-0328/17`.