https://github.com/ketchbrookanalytics/migrate

An R package for building state transition matrices
credit-risk finance risk-management rstats-package


---
output: github_document
---

```{r, include = FALSE}
options(width = 999)

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)

devtools::load_all()
```

# migrate

[![lifecycle](https://img.shields.io/badge/lifecycle-stable-green.svg)](https://lifecycle.r-lib.org/articles/stages.html#stable)
[![CRAN status](https://www.r-pkg.org/badges/version/migrate)](https://CRAN.R-project.org/package=migrate)
[![metacran downloads](https://cranlogs.r-pkg.org/badges/migrate)](https://cran.r-project.org/package=migrate)
[![R-CMD-check](https://github.com/ketchbrookanalytics/migrate/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/ketchbrookanalytics/migrate/actions/workflows/R-CMD-check.yaml)

The goal of {migrate} is to provide users with an easy-to-use set of tools for building *state transition matrices*.


![](man/figures/gt_tbl.png)

## Methodology

{migrate} provides an easy way to calculate absolute or percentage migration within a credit portfolio. The above image shows a typical credit migration matrix using the *absolute* approach; each cell in the grid represents the total balance in the portfolio at 2020-06-30 that started at the Risk Rating represented on the left-hand vertical axis and ended (at 2020-09-30) at the Risk Rating represented on the upper horizontal axis of the matrix. For example, $6.58M moved from a Risk Rating **AAA** at 2020-06-30 to a Risk Rating **AA** at 2020-09-30.
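
To make the *absolute* approach concrete, here is a minimal base R sketch that sums a balance into each (start rating, end rating) cell. It does not use {migrate}; the `loans` data frame, its column names, and all values are made up for illustration, and it glosses over which date the balance is measured at.

```r
# Toy portfolio (hypothetical data): one row per loan, with its rating at the
# start date, its rating at the end date, and a single balance figure.
rating_levels <- c("AAA", "AA", "A")

loans <- data.frame(
  rating_start = factor(c("AAA", "AAA", "AA", "AA", "A"), levels = rating_levels),
  rating_end   = factor(c("AAA", "AA", "AA", "A", "A"), levels = rating_levels),
  balance      = c(100, 50, 200, 25, 75)
)

# Sum the balance within every (start, end) combination.
# Rows are starting ratings; columns are ending ratings.
absolute_matrix <- xtabs(balance ~ rating_start + rating_end, data = loans)

absolute_matrix
```

Reading across the `AAA` row shows how the balance that started at **AAA** is distributed over the ending ratings, which is the same way the image above is read.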

The *absolute* migration example above is typically more of a reporting function, whereas the *percentage* (or probabilistic) methodology is more of a statistical modeling exercise, commonly used in credit portfolio risk management. Currently, this package only supports the simple "cohort" methodology, which estimates the probability of moving from state *i* to state *j* in a single time step, echoing a Markov process. We can visualize this in a matrix, for a credit portfolio with *N* unique, ordinal states:

![](man/figures/markov_matrix.png)
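
As a rough illustration of the cohort idea, the sketch below estimates each transition probability as the share of entities that started in state *i* and ended in state *j*. It is written in base R rather than with {migrate}, the `start` and `end` vectors are invented, and it assumes a simple count-based estimate (one vote per entity), which may differ from how the package weights observations.

```r
# Hypothetical start and end ratings for eight entities.
rating_levels <- c("AAA", "AA", "A")

start <- factor(c("AAA", "AAA", "AAA", "AAA", "AA", "AA", "A", "A"),
                levels = rating_levels)
end   <- factor(c("AAA", "AAA", "AAA", "AA", "AA", "A", "A", "A"),
                levels = rating_levels)

# Count the entities in each (start, end) cell ...
counts <- table(start, end)

# ... then divide each row by its total, so that row i holds the estimated
# probabilities of moving from state i to every state j.
cohort_matrix <- prop.table(counts, margin = 1)

round(cohort_matrix, 2)
```

Because each row is divided by its own total, every row of `cohort_matrix` sums to 1, as expected for the transition matrix of a Markov process.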

### Future Plans for {migrate}

Future development plans for this package include building functionality for the more complex **duration**/**hazard** methodology, including both the *time-homogeneous* and *non-homogeneous* implementations.

## Installation

You can install the released version of {migrate} from [CRAN](https://CRAN.R-project.org) with:

``` {r, eval = FALSE}
install.packages("migrate")
```

And the development version from [GitHub](https://github.com/) with:

``` {r, eval = FALSE}
# install.packages("devtools")
devtools::install_github("ketchbrookanalytics/migrate")
```

## Practical Usage

{migrate} currently only handles transitions between exactly two (2) timepoints. Under the hood, `migrate()` finds the earliest and latest dates in the given *time* variable, and filters out any observations where the *time* value does not match either of those two dates.
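
If your data contains intermediate dates, one option is to reduce it to the two endpoints yourself before calling `migrate()`. The following is a minimal base R sketch of that pre-filtering step; it is not the package's internal code, and the `obs` data frame and its columns are hypothetical.

```r
# Hypothetical panel with three observation dates per ID.
obs <- data.frame(
  id    = c(1, 1, 1, 2, 2, 2),
  date  = as.Date(c("2020-12-31", "2021-03-31", "2021-06-30",
                    "2020-12-31", "2021-03-31", "2021-06-30")),
  state = c("A", "A", "B", "B", "B", "B"),
  stringsAsFactors = FALSE
)

# range() returns the earliest and latest dates, in that order.
endpoints <- range(obs$date)

# Keep only the rows observed at one of the two endpoints.
two_point <- obs[obs$date %in% endpoints, ]

unique(two_point$date)
```

After this step, `two_point` contains exactly two distinct dates, so the *time* variable handed to `migrate()` is unambiguous.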

If you are writing a SQL query to get data to be used with `migrate()`, the query would likely look something like this:

```{r, eval = FALSE}
# -- Get the *State* risk status and *Balance* dollar amount for each ID, at two distinct dates

# SELECT ID, Date, State, Balance
# FROM my_database
# WHERE Date IN ('2020-12-31', '2021-06-30')
```

By default, `migrate()` drops observations that belong to IDs found at a single timepoint. However, users can define a *filler state* so that IDs with a single timepoint are not removed but are instead migrated from or to this *filler state*. This allows for more flexible handling of such data, ensuring that no information is lost during the migration process. See [Handle IDs with observations at a single timepoint](https://ketchbrookanalytics.github.io/migrate/articles/migrate.html#handle-ids-with-observations-at-a-single-timepoint) for more information.
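
The idea behind a *filler state* can be sketched in base R without {migrate}: pair each ID's start and end states with a full outer join, then substitute a placeholder wherever one side is missing. The `obs` data frame and the `"NR"` label below are hypothetical; the linked article documents how the package itself exposes this behavior.

```r
# Hypothetical data: ID 1 is observed at both dates, ID 2 only at the start,
# and ID 3 only at the end.
obs <- data.frame(
  id    = c(1, 1, 2, 3),
  date  = as.Date(c("2020-12-31", "2021-06-30", "2020-12-31", "2021-06-30")),
  state = c("A", "B", "A", "B"),
  stringsAsFactors = FALSE
)

dates <- range(obs$date)
start <- obs[obs$date == dates[1], c("id", "state")]
end   <- obs[obs$date == dates[2], c("id", "state")]

# all = TRUE makes this a full outer join, so IDs seen at only one timepoint
# are kept, with NA on the side where they were not observed.
pairs <- merge(start, end, by = "id", all = TRUE,
               suffixes = c("_start", "_end"))

# Replace the missing side with the filler state (label chosen for this example).
filler <- "NR"
pairs$state_start[is.na(pairs$state_start)] <- filler
pairs$state_end[is.na(pairs$state_end)] <- filler

pairs
```

Here ID 2 migrates from `A` to the filler state and ID 3 migrates from the filler state to `B`, so neither is dropped.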

## Example

First, load the package using `library()`:

```{r load, eval = FALSE}
library(migrate)
```

The package has a built-in mock dataset, which can be loaded into the environment like so:

```{r data, eval = FALSE}
data("mock_credit")

head(mock_credit[order(mock_credit$customer_id), ]) # sort by 'customer_id'
```

```{r data_tbl, echo = FALSE}
head(mock_credit[order(mock_credit$customer_id), ]) |>
  knitr::kable(row.names = FALSE)
```

Note that the `mock_credit` dataset has exactly two (2) unique values in its `date` column; if the variable passed to the `time` argument of `migrate()` has more than two (2) unique values, the function will throw an error.

```{r dates}
unique(mock_credit$date)
```

To summarize the migration within the data, use the `migrate()` function:

```{r migrate}
migrated_df <- migrate(
  data = mock_credit,
  id = customer_id,
  time = date,
  state = risk_rating
)

head(migrated_df)
```

To create the state transition matrix, use the `build_matrix()` function:

```{r matrix}
build_matrix(migrated_df)
```

Or, to do it all in one shot, use the native pipe operator `|>`:

```{r pipe}
mock_credit |>
  migrate(
    id = customer_id,
    time = date,
    state = risk_rating,
    metric = principal_balance,
    percent = FALSE,
    verbose = FALSE
  ) |>
  build_matrix(
    state_start = risk_rating_start,
    state_end = risk_rating_end,
    metric = principal_balance
  )
```