https://github.com/paulhendricks/scorer
Metrics for scoring machine learning models in R
- Host: GitHub
- URL: https://github.com/paulhendricks/scorer
- Owner: paulhendricks
- License: mit
- Created: 2015-07-17T19:12:42.000Z (over 9 years ago)
- Default Branch: master
- Last Pushed: 2017-07-01T19:37:25.000Z (over 7 years ago)
- Last Synced: 2023-10-20T22:16:52.811Z (about 1 year ago)
- Language: R
- Homepage:
- Size: 540 KB
- Stars: 21
- Watchers: 3
- Forks: 2
- Open Issues: 4
Metadata Files:
- Readme: README.Rmd
- License: LICENSE
---
output:
  github_document
---

```{r, echo = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "inst/imgs/README-"
)
```

# scorer

[![CRAN_Status_Badge](http://www.r-pkg.org/badges/version/scorer)](http://cran.r-project.org/package=scorer)
[![Downloads from the RStudio CRAN mirror](http://cranlogs.r-pkg.org/badges/scorer)](https://cran.rstudio.com/web/packages/scorer/index.html)
[![Build Status](https://travis-ci.org/paulhendricks/scorer.png?branch=master)](https://travis-ci.org/paulhendricks/scorer)
[![Build status](https://ci.appveyor.com/api/projects/status/vuumrc0607xa44q9/branch/master?svg=true)](https://ci.appveyor.com/project/paulhendricks/scorer/branch/master)
[![codecov.io](http://codecov.io/github/paulhendricks/scorer/coverage.svg?branch=master)](http://codecov.io/github/paulhendricks/scorer?branch=master)
[![Project Status: Active - The project has reached a stable, usable state and is being actively developed.](http://www.repostatus.org/badges/0.1.0/active.svg)](http://www.repostatus.org/#active)

`scorer` is a set of tools for quickly scoring models in data science and machine learning. Where possible, the toolset is written in C++ for speed. Its API follows Python's [sklearn.metrics](http://scikit-learn.org/stable/modules/classes.html#sklearn-metrics-metrics) as closely as possible, so you can switch between R and Python with little friction.

The following types of metrics are currently implemented in `scorer`:

* Regression metrics (implemented in 0.2.0)
* Classification metrics (implemented in 0.3.0)

The following types of metrics are soon to be implemented in `scorer`:

* Multilabel ranking metrics (to be implemented in 0.4.0)
* Clustering metrics (to be implemented in 0.4.0)
* Biclustering metrics (to be implemented in 0.4.0)
* Pairwise metrics (to be implemented in 0.4.0)

## Installation

You can install the latest released version from CRAN:
```R
install.packages("scorer")
```

Or from GitHub with:

```R
# Install devtools first if it is missing or older than 1.6
if (!requireNamespace("devtools", quietly = TRUE) ||
    packageVersion("devtools") < "1.6") {
  install.packages("devtools")
}
devtools::install_github("paulhendricks/scorer")
```

If you encounter a clear bug, please file a [minimal reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) on [GitHub](https://github.com/paulhendricks/scorer/issues).

## Examples
### Regression metrics
#### Load library and data
```{r}
library("scorer")
packageVersion("scorer")
data(mtcars)
```

#### Visualize data

```{r}
library("ggplot2")
ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  geom_smooth(method = 'lm') +
  expand_limits(x = c(0, 6), y = c(0, 40))
```

#### Partition data into train and test sets

```{r}
set.seed(1)
n_train <- floor(nrow(mtcars) * 0.60)
n_test <- nrow(mtcars) - n_train
mask <- sample(c(rep(x = TRUE, times = n_train), rep(x = FALSE, times = n_test)))
mtcars[, "Type"] <- ifelse(mask, "Train", "Test")
train_mtcars <- mtcars[mask, ]
test_mtcars <- mtcars[!mask, ]
ggplot(mtcars, aes(x = wt, y = mpg, color = Type)) +
  geom_point() +
  expand_limits(x = c(0, 6), y = c(0, 40))
```

#### Build a model on train data set

```{r}
model <- lm(mpg ~ wt, data = train_mtcars)
```

#### Predict model using the test data set

```{r}
test_mtcars[, "predicted_mpg"] <- predict(model, newdata = test_mtcars)
```

#### Score model using various metrics

```{r}
scorer::mean_absolute_error(test_mtcars[, "mpg"], test_mtcars[, "predicted_mpg"])
scorer::mean_squared_error(test_mtcars[, "mpg"], test_mtcars[, "predicted_mpg"])
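# Hand-rolled cross-check in base R: a sketch, not scorer's implementation.
# It assumes the usual definitions MAE = mean(|y - y_hat|) and
# MSE = mean((y - y_hat)^2). `mae_by_hand` and `mse_by_hand` are hypothetical
# helper names introduced here for illustration only.
mae_by_hand <- function(y_true, y_pred) mean(abs(y_true - y_pred))
mse_by_hand <- function(y_true, y_pred) mean((y_true - y_pred)^2)
# Toy vectors: the absolute errors are 1, 0, 3 and the squared errors 1, 0, 9.
mae_by_hand(c(3, 5, 7), c(2, 5, 10))
mse_by_hand(c(3, 5, 7), c(2, 5, 10))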
```

#### Build a final model on all the data

```{r}
final_model <- lm(mpg ~ wt, data = mtcars)
```

#### Predict final model using the original data set

```{r}
mtcars[, "predicted_mpg"] <- predict(final_model, newdata = mtcars)
```

#### Score final model using various metrics

```{r}
scorer::explained_variance_score(mtcars[, "mpg"], mtcars[, "predicted_mpg"])
scorer::unexplained_variance_score(mtcars[, "mpg"], mtcars[, "predicted_mpg"])
scorer::total_variance_score(mtcars[, "mpg"], mtcars[, "predicted_mpg"])
scorer::r2_score(mtcars[, "mpg"], mtcars[, "predicted_mpg"])
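# Cross-check in base R: a sketch, assuming r2_score follows the usual
# definition 1 - SS_res / SS_tot (as sklearn.metrics.r2_score does).
# `fit_check`, `y_true`, `y_pred`, and `r2_by_hand` are hypothetical names
# used only for this illustration.
fit_check <- lm(mpg ~ wt, data = mtcars)
y_true <- mtcars[, "mpg"]
y_pred <- predict(fit_check)                 # fitted values on the same data
ss_res <- sum((y_true - y_pred)^2)           # residual sum of squares
ss_tot <- sum((y_true - mean(y_true))^2)     # total sum of squares
r2_by_hand <- 1 - ss_res / ss_tot
# For a least-squares fit with an intercept, scored on its own training data,
# this agrees with the R-squared reported by summary().
all.equal(r2_by_hand, summary(fit_check)$r.squared)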
```

## Citation

To cite package ‘scorer’ in publications use:
```
Paul Hendricks (2016). scorer: Quickly Score Models in Data Science and Machine Learning. R package version 0.2.0. https://CRAN.R-project.org/package=scorer
```

A BibTeX entry for LaTeX users is

```
@Manual{,
  title = {scorer: Quickly Score Models in Data Science and Machine Learning},
  author = {Paul Hendricks},
  year = {2016},
  note = {R package version 0.2.0},
  url = {https://CRAN.R-project.org/package=scorer},
}
```