https://github.com/tmastny/leadr

https://tmastny.github.io/leadr/
https://github.com/tmastny/leadr

Last synced: 3 months ago
JSON representation

https://tmastny.github.io/leadr/

Host: GitHub
URL: https://github.com/tmastny/leadr
Owner: tmastny
License: mit
Created: 2018-03-01T04:09:50.000Z (over 7 years ago)
Default Branch: master
Last Pushed: 2018-05-28T22:10:31.000Z (about 7 years ago)
Last Synced: 2024-07-31T19:28:10.861Z (12 months ago)
Language: R
Homepage:
Size: 35 MB
Stars: 26
Watchers: 2
Forks: 2
Open Issues: 9
Metadata Files:
- Readme: README.Rmd
- License: LICENSE.md

Awesome Lists containing this project

README

---
output: github_document
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
```
# leadr

The goal of leadr is to stream-line model organization in data science projects and Kaggle competitions. The main function `leadr::board` takes a [caret](https://github.com/topepo/caret) model and automatically builds a personal leaderboard for the entire project.

This leaderboard allows you to easily sort models by metric (accuracy, RMSE, etc.) and ensures that you never lose track of a good model during interactive analysis. Check out my [blog post](https://timmastny.rbind.io/blog/kaggle-contest-r-package/) for some background.

## Installation

The package is not currently available on CRAN. You can install the development version with:

```r
# install.packages("devtools")
devtools::install_github("tmastny/leadr")
```

## Getting Started

Let's say you want to build a classifier for the iris data set. We start by initializing an R project with this directory:

```
.
└── iris.Rproj
```

Then we fit our first model.

```r
library(caret)
model <- train(Species ~ ., data = iris, method = 'glmnet')
```

Before leadr, we might create the script `glmnet_1.R` to record the model, save the `train` object as a [.RDS file](https://www.fromthebottomoftheheap.net/2012/04/01/saving-and-loading-r-objects/), and keep track of the accuracy in a spreadsheet.

With leadr, we only need to do the following:

```r
leadr::board(model)
```

## # A tibble: 1 x 13
## rank id dir model metric score public method num group index
##
## 1 1. 1 models… glmnet Accur… 0.964 NA boot 25. 1. , seeds

`board` creates a personal leaderboard for your project that ranks and sorts your model based on the model's metric. The leaderboard tibble has all the information needed to successfully recreate and document any model.

`board` also modifies the project directory:

```
.
├── iris.Rproj
├── leadrboard.RDS
└── models
└── initial
└── model1.RDS
```
By default, `board` saves the leaderboard tibble as a `.RDS` file at the project root and creates the `models` directory. Within `models`, each `caret` model is saved in a subdirectory and named in the order they were ran.

## Interactive

In the previous example, we did everything from the command line and leadr took care of the organization and documentation. In fact, leadr benefits from interactive use in other ways. For example, leadr uses [pillar](https://github.com/r-lib/pillar) and [crayon](https://github.com/r-lib/crayon) to programmatically color outputs:

```{r echo=FALSE, fig.align='center'}
knitr::include_graphics("vignettes/leadr_pic.png")
```

## Vignettes

For a full description of the features, check out my vignettes hosted here:
https://tmastny.github.io/leadr/

1. [Introduction](https://tmastny.github.io/leadr/articles/introduction.html): walkthrough of the basic workflow of leadr

2. [Ensembles](https://tmastny.github.io/leadr/articles/ensemble.html): overview of the tools that leadr provides to make ensemble models

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/tmastny/leadr

Awesome Lists containing this project

README