Simulate unity analytics game events daily retention data
https://github.com/softloud/retention

---
title: README
---

# Retention

`retention` is an R package that contains datasets related to user activity, user details, and build versions. These datasets can be used for analyses such as user retention, activity patterns, and the impact of different build versions on user activity.

It has functions for simulating builds, users, and activity data, each of which can be customised to control the scale of the output.

The core of the simulation is a decrease in activity with the number of days since a player first became active. On day 0 there is a 100% chance of activity, on day 1 only a 30% chance, and after day 30 a very small chance. This mimics the typical shape of player retention data.

This is done by creating a probability function that depends on the number of days since the start of activity, and then sampling from a binomial distribution. See `get_activity_probability` and `get_activity` for more details.
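As a rough illustration of this idea, the sketch below builds a decaying probability curve and draws daily activity from a binomial distribution. The function name `sketch_activity_probability` and the power-law decay are assumptions chosen to match the figures quoted above; this is not the package's actual `get_activity_probability` implementation.

```r
# Hypothetical probability curve: 100% on day 0, 30% on day 1,
# then a power-law decay (the exponent 1.2 is an assumption).
sketch_activity_probability <- function(day) {
  ifelse(day == 0, 1, 0.3 * day^(-1.2))
}

set.seed(42)  # make the random draws reproducible
days <- 0:30
prob <- sketch_activity_probability(days)

# One Bernoulli trial (binomial with size = 1) per day:
# 1 means the player was active that day, 0 means inactive.
active <- rbinom(n = length(days), size = 1, prob = prob)

data.frame(day = days, prob = round(prob, 3), active = active)
```

Because the day-0 probability is exactly 1, the simulated player is always active on their first day, and activity becomes increasingly rare as the days pass.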

## Inspiration

The inspiration for these data is Unity Analytics game events. To present retention analytics on game data as open source, I need simulated data that imitates the data structure I worked with at a video game studio.

## Installation

You can install the `retention` package from GitHub using the `devtools` package. Run the following commands in your R console:

``` r
# Install devtools if you haven't already
# (requireNamespace() checks for the package without attaching it)
if (!requireNamespace("devtools", quietly = TRUE)) {
  install.packages("devtools")
}

# Install the retention package from GitHub
devtools::install_github("softloud/retention")
```

## Using the package

```{r}
library(retention)
library(dplyr) # provides the %>% pipe and filter() used in the examples below
```

## Data

The data in this package were generated with the `simulate_retention_data.R` script. The datasets are stored as RDS files, which the package loads, and are also available as CSV files in `retention_data/`. They are:

1. `user_activity`: This dataset tracks the activity of users across different build versions and dates. It contains 146,463 rows and 3 variables: `user`, `build`, and `activity_date`.

```{r}
dim(retention::user_activity)

retention::user_activity %>% head()

```

2. `users`: This dataset describes each user, starting from their first build version and first date of activity. It contains 47,031 rows and 4 variables: `user`, `first_build`, `activity_start`, and `activity_days`.

```{r}
dim(retention::users)

retention::users %>% head()

```

3. `builds`: This dataset tracks the release information of different build versions. It contains 57 rows and 4 variables: `build`, `release_length`, `release_start`, and `release_end`.

```{r}
dim(retention::builds)

retention::builds %>% head()

```

For more detailed information about these datasets, please refer to the documentation in the `pkg_data.R` file.
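One way these tables could feed a retention analysis is sketched below. It uses small made-up data frames that only mirror the documented column names (`user`, `activity_date`, `activity_start`), not the packaged data, and computes the share of users active on each day since their first activity.

```r
# Toy stand-ins for retention::users and retention::user_activity
# (made-up values; only the column names follow the package docs).
users <- data.frame(
  user = c("a", "b", "c"),
  activity_start = as.Date(c("2024-01-01", "2024-01-01", "2024-01-02"))
)

user_activity <- data.frame(
  user = c("a", "a", "a", "b", "c", "c"),
  activity_date = as.Date(c(
    "2024-01-01", "2024-01-02", "2024-01-03",
    "2024-01-01", "2024-01-02", "2024-01-03"
  ))
)

# Attach each user's start date, then count days since that start.
joined <- merge(user_activity, users, by = "user")
joined$day <- as.integer(joined$activity_date - joined$activity_start)

# Distinct users active on each day, as a share of all users.
active_per_day <- tapply(
  joined$user, joined$day,
  function(u) length(unique(u))
)

retention_by_day <- data.frame(
  day = as.integer(names(active_per_day)),
  retained = as.numeric(active_per_day) / nrow(users)
)

retention_by_day
```

The same steps should carry over to the packaged datasets, where the result would be a full retention curve rather than three points.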

## Functions that simulate user activity for different versions of an app

Simulate builds.

```{r}

versions <-
  get_versions(
    major_change_max = 2,
    minor_change_max = 1,
    hot_fix_max = 1
  )

versions

```

Simulate release dates for builds.

```{r}
# Start from the versions simulated in the previous step
builds <- versions %>% set_build_releases(release_length_max = 7)

builds

```

Simulate users for builds.

```{r}
users <- get_users(
  builds,
  new_users_max = 3,
  max_activity_days = 14
)

users

```

Simulate activity.

```{r}
# Keep only the dates on which a user was actually active
user_activity_data <- get_activity(builds, users) %>%
  dplyr::filter(active_on_date == TRUE)

user_activity_data

```

## Limitation

One limitation of these data is that the simulation assumes users update as soon as a new build is released, which is not necessarily the case. However, for the retention analytics I intend to generate with this, that should not be much of an issue.