An open API service indexing awesome lists of open source software.

https://github.com/mhairi/medicalclaims

Medical Claims Data R Package
https://github.com/mhairi/medicalclaims

Last synced: 3 months ago
JSON representation

Medical Claims Data R Package

Awesome Lists containing this project

README

        

---
output: github_document
---

```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
```
# medicalclaims

A data package with a sample of 100,000 anonymised medical claims New Hampshire’s Comprehensive Health Information System (https://nhchis.com/).

## Installation

You can install though GitHub with:

``` r
# install.packages("devtools")
devtools::install_github("mhairi/medicalclaims")
```
## Example

Once you've loaded the package, the data is in an object called `claims`. The data frame has 100,000 rows and 57 variables.

```{r}
library(medicalclaims)
head(claims)
```

Here is how you find the procedures with the highest average cost, only counting procedures that have appeared at least 10 times in the data.

```{r, message=FALSE, warning=FALSE}
library(tidyverse)

claims %>%
group_by(cpt_desc) %>%
summarise(
avg_cost = mean(total_by_n),
n = n()
) %>%
filter(n > 10) %>%
arrange(desc(avg_cost)) %>%
top_n(10, avg_cost)
```

If you want to look at how expensive different diagnoses are, then you first need to summarise over `imputed_service_key` and `icd_diag_01_primary`. This gives us the total spending for each patient and each diagnosis.

```{r}
by_individual <-
claims %>%
group_by(new_diag_desc, imputed_service_key) %>%
summarise(spending = sum(total)) %>%
ungroup
```

Then we can summarise to find the most expensive diagnoses.

```{r}
by_individual %>%
group_by(new_diag_desc) %>%
summarise(
avg_cost = mean(spending),
n = n()
) %>%
filter(n > 10) %>%
arrange(desc(avg_cost)) %>%
top_n(10, avg_cost)
```