https://github.com/mhairi/medicalclaims

Medical Claims Data R Package

Host: GitHub
URL: https://github.com/mhairi/medicalclaims
Owner: mhairi
Created: 2020-05-12T16:06:24.000Z (almost 5 years ago)
Default Branch: master
Last Pushed: 2020-05-19T11:44:53.000Z (almost 5 years ago)
Last Synced: 2024-08-13T07:13:03.933Z (6 months ago)
Language: R
Size: 9.2 MB
Stars: 3
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.Rmd

Awesome Lists containing this project

jimsghstars - mhairi/medicalclaims - Medical Claims Data R Package (R)

README

---
output: github_document
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
```

# medicalclaims

A data package with a sample of 100,000 anonymised medical claims from New Hampshire's Comprehensive Health Information System (https://nhchis.com/).

## Installation

You can install the package from GitHub with:

``` r
# install.packages("devtools")
devtools::install_github("mhairi/medicalclaims")
```

## Example

Once you've loaded the package, the data is in an object called `claims`. The data frame has 100,000 rows and 57 variables. 

```{r}
library(medicalclaims)
head(claims)
```

Here is how you find the procedures with the highest average cost, only counting procedures that appear more than 10 times in the data.

```{r, message=FALSE, warning=FALSE}
library(tidyverse)

claims %>% 
  group_by(cpt_desc) %>%
  summarise(
    avg_cost = mean(total_by_n),
    n = n()
  ) %>% 
  filter(n > 10) %>% 
  arrange(desc(avg_cost)) %>% 
  top_n(10, avg_cost)
```
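As a hedged aside, `top_n()` is superseded in dplyr 1.0.0 and later by `slice_max()`, which also returns its rows sorted. The sketch below shows the same group, summarise, filter pattern on a small invented data frame. The column names mirror the chunk above, but the values and the threshold of one are made up for illustration and are not taken from `claims`.

```r
library(dplyr)

# Hypothetical toy data: names mirror the README, values are invented.
toy <- data.frame(
  cpt_desc   = c("x", "x", "y", "y", "z"),
  total_by_n = c(10, 30, 5, 15, 100)
)

top2 <- toy %>%
  group_by(cpt_desc) %>%
  summarise(
    avg_cost = mean(total_by_n),
    n = n()
  ) %>%
  filter(n > 1) %>%            # drop procedures seen only once ("z")
  slice_max(avg_cost, n = 2)   # keep the two highest averages

top2
```

Procedure `z` has the highest cost but is dropped because it appears only once, which is the reason for filtering on `n` before ranking.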

If you want to look at how expensive different diagnoses are, you first need to sum spending within each combination of `imputed_service_key` and diagnosis (the code below groups by `new_diag_desc`). This gives the total spending for each patient and each diagnosis.

```{r}
by_individual <- 
  claims %>% 
  group_by(new_diag_desc, imputed_service_key) %>% 
  summarise(spending = sum(total)) %>% 
  ungroup()
```

Then we can summarise to find the most expensive diagnoses.

```{r}
by_individual %>% 
  group_by(new_diag_desc) %>%
  summarise(
    avg_cost = mean(spending),
    n = n()
  ) %>% 
  filter(n > 10) %>% 
  arrange(desc(avg_cost)) %>% 
  top_n(10, avg_cost)
```
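To see why the two-step aggregation matters, here is a minimal sketch on invented data. It assumes, as the text above suggests, that one `imputed_service_key` can span several claim lines. The column names mirror the README, but the values are hypothetical.

```r
library(dplyr)

# Hypothetical toy data: names mirror the README, values are invented.
toy <- data.frame(
  new_diag_desc       = c("A", "A", "A", "B"),
  imputed_service_key = c(1, 1, 2, 3),
  total               = c(100, 300, 200, 50)
)

# One step: averages individual claim lines.
per_line <- toy %>%
  group_by(new_diag_desc) %>%
  summarise(avg_cost = mean(total))

# Two steps: sum within each service key, then average those sums.
per_key <- toy %>%
  group_by(new_diag_desc, imputed_service_key) %>%
  summarise(spending = sum(total)) %>%
  ungroup() %>%
  group_by(new_diag_desc) %>%
  summarise(avg_cost = mean(spending))
```

For diagnosis `A`, the one-step average is 200 (the mean of 100, 300 and 200), while the two-step average is 300 (the mean of 400 and 200). Averaging claim lines directly understates the cost when spending is split across several lines.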
