https://github.com/ropenspain/enseresp

Information on the Spanish Health Survey
dataset health microdata r ropenspain rstats


---
output: github_document
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
```

# enseResp

**Author**: Edu Gonzalo Almorox

`r badger::badge_devel("edugonzaloalmorox/enseResp", "blue")`
[![Lifecycle: experimental](https://img.shields.io/badge/lifecycle-experimental-orange.svg)](https://www.tidyverse.org/lifecycle/#experimental)

`enseResp` is an R package for tidy access to healthcare data from the [Spanish Health Survey](https://www.mscbs.gob.es/estadEstudios/estadisticas/bancoDatos.htm) (SHS), released by the [Spanish Health Ministry](https://www.mscbs.gob.es/home.htm). Its main goal is to provide analysis-ready data for researchers and other stakeholders interested in exploring health microdata in Spain. The current version covers the SHS editions of 2017/19, 2011/12 and 2006/07 and compiles the surveys for the adult, children and household samples.

## Installation

You can install the development version from [GitHub](https://github.com/) with:

``` r
# install.packages("devtools")
devtools::install_github("edugonzaloalmorox/enseResp")
```

## Load main datasets

This package contains the surveys in a format that is convenient to access and analyse. The current version includes the following datasets:

* `adults_19`: Dataset for adults survey for 2017/19
* `children_19`: Dataset for children survey for 2017/19
* `household_19`: Dataset for household survey for 2017/19
* `adults_12`: Dataset for adults survey for 2011/12
* `children_12`: Dataset for children survey for 2011/12
* `household_12`: Dataset for household survey for 2011/12
* `adults_06`: Dataset for adults survey for 2006/07
* `children_06`: Dataset for children survey for 2006/07
* `household_06`: Dataset for household survey for 2006/07

This is a basic example of how to obtain a dataset, in this case the adults survey for 2017/19.

```{r adults}
library(enseResp)
library(dplyr)

enseResp::adults_19
```
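
Because the dataset names listed above follow a `<sample>_<edition>` pattern, a name can also be built programmatically, for instance when looping over editions. The following is a minimal sketch in base R; `ense_dataset_name()` is a hypothetical helper and is not part of `enseResp`.

```r
# Hypothetical helper (not provided by enseResp): build a dataset name such as
# "adults_19" from a sample and the two-digit edition suffix listed above.
ense_dataset_name <- function(sample, edition) {
  samples  <- c("adults", "children", "household")
  editions <- c("19", "12", "06")
  sample   <- match.arg(sample, samples)                 # errors on unknown samples
  edition  <- match.arg(as.character(edition), editions) # errors on unknown editions
  paste(sample, edition, sep = "_")
}

ense_dataset_name("children", "12")

# Assuming the package is installed and its data are lazy-loaded, the name
# could then be resolved with something like:
# getExportedValue("enseResp", ense_dataset_name("children", "12"))
```

Validating the inputs with `match.arg()` makes a mistyped sample or edition fail early instead of producing a name that does not exist.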

## Variables information

`adults_19` contains 455 variables. `adults_19_info` describes every variable (in Spanish). It also provides other information, such as the type of variable, its position in the original text file and the module the variable belongs to (for example, the European Health Survey).

```{r adults_info, message= FALSE, warning = FALSE}
library(enseResp)
library(dplyr)
library(knitr)

enseResp::adults_19_info %>%
  select(variable_ine, descripcion_del_campo) %>%
  head(10) %>%
  kable()
```
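
With several hundred variables, searching the descriptions by keyword is often quicker than reading the table. The sketch below uses base R on a toy table: the column names `variable_ine` and `descripcion_del_campo` come from the example above, while the descriptions and the `find_variables()` helper are invented for illustration and are not part of `enseResp`.

```r
# Toy stand-in for adults_19_info. The column names match the example above;
# the descriptions are made up for illustration only.
info_toy <- data.frame(
  variable_ine = c("CCAA", "T111", "IMCm"),
  descripcion_del_campo = c(
    "Comunidad autónoma de residencia",
    "Actividad física en el tiempo libre",
    "Índice de masa corporal"
  ),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep the rows whose description contains the keyword,
# ignoring case.
find_variables <- function(info, keyword) {
  hits <- grepl(keyword, info$descripcion_del_campo, ignore.case = TRUE)
  info[hits, , drop = FALSE]
}

find_variables(info_toy, "actividad")
```

Since the descriptions are in Spanish, the keyword should be Spanish too.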

## Variables values

`enseResp` also provides information on the values of each variable through the `labels` datasets. For instance, `adults_19_labels` lists the values associated with the variables in `adults_19`. For example, let's check the values for the level of physical activity (variable `T111`):

```{r adults_labels}
library(enseResp)
library(dplyr)
library(knitr)

enseResp::adults_19_labels %>%
  filter(variable_ine == "T111") %>%
  kable()
```
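
A labels table can also be used as a lookup to turn raw codes into readable values. The sketch below uses base R on a toy table. It assumes the labels datasets have the columns `variable_ine`, `valores_ine` and `valores`, as used for `children_19_labels` in this README; the codes mirror the `IMCm` recoding in the example analysis, and `decode_values()` is a hypothetical helper, not part of `enseResp`.

```r
# Toy stand-in for a *_labels table (assumed columns: variable_ine,
# valores_ine, valores). Codes and labels mirror the IMCm recoding in this README.
labels_toy <- data.frame(
  variable_ine = "IMCm",
  valores_ine  = c("1", "2", "3", "4", "9"),
  valores      = c("Peso insuficiente", "Normopeso", "Sobrepeso",
                   "Obesidad", "No consta"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: translate the codes of one variable into their labels.
# Codes without a match (including NA) come back as NA.
decode_values <- function(codes, labels, variable) {
  lookup <- labels[labels$variable_ine == variable, ]
  lookup$valores[match(as.character(codes), lookup$valores_ine)]
}

decode_values(c(2, 4, 9, NA), labels_toy, "IMCm")
```

Converting the codes to character before matching avoids mismatches when the survey column is numeric and the labels column is text.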

## Example analysis

```{r example}
library(enseResp)
library(dplyr)
library(knitr)
library(ggplot2)

kids = enseResp::children_19
info = enseResp::children_19_info
labels = enseResp::children_19_labels

# Tidy data --------------------------------

# Count children by region (CCAA) and BMI category (IMCm), then recode the
# BMI codes into their Spanish labels
obesity = kids %>%
  count(CCAA, IMCm) %>%
  mutate(IMCm = as.character(IMCm)) %>%
  mutate(IMCm = case_when(IMCm == "1" ~ "Peso insuficiente",
                          IMCm == "2" ~ "Normopeso",
                          IMCm == "3" ~ "Sobrepeso",
                          IMCm == "4" ~ "Obesidad",
                          IMCm == "9" ~ "No consta",
                          is.na(IMCm) ~ "No disponible"))

# Fix the order of the categories for the legend
obesity$IMCm = factor(obesity$IMCm, levels = c("No disponible",
                                               "No consta",
                                               "Peso insuficiente",
                                               "Normopeso",
                                               "Sobrepeso",
                                               "Obesidad"))

# Region names from the labels dataset
ccaa_lab = labels %>%
  filter(variable_ine == "CCAA") %>%
  select(valores_ine, valores)

obesity = obesity %>%
  left_join(ccaa_lab, by = c("CCAA" = "valores_ine")) %>%
  select(ccaa = valores, IMCm, n)

# Plot ------------------------------------

# One donut chart per region with the share of each BMI category
obesity %>%
  group_by(ccaa) %>%
  mutate(prop = n / sum(n)) %>%
  ungroup() %>%
  ggplot(aes(x = 2, y = prop, fill = IMCm)) +
  geom_bar(stat = "identity", width = 1, alpha = 0.85) +
  facet_wrap(~ ccaa) +
  xlim(0.5, 2.5) +
  coord_polar(theta = "y") +
  theme_void() +
  scale_fill_brewer(palette = "Dark2") +
  labs(title = "Obesidad infantil",
       subtitle = "Proporción IMC") +
  theme(legend.position = "bottom",
        legend.title = element_blank(),
        panel.background = element_blank(),
        strip.text.x = element_text(
          size = 4.75, color = "black", face = "bold"
        ))

```
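
The plot relies on within-region proportions, that is, each count divided by the total for its region. The sketch below reproduces that step in base R on invented counts, so it runs without the package; the regions `"A"` and `"B"` and the numbers are placeholders, not survey results.

```r
# Invented counts for two placeholder regions; not real survey figures.
counts_toy <- data.frame(
  ccaa = c("A", "A", "B", "B"),
  IMCm = c("Normopeso", "Obesidad", "Normopeso", "Obesidad"),
  n    = c(30, 10, 45, 5),
  stringsAsFactors = FALSE
)

# ave() returns the regional total on every row, so the division yields the
# share of each BMI category within its region.
counts_toy$prop <- counts_toy$n / ave(counts_toy$n, counts_toy$ccaa, FUN = sum)

counts_toy
```

The proportions sum to 1 within each region, which is what lets every facet be drawn as a complete ring.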

## Issues and bugs

If you find issues or bugs while using `enseResp`, report them [here](https://github.com/edugonzaloalmorox/enseResp/issues) or reach out to me on [Twitter](https://twitter.com/EdudinGonzalo).