Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/nikdata/covidvirus

covid virus
https://github.com/nikdata/covidvirus

Last synced: 3 days ago
JSON representation

covid virus

Awesome Lists containing this project

README

        

---
output: github_document
---

```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
```

# covidvirus

[![lifecycle](https://img.shields.io/badge/lifecycle-experimental-orange.svg)](https://www.tidyverse.org/lifecycle/#experimental)
[![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](https://opensource.org/licenses/MIT)

The {covidvirus} package provides a tidy format dataset of the 2019 Novel Coronavirus COVID-19 (2019-nCoV) epidemic. The raw data pulled from the Johns Hopkins University Center for Systems Science and Engineering (JHU CCSE) Coronavirus [repository](https://github.com/CSSEGISandData/COVID-19).

This package was inspired by [Rami Krispin's {coronavirus} package](https://github.com/RamiKrispin/coronavirus). The key difference is that Rami's package provides a dataset which must be manually updated from the package author's perspective. In this package, I've used (i.e, respectfully copied) his data retrieval code and modified it to use substantially more "tidyverse" packages such as {janitor}, {lubridate}, etc. Another key difference is the name of columns. Since I've used the package {janitor}, the column names use a 'snake' style and do not have dots '.' (opting for the underscore '_').

## Installation

You can install the development version from [GitHub](https://github.com/) with:

``` r
# install.packages("devtools")
devtools::install_github("nikdata/covidvirus")
```
## Usage

This is a basic example which shows you how to solve a common problem:

```{r example}
library(covidvirus)

corona_virus <- get_cases(wide=FALSE)

```

Similar to Rami's package, the output is as follows:

```{r}
head(corona_virus)
```
```{r}
tail(corona_virus)
```

Here's an example of total cases by region and type (top 10):

```{r}
library(dplyr)

corona_virus %>%
group_by(country_region, type) %>%
summarize(total_cases = sum(cases)) %>%
arrange(desc(total_cases)) %>%
head(20)
```

To manually create a wide dataframe, you can do the following (it is recommended to use the wide=TRUE argument):

```{r, echo=FALSE}
library(tidyr)

corona_virus %>%
filter(date == max(date)) %>%
select(country = country_region, type, cases) %>%
group_by(country, type) %>%
summarize(total_cases = sum(cases, na.rm = T)) %>%
ungroup() %>%
pivot_wider(names_from = type,
values_from = total_cases) %>%
arrange(desc(confirmed)) %>%
head(10)
```

### Wide Dataframe

Sometimes it may be easier to have a "wide" dataframe that enables you to see the number of cases for each type in their own respective columns.

```{r, echo=FALSE}

covidvirus::get_cases(wide = TRUE) %>%
head(10)

```

```{r}
dplyr::glimpse(covidvirus::get_cases(wide = TRUE))
```

## Comments

I would greatly appreciate any feedback you may have. If you find a bug, please file an issue.

Finally, a **HUGE** thanks to Rami Krispin for creating the {coronavirus} package!