https://github.com/jumpingrivers/datasaurus
R Package 📦 Containing the Datasaurus Dozen datasets :bar_chart:
anscombesquartet datasaurus datasaurus-dozen datasets r r-package rstats summary-statistics
- Host: GitHub
- URL: https://github.com/jumpingrivers/datasaurus
- Owner: jumpingrivers
- License: other
- Created: 2017-05-01T19:47:21.000Z (over 7 years ago)
- Default Branch: main
- Last Pushed: 2024-02-29T11:39:45.000Z (9 months ago)
- Last Synced: 2024-05-20T01:15:59.435Z (6 months ago)
- Topics: anscombesquartet, datasaurus, datasaurus-dozen, datasets, r, r-package, rstats, summary-statistics
- Language: R
- Homepage: https://jumpingrivers.github.io/datasauRus
- Size: 19.2 MB
- Stars: 309
- Watchers: 15
- Forks: 46
- Open Issues: 1
Metadata Files:
- Readme: README.Rmd
- License: LICENSE
- Code of conduct: .github/CODE_OF_CONDUCT.md
- Support: .github/SUPPORT.md
README
---
output: github_document
---

```{r, echo = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/"
)
```

# datasauRus

[![Lifecycle: stable](https://img.shields.io/badge/lifecycle-stable-brightgreen.svg)](https://lifecycle.r-lib.org/articles/stages.html#stable)
[![CRAN status](https://www.r-pkg.org/badges/version/datasauRus)](https://CRAN.R-project.org/package=datasauRus)
[![R-CMD-check](https://github.com/jumpingrivers/datasauRus/workflows/R-CMD-check/badge.svg)](https://github.com/jumpingrivers/datasauRus/actions)
[![R-CMD-check](https://github.com/jumpingrivers/datasauRus/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/jumpingrivers/datasauRus/actions/workflows/R-CMD-check.yaml)

This package wraps the awesome Datasaurus Dozen datasets. The Datasaurus Dozen show us why visualisation is important -- summary statistics can be the same but distributions can be very different. In short, this package gives a fun alternative to [Anscombe's Quartet](https://en.wikipedia.org/wiki/Anscombe%27s_quartet), available in R as `anscombe`.

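
For comparison, the same idea can be checked numerically on Anscombe's Quartet using only base R. This is a minimal sketch, not part of the package: the `quartet_stats` name and the choice of statistics (means, standard deviations, and correlation) are illustrative assumptions.

```{r}
# `anscombe` ships with base R and has columns x1..x4 and y1..y4,
# one x/y pair per set in the quartet.
quartet_stats <- do.call(rbind, lapply(1:4, function(i) {
  x <- anscombe[[paste0("x", i)]]
  y <- anscombe[[paste0("y", i)]]
  data.frame(
    set    = i,
    mean_x = mean(x),
    mean_y = mean(y),
    sd_x   = sd(x),
    sd_y   = sd(y),
    cor_xy = cor(x, y)
  )
}))

# Rounded to two decimals, the four rows should be close to identical,
# even though scatter plots of the four sets look very different.
print(round(quartet_stats, 2))
```
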
The original Datasaurus was created by Alberto Cairo. The other Dozen were generated using simulated annealing, and the process is described in the paper "Same Stats, Different Graphs: Generating Datasets with Varied Appearance and Identical Statistics through Simulated Annealing" by Justin Matejka and George Fitzmaurice ([open access materials including manuscript and code](https://www.research.autodesk.com/publications/same-stats-different-graphs/), [official paper](https://doi.org/10.1145/3025453.3025912)).

In the paper, Justin and George simulate a variety of datasets that have the same summary statistics as the Datasaurus but very different distributions.

```{r, out.width="600px", fig.alt="Sequential dinosaur gif", echo = FALSE}
knitr::include_graphics("https://damassets.autodesk.net/content/dam/autodesk/research/publications-assets/gifs/same-stats-different-graphs/DinoSequentialSmaller.gif")
```

## Install

The latest stable version is available on CRAN:

```{r, eval = FALSE}
install.packages("datasauRus")
```

You can get the latest development version from GitHub by installing the package with {devtools}:

```{r, eval = FALSE}
devtools::install_github("jumpingrivers/datasauRus")
```

## Usage

You can use the package to produce Anscombe plots and more.
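
Before plotting, the "same stats" claim can be checked numerically. The sketch below is an illustration under a few assumptions: it uses base R rather than any helper from the package, the `summary_stats` name is made up for this example, and it relies only on the `dataset`, `x`, and `y` columns of `datasaurus_dozen` that the plotting code also uses.

```{r}
library("datasauRus")

# Split the combined data frame into one piece per dataset.
by_dataset <- split(datasaurus_dozen, datasaurus_dozen$dataset)

# Compute the same summary statistics for every dataset;
# row names of the result are the dataset names.
summary_stats <- do.call(rbind, lapply(by_dataset, function(d) {
  data.frame(
    mean_x = mean(d$x),
    mean_y = mean(d$y),
    sd_x   = sd(d$x),
    sd_y   = sd(d$y),
    cor_xy = cor(d$x, d$y)
  )
}))

# The columns should barely vary from row to row.
print(round(summary_stats, 2))
```

The faceted plot that follows shows how different the datasets look despite these near-identical numbers.
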
```{r datasets, fig.height=12, fig.width=9}
library("ggplot2")
library("datasauRus")
ggplot(datasaurus_dozen, aes(x = x, y = y, colour = dataset)) +
  geom_point() +
  theme_void() +
  theme(legend.position = "none") +
  facet_wrap(~dataset, ncol = 3)
```

## Code of Conduct

Please note that the datasauRus project is released with a [Contributor Code of Conduct](https://jumpingrivers.github.io/datasauRus/CODE_OF_CONDUCT.html). By contributing to this project, you agree to abide by its terms.