https://github.com/friendly/guerry

Maps, data and methods related to Guerry (1833) "Moral Statistics of France"
france moral-statistics multivariate-spatial-analysis


---
output: github_document
---

```{r, include = FALSE}
knitr::opts_chunk$set(
warning = FALSE, # avoid warnings and messages in the output
message = FALSE,
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%",
width = 90, # line width for text output
dpi = 96
)

par(mar=c(3,3,1,1)+.1)
```

[![Lifecycle: stable](https://img.shields.io/badge/lifecycle-stable-brightgreen.svg)](https://www.tidyverse.org/lifecycle/#stable)
[![CRAN_Status_Badge](http://www.r-pkg.org/badges/version/Guerry)](https://cran.r-project.org/package=Guerry)
[![](https://cranlogs.r-pkg.org/badges/grand-total/Guerry)](https://cran.r-project.org/package=Guerry)
[![DOI](https://zenodo.org/badge/133678938.svg)](https://zenodo.org/badge/latestdoi/133678938)
[![Last Commit](https://img.shields.io/github/last-commit/friendly/Guerry)](https://github.com/friendly/Guerry)

# Guerry

**Version**: `r packageVersion("Guerry")`

The `Guerry` package comprises maps of France in 1830, multivariate data from A.-M. Guerry and others, and statistical and
graphic methods related to Guerry's *Moral Statistics of France*. The goal is to facilitate the exploration and
development of statistical and graphic methods for multivariate data in a geo-spatial context of historical interest.

The package stems from [Friendly (2007)](https://www.datavis.ca/papers/guerry-STS241.pdf). For a history of André-Michel Guerry and his work, see Friendly (2022), [_The Life and Work of André-Michel Guerry, Revisited_](https://www.datavis.ca/papers/guerryvie/GuerryLife2-SocSpectrum.pdf).

This figure shows a reproduction of six choropleth maps Guerry used to discuss
the relations among the main "moral variables". All of these are scaled so that
more is better. Guerry asked, do such patterns reflect simply individual behavior,
or are there laws to be discovered in his data?

## Installation

You can install Guerry from CRAN or the development version as follows:

| Version | Command |
|:------------|:------------------------------------------------|
| CRAN | `install.packages("Guerry")` |
| Devel | `remotes::install_github("friendly/Guerry")` |
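
In a script that has to run on a fresh machine, it can be convenient to install the package only when it is missing. The following is a minimal sketch using base R's `requireNamespace()`; the helper name `install_if_missing()` and the default mirror are illustrative choices, not part of the package.

```r
# Hypothetical helper: install a CRAN package only if it cannot be loaded.
# The `repos` default is an assumption; substitute any CRAN mirror.
install_if_missing <- function(pkg, repos = "https://cloud.r-project.org") {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg, repos = repos)
  }
  # Report (invisibly) whether the package is available afterwards
  invisible(requireNamespace(pkg, quietly = TRUE))
}
```

Calling `install_if_missing("Guerry")` and then `library(Guerry)` should leave the CRAN release attached.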

## Data sets

The Guerry package contains the following data sets:

| Name | Description |
|:-----|:------------|
| `gfrance` | Map of France in 1830 with the `Guerry` data. It is a `SpatialPolygonsDataFrame` object created with the `sp` package. |
| `gfrance85` | The same for the 85 departments excluding Corsica. |
| `Guerry` | A collection of 'moral variables' on the 86 departments of France around 1830 from Guerry (1833) and other sources. |
| `Angeville` | Data from d'Angeville (1836) on the population of France. |
| `propensity` | Distribution of crimes against persons at different ages. |
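
As a quick orientation, the data sets in the table can be listed and loaded from an R session. This is a sketch that assumes the package is already installed; it uses only the base `data()` function.

```r
library(Guerry)

# List the data sets shipped with the package (names and one-line titles)
data(package = "Guerry")$results[, c("Item", "Title")]

# Load the main data set and look at its size and first few columns
data(Guerry)
dim(Guerry)          # rows are departments, columns are variables
str(Guerry[, 1:6])
```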

## Examples

### Maps

In Guerry's time, the map of France and his data contained 86 departments.
The two base maps in this package are `gfrance` and `gfrance85`. They differ only in that Corsica,
outside the continental boundaries, is excluded in the latter.

These two datasets are `SpatialPolygonsDataFrame` objects constructed with the [`sp`](https://cran.r-project.org/package=sp)
package. This means they are S4 objects whose components are stored in slots and that have S4 methods. The two main slots are:

* `gfrance@polygons`: the polygon boundaries of the 1830 map of France
* `gfrance@data`: a data frame equivalent to the variables contained in the `Guerry` data set

```{r}
#| width=100
data(gfrance)
names(gfrance) # list the @data variables
```
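
To make the S4 structure concrete, the slots can be inspected directly. This is a sketch that assumes both `sp` and `Guerry` are installed; it relies only on base R's `isS4()` and `slotNames()` and on the `$` accessor that `sp` provides for its data-frame classes.

```r
library(sp)      # provides the SpatialPolygonsDataFrame class
library(Guerry)
data(gfrance)

isS4(gfrance)                 # TRUE for S4 objects
class(gfrance)
slotNames(gfrance)            # includes "data" and "polygons"

# The attribute table is an ordinary data frame with one row per polygon
attrs <- gfrance@data
nrow(attrs) == length(gfrance@polygons)

# `$` on the sp object reaches into @data, so these are the same column
identical(gfrance$Literacy, gfrance@data$Literacy)
```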

Thus, you can just use `plot(gfrance)` to plot the outlines of the departments:

```{r gfrance1}
library(sp)
plot(gfrance)
```

The `spplot` method produces a choropleth map, shaded by a given variable in `gfrance@data`:

```{r gfrance2}
spplot(gfrance, "Crime_pers")
```

You can plot the maps for several variables together simply by listing their names in a vector.

```{r gfrance3}
# plot several together
spplot(gfrance, c("Crime_pers", "Crime_prop", "Literacy" ),
layout=c(3,1), main="Guerry's moral variables")
```

But there's a problem here. `spplot` assumes all variables are on the same scale for comparative plots, so
it is best to transform the variables to ranks (as Guerry did).
It also helps to use something like Guerry's palette, where darker shades mean worse.

```{r gfrance4}
gfrance$Crime_pers <- rank(gfrance$Crime_pers)
gfrance$Crime_prop <- rank(gfrance$Crime_prop)
gfrance$Literacy <- rank(gfrance$Literacy)

my.palette <- rev(RColorBrewer::brewer.pal(n = 9, name = "PuBu"))
spplot(gfrance, c("Crime_pers", "Crime_prop", "Literacy" ),
names.attr = c("Personal crime", "Property crime", "Literacy"),
col.regions = my.palette, cuts = 8,
layout=c(3,1), as.table=TRUE, main="Guerry's moral variables")
```
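
Why ranks help can be seen on a small example. The sketch below uses made-up values (hypothetical, not taken from the Guerry data) to show that `rank()` maps every variable onto the same 1 to n scale, whatever its original units.

```r
# Hypothetical values for five departments, for illustration only
toy <- data.frame(
  crime    = c(30000, 12000, 21000, 8000, 26000),  # population per crime
  literacy = c(20, 55, 35, 70, 15)                 # percent literate
)

sapply(toy, range)            # raw ranges differ by orders of magnitude

# Replace each column by its ranks
ranked <- as.data.frame(lapply(toy, rank))
sapply(ranked, range)         # every column now runs from 1 to 5
```

With ties, `rank()` assigns average ranks by default, so the shared scale still holds.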

For other purposes, you might want to produce the map shaded by `Region`, with labels
for the names of the departments. This is illustrated using the `gfrance85` map (excluding Corsica), where
`coordinates()` gets the (X, Y) coordinates of the centroid of each department,
and `text()` draws the labels at those coordinates.

```{r gfrance85-labels}
#| echo=-1
op <- par(mar = rep(0.1, 4))
data(gfrance85)
# extract region and department names & assign colors
xy <- coordinates(gfrance85) # department centroids
dep.names <- gfrance85$Department
region.names <- gfrance85$Region
col.region <- colors()[c(149,254,468,552,26)] # one color per region

plot(gfrance85, col=col.region[region.names])
text(xy, labels=dep.names, cex=0.5)
par(op) # restore the previous margins
```
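
The expression `col.region[region.names]` works because indexing a vector by a factor uses the factor's integer codes. The following sketch shows the idiom on a toy factor; the level codes and color names are chosen for illustration only.

```r
# Toy region factor; the codes C, E, N, S, W are assumed for illustration
region <- factor(c("N", "S", "C", "W", "E", "N"))
palette5 <- c("gold", "skyblue", "tomato", "seagreen", "orchid")

levels(region)        # "C" "E" "N" "S" "W" (alphabetical by default)
as.integer(region)    # 3 4 1 5 2 3: position of each value among the levels
palette5[region]      # same as palette5[as.integer(region)]
```

The order of the palette therefore has to follow the order of `levels()`, not the order in which regions first appear in the data.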

### Plots

Guerry was most interested in determining whether the occurrence of crimes
was related to literacy or other "moral variables". But the idea of
correlation had not been invented, and he was not aware of the
idea of a scatterplot.

The following plots crimes against persons against literacy ("% who can read & write").
In this base R version, we code the point symbols
and colors by the regions of France.

```{r ex-bivar1}
#| out.width="60%",
#| fig.height = 5,
#| fig.width = 5,
#| echo=-1
par(mar=c(4,4,1,1)+.1)
data(Guerry)

plot(Crime_pers ~ Literacy, data=Guerry,
col=Region,
pch=(15:19)[Region],
ylab = "Pop. per crime against persons",
xlab = "Percent who can read & write"
)

legend(x="bottomright",
legend = c("Center", "East", "North", "South", "West"),
pch = 15:19,
col = seq_along(levels(Guerry$Region)))
```

Now try the same plot with a data ellipse and a regression line. This version also adds
a `loess` smooth and labels the 8 most outlying departments.

```{r ex-bivar2}
#| out.width="60%",
#| fig.height = 5,
#| fig.width = 5,
#| echo = -1
par(mar=c(4,4,1,1)+.1)
library(car)
with(Guerry,{
dataEllipse(Literacy, Crime_pers,
levels = 0.68,
ylim = c(0,40000), xlim = c(0, 80),
ylab="Pop. per crime against persons",
xlab="Percent who can read & write",
pch = 16,
grid = FALSE,
id = list(method="mahal",
n = 8, labels=Department, location="avoid", cex=1.2),
center.pch = 3, center.cex=5,
cex.lab=1.5)

dataEllipse(Literacy, Crime_pers,
levels = 0.95, add=TRUE,
ylim = c(0,40000), xlim = c(0, 80),
lwd=2, lty="longdash",
col="gray",
center.pch = FALSE
)

abline( lm(Crime_pers ~ Literacy), lwd=2)
lines(loess.smooth(Literacy, Crime_pers), col="red", lwd=3)
}
)
```
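
A numerical companion to the plot can be sketched in base R. The example below computes the correlation that Guerry lacked, plus the squared Mahalanobis distances that underlie outlier labelling of the `method="mahal"` kind. It approximates the idea and is not a reproduction of what `car` does internally; the object names are illustrative.

```r
library(Guerry)
data(Guerry)

# Keep the two plotted variables and drop any incomplete rows
xy <- Guerry[, c("Literacy", "Crime_pers")]
ok <- complete.cases(xy)
xy <- xy[ok, ]

# Linear and rank-based correlation between literacy and personal crime
r_pearson  <- cor(xy$Literacy, xy$Crime_pers)
r_spearman <- cor(xy$Literacy, xy$Crime_pers, method = "spearman")
c(pearson = r_pearson, spearman = r_spearman)

# Squared Mahalanobis distance of each department from the bivariate mean
d2 <- mahalanobis(xy, center = colMeans(xy), cov = cov(xy))

# The 8 departments farthest from the centroid
far <- order(d2, decreasing = TRUE)[1:8]
data.frame(Department = Guerry$Department[ok][far], D2 = round(d2[far], 2))
```

Departments with large distances are those lying farthest outside the data ellipses, taking the correlation between the two variables into account.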

## Vignettes

The vignette _Guerry data: Spatial Multivariate Analysis_, written by Stéphane Dray, uses his packages
`ade4` and `adegraphics` to illustrate methods for spatial multivariate data that focus on either
the multivariate aspect or the spatial one, as well as some more modern methods that integrate
the two simultaneously.

A new vignette, _Guerry data: Multivariate Analysis_, uses Guerry's data to illustrate some graphical
methods for multivariate visualization.

See:

``` r
vignette("MultiSpat", package="Guerry")
vignette("guerry-multivariate", package="Guerry")
```

## Citation

``` r
To cite package ‘Guerry’ in publications use:

Friendly M, Dray S (2021). _Guerry: Maps, Data and Methods Related to Guerry (1833) "Moral Statistics
of France"_. R package version 1.7.4, <https://CRAN.R-project.org/package=Guerry>.

A BibTeX entry for LaTeX users is

@Manual{,
title = {Guerry: Maps, Data and Methods Related to Guerry (1833) "Moral Statistics of France"},
author = {Michael Friendly and Stéphane Dray},
year = {2021},
note = {R package version 1.7.4},
url = {https://CRAN.R-project.org/package=Guerry},
}
```
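
The citation for the installed release can also be generated from R itself. This is a short sketch using base R's `citation()` and `toBibtex()`, assuming the package is installed; the version and year reported will be those of the installed release and may differ from the ones shown above.

```r
library(Guerry)

cit <- citation("Guerry")
print(cit)       # formatted text citation
toBibtex(cit)    # BibTeX entry for LaTeX users
```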

## References

Angeville, A. d' (1836).
_Essai sur la Statistique de la Population française_, Paris: F. Darfour.

Friendly, M. (2007). A.-M. Guerry's Moral Statistics of France: Challenges for Multivariable Spatial Analysis.
*Statistical Science*, **22**, 368-399. https://www.datavis.ca/papers/guerry-STS241.pdf

Friendly, M. (2007).
Supplementary materials for André-Michel Guerry's *Moral Statistics of France*:
Challenges for Multivariate Spatial Analysis,
https://www.datavis.ca/gallery/guerry/.

Friendly, M. (2022). The Life and Work of André-Michel Guerry, Revisited.
*Sociological Spectrum*, **42** (1).
https://www.tandfonline.com/doi/full/10.1080/02732173.2022.2078450.
[eprint](https://www.datavis.ca/papers/guerryvie/GuerryLife2-SocSpectrum.pdf)