Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/kbenoit/sophistication

R package associated with Benoit, Munger and Spirling (2017) paper(s)
https://github.com/kbenoit/sophistication

Last synced: 16 days ago
JSON representation

R package associated with Benoit, Munger and Spirling (2017) paper(s)

Awesome Lists containing this project

README

        

---
output:
md_document:
variant: markdown_github
---

```{r echo = FALSE}
knitr::opts_chunk$set(collapse = TRUE, comment = "##")
```

[![CRAN Version](http://www.r-pkg.org/badges/version/sophistication)](http://cran.r-project.org/package=sophistication)
[![R build status](https://github.com/kbenoit/sophistication/workflows/R-CMD-check/badge.svg)](https://github.com/kbenoit/sophistication/actions)
[![Coverage Status](https://img.shields.io/codecov/c/github/kbenoit/sophistication/master.svg)](https://codecov.io/github/kbenoit/sophistication?branch=master)

## Code for use in measuring the sophistication of political text

"Measuring and Explaining Political Sophistication Through Textual Complexity" by Kenneth Benoit, Kevin Munger, and Arthur Spirling. This package is built on [**quanteda**](http://quanteda.io).

### How to install

Using the **devtools** package:
```{r, eval = FALSE}
devtools::install_github("kbenoit/sophistication")
```

If you have trouble with your **sophistication** installation using **devtools**, check that you have pre-installed **conda** or **miniconda** and are using the correct version of **spacyr**. Try installing **sophistication** with the following steps:

``` {r, eval = FALSE}
devtools::install_github("quanteda/spacyr", build_vignettes = FALSE)
library("spacyr")
spacy_install()
spacy_initialize()
devtools::install_github("kbenoit/sophistication")
```

For more information please see the **spacyr** documentation here: https://cran.r-project.org/web/packages/spacyr/readme/README.html .
### Included Data

new name | original name | description
--------------| -------- | -------
`data_corpus_fifthgrade` | `fifthCorpus` | Fifth-grade reading texts
`data_corpus_crimson` | `crimsonCorpus` | Editorials from the Harvard *Crimson*
`data_corpus_partybroadcast` | `partybcastCorpus` | UK political party broadcasts
`data_corpus_presdebates` | `presDebateCorpus` | US presidential debates 2016

### How to use

```{r}
library("sophistication")

# make the snipepts of one sentence, between 100-350 chars in length
data(data_corpus_sotu, package = "quanteda.corpora")
snippetData <- snippets_make(data_corpus_sotu, nsentence = 1, minchar = 150, maxchar = 250)
# clean up the snippets
snippetData <- snippets_clean(snippetData)

# randomly sample three snippets
set.seed(10)
testData <- snippetData[sample(1:nrow(snippetData), 5), ]

# generate pairs for a minimum spanning tree
(snippetPairsMST <- pairs_regular_make(testData))
```

We can also use the package function to generate "gold" questions based on readability differences:

```{r}
# make a lot of candidate pairs
snippetPairsAll <- pairs_regular_make(snippetData[sample(1:nrow(snippetData), 1000), ])
# make 10 gold from these
pairs_gold_make(snippetPairsAll, n.pairs = 10)
```

There is a lot more than this, of course. Our documentation will improve as we develop the package with an aim to eventual CRAN release.