https://github.com/quanteda/quanteda.corpora
A collection of corpora for quanteda
- Host: GitHub
- URL: https://github.com/quanteda/quanteda.corpora
- Owner: quanteda
- Created: 2017-12-19T15:19:05.000Z (over 7 years ago)
- Default Branch: master
- Last Pushed: 2020-11-09T06:10:16.000Z (over 4 years ago)
- Last Synced: 2025-03-21T02:21:23.916Z (about 2 months ago)
- Topics: quanteda, text-analysis
- Language: R
- Homepage:
- Size: 200 MB
- Stars: 19
- Watchers: 7
- Forks: 5
- Open Issues: 3
# Corpora for quanteda
Package to provide easy access to large corpora for [**quanteda**](http://github.com/quanteda/quanteda).
## How to Install
You can download the files and build the package from source, or install it directly from GitHub with the devtools package:
```r
devtools::install_github("quanteda/quanteda.corpora")
```
## Available corpora
Corpora contained in the package are the following:
Corpus | Name
--|--
Amicus curiae briefs from Bakke (1978) and Bollinger (2008) | data_corpus_amicus
Annual budget speeches from the Irish Dáil, 2008-2012 | data_corpus_irishbudgets
UK news articles from 2014 that mention immigration | data_corpus_immigrationnews
Movie reviews from Pang, Lee, and Vaithyanathan (2002) | _moved to_ **quanteda.textmodels**
US State of the Union addresses from 1790 to present | data_corpus_sotu
UK political party manifestos, 1945-2005 | data_corpus_ukmanifestos
UN General Debate speeches, 2017 | data_corpus_ungd2017
Universal Declaration of Human Rights in 464 languages | data_corpus_udhr

Larger corpora are also available from online locations using `download()`:
Corpus | Name
--|--
_Guardian_ newspaper articles in politics, economy, society and international sections from 2012 to 2016 | data_corpus_guardian
Transcripts of speeches at Japan's Committee on Foreign Affairs and Defense of the lower house (Shugiin) from 1947 to 2017 | data_corpus_foreignaffairscommittee
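
As a rough usage sketch (not taken from the package documentation), the two kinds of corpora listed above might be accessed as follows. It assumes that **quanteda** and **quanteda.corpora** are already installed, that the bundled objects are available once the package is attached, and that `download()` accepts a corpus name from the second table as its first argument. The variable names `corp_sotu` and `corp_guardian` are illustrative only.

```r
# Sketch only: assumes quanteda and quanteda.corpora are already installed.
library("quanteda")
library("quanteda.corpora")

# Bundled corpora are ordinary data objects, referenced by the names
# in the first table.
corp_sotu <- data_corpus_sotu
ndoc(corp_sotu)            # number of documents in the corpus
head(docvars(corp_sotu))   # first rows of the document-level variables

# Larger corpora are fetched on demand, so this call needs network access.
# Passing the name as the first argument is an assumption; check ?download.
corp_guardian <- quanteda.corpora::download("data_corpus_guardian")
ndoc(corp_guardian)
```

The downloaded object can then be passed to the usual **quanteda** functions in the same way as a bundled corpus.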