Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/mkearney/cspan_data
A repo for tracking the number of followers of Congress, the Cabinet, and Governors
congress dataset governors mkearney-dataset r rtweet the-cabinet twitter-api twitter-data
Last synced: 3 months ago
- Host: GitHub
- URL: https://github.com/mkearney/cspan_data
- Owner: mkearney
- Created: 2018-03-02T16:50:01.000Z (almost 7 years ago)
- Default Branch: master
- Last Pushed: 2019-04-25T05:00:16.000Z (almost 6 years ago)
- Last Synced: 2023-10-20T21:51:08.016Z (over 1 year ago)
- Topics: congress, dataset, governors, mkearney-dataset, r, rtweet, the-cabinet, twitter-api, twitter-data
- Language: R
- Homepage:
- Size: 1.8 GB
- Stars: 16
- Watchers: 4
- Forks: 0
- Open Issues: 1
Metadata Files:
- Readme: README.Rmd
README
---
output: github_document
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE, collapse = TRUE, comment = "#>")
```

## cspan_data
Tracking user-level data of (a) [Members of Congress](https://twitter.com/cspan/lists/members-of-congress),
(b) [The Cabinet](https://twitter.com/cspan/lists/the-cabinet/), and (c) [Governors](https://twitter.com/cspan/lists/governors)
using CSPAN Twitter lists and the [rtweet package](http://rtweet.info).

## \#dataviz
### Members of Congress
### The Cabinet
### Governors
## Data collection script
Data collected using [rtweet](http://rtweet.info)
```{r, eval=FALSE}
## load rtweet (purrr and dplyr are called with `::` below)
library(rtweet)

## define function for getting CSPAN Twitter lists data
get_cspan_list <- function(slug) {
  ## get users data of list members
  x <- lists_members(slug = slug, owner_user = "CSPAN")
  ## document slug
  x$cspan_list <- slug
  ## timestamp observations
  x$timestamp <- Sys.time()
  ## return data
  x
}

## cspan lists
cspan_lists <- c("members-of-congress", "the-cabinet", "governors")

## get users data for each list
cspan_data <- purrr::map(cspan_lists, get_cspan_list)

## merge into single data frame
cspan_data <- dplyr::bind_rows(cspan_data)
```

## Data visualization script
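The script in this section reads every file under `data/` with `readRDS()`, so each run of the collection script has to write its snapshot there. That saving step is not shown in this README. The following is a minimal sketch, assuming a hypothetical helper `save_snapshot()` and a timestamp-based file name; neither is taken from the repository.

```r
## hypothetical helper (not from the repository): write one snapshot to disk
save_snapshot <- function(x, dir = "data", time = Sys.time()) {
  ## create the output directory on first use
  if (!dir.exists(dir)) dir.create(dir, recursive = TRUE)
  ## build a sortable file name from the time (naming scheme is an assumption)
  stamp <- format(time, "%Y%m%d-%H%M%S", tz = "UTC")
  path <- file.path(dir, paste0("cspan-data-", stamp, ".rds"))
  ## save as RDS so the file can be read back with readRDS()
  saveRDS(x, path)
  ## return the file path invisibly
  invisible(path)
}

## usage, after building `cspan_data` with the collection script:
## save_snapshot(cspan_data)
```

Writing one file per run keeps earlier snapshots intact, which is what lets the plots show follower counts over time.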
Plots created using [ggplot2](http://ggplot2.org/) and [ggrepel](https://github.com/slowkow/ggrepel)
```{r, eval=FALSE}
## load tidyverse
suppressPackageStartupMessages(library(tidyverse))

## read all files
data_files <- list.files("data", full.names = TRUE)
cspan_data <- map(data_files, readRDS)

## merge into single data set
cspan_data <- bind_rows(cspan_data)

## shortcuts for subsetting into data sets
congress_data <- function(cspan_data) filter(
  cspan_data, cspan_list == "members-of-congress")
cabinet_data <- function(cspan_data) filter(
  cspan_data, cspan_list == "the-cabinet")
governors_data <- function(cspan_data) filter(
  cspan_data, cspan_list == "governors")

## plot most popular congress accounts
library(ggrepel)

## hacky function for labels: returns one x position per account, spread
## evenly over the observed time range, padded with NA to the input length
## (expects rows sorted by timestamp and each account in every snapshot)
timestamp_range <- function(timestamp) {
  n <- length(unique(timestamp))
  x <- seq(min(timestamp), max(timestamp), length.out = (length(timestamp) / n))
  nas <- rep(as.POSIXct(NA_character_), length(x))
  c(x, rep(nas, n - 1L))
}

## member of congress
congress_plot <- cspan_data %>%
  filter(followers_count > 3e5) %>%
  congress_data() %>%
  mutate(followers_count = log10(followers_count)) %>%
  arrange(timestamp) %>%
  mutate(x = timestamp_range(timestamp)) %>%
  group_by(screen_name) %>%
  mutate(mean = mean(followers_count)) %>%
  ungroup() %>%
  ggplot(aes(x = timestamp, y = followers_count, colour = screen_name, label = screen_name)) +
  ## theme_mwk() is not defined in this script; it must be loaded separately
  theme_mwk(base_family = "Roboto Condensed") +
  theme(legend.position = "none") +
  geom_line() +
  # geom_point() +
  geom_label_repel(aes(x = x, y = mean), family = "Roboto Condensed") +
  labs(title = "Tracking follower counts for members of Congress on Twitter",
    subtitle = "Tracking the number of Twitter followers of members of the Congress over time",
    x = NULL, y = "Number of followers (logged)",
    caption = "\nSource: Data collected via Twitter's REST API using rtweet (http://rtweet.info)")

## save the plot; newer ggplot2 versions reject `+ ggsave()`
ggsave("plots/members-of-congress.png", plot = congress_plot, width = 7, height = 13, units = "in")

## cabinet members
cabinet_plot <- cspan_data %>%
  cabinet_data() %>%
  mutate(followers_count = log10(followers_count)) %>%
  arrange(timestamp) %>%
  mutate(x = timestamp_range(timestamp)) %>%
  group_by(screen_name) %>%
  mutate(mean = mean(followers_count)) %>%
  ungroup() %>%
  ggplot(aes(x = timestamp, y = followers_count, colour = screen_name, label = screen_name)) +
  theme_mwk(base_family = "Roboto Condensed") +
  theme(legend.position = "none") +
  geom_line() +
  # geom_point() +
  geom_label_repel(aes(x = x, y = mean), family = "Roboto Condensed") +
  labs(title = "Tracking follower counts for Cabinet members on Twitter",
    subtitle = "Tracking the number of Twitter followers of members of the Cabinet over time",
    x = NULL, y = "Number of followers (logged)",
    caption = "\nSource: Data collected via Twitter's REST API using rtweet (http://rtweet.info)")

## save the plot; newer ggplot2 versions reject `+ ggsave()`
ggsave("plots/the-cabinet.png", plot = cabinet_plot, width = 7, height = 13, units = "in")

## governors
governors_plot <- cspan_data %>%
  governors_data() %>%
  mutate(followers_count = log10(followers_count)) %>%
  arrange(timestamp) %>%
  mutate(x = timestamp_range(timestamp)) %>%
  group_by(screen_name) %>%
  mutate(mean = mean(followers_count)) %>%
  ungroup() %>%
  ggplot(aes(x = timestamp, y = followers_count, colour = screen_name, label = screen_name)) +
  theme_mwk(base_family = "Roboto Condensed") +
  theme(legend.position = "none") +
  geom_line() +
  # geom_point() +
  geom_label_repel(aes(x = x, y = mean), family = "Roboto Condensed") +
  labs(title = "Tracking follower counts for U.S. Governors on Twitter",
    subtitle = "Tracking the number of Twitter followers of Governors over time",
    x = NULL, y = "Number of followers (logged)",
    caption = "\nSource: Data collected via Twitter's REST API using rtweet (http://rtweet.info)")

## save the plot; newer ggplot2 versions reject `+ ggsave()`
ggsave("plots/governors.png", plot = governors_plot, width = 7, height = 9, units = "in")
```
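The `timestamp_range()` helper in the script above is terse, so a toy run may help show what it returns. As far as the code shows, it relies on the rows being sorted by `timestamp` and on every account appearing in every snapshot; otherwise the output length would not match the input. The timestamps below are made up for illustration.

```r
## helper copied from the visualization script
timestamp_range <- function(timestamp) {
  n <- length(unique(timestamp))
  x <- seq(min(timestamp), max(timestamp), length.out = (length(timestamp) / n))
  nas <- rep(as.POSIXct(NA_character_), length(x))
  c(x, rep(nas, n - 1L))
}

## toy input (made up): 3 accounts observed in 2 snapshots, sorted by time
snapshots <- as.POSIXct(c("2018-03-01 12:00:00", "2018-03-03 12:00:00"), tz = "UTC")
timestamp <- rep(snapshots, each = 3)

## one label position per account, then NA for the remaining rows
label_x <- timestamp_range(timestamp)

## label_x has 6 entries: three evenly spaced times (March 1, 2, and 3 at
## 12:00 UTC) followed by three NAs
print(label_x)
```

Because only the first block of rows gets a non-missing `x`, each account ends up with a single label, and the labels are staggered across the time axis instead of stacking at one edge. Rows with a missing `x` are normally dropped by ggplot2 with a warning.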