{"id":20388731,"url":"https://github.com/mkearney/cspan_data","last_synced_at":"2025-04-12T10:51:17.106Z","repository":{"id":124078310,"uuid":"123605398","full_name":"mkearney/cspan_data","owner":"mkearney","description":"A repo for tracking the number of followers of Congress, the Cabinet, and Governors","archived":false,"fork":false,"pushed_at":"2019-04-25T05:00:16.000Z","size":1931052,"stargazers_count":17,"open_issues_count":1,"forks_count":0,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-03-26T05:41:53.493Z","etag":null,"topics":["congress","dataset","governors","mkearney-dataset","r","rtweet","the-cabinet","twitter-api","twitter-data"],"latest_commit_sha":null,"homepage":"","language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mkearney.png","metadata":{"files":{"readme":"README.Rmd","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2018-03-02T16:50:01.000Z","updated_at":"2024-01-03T00:51:00.000Z","dependencies_parsed_at":null,"dependency_job_id":"bb053568-bfa1-4891-b220-7381da0bc4d7","html_url":"https://github.com/mkearney/cspan_data","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mkearney%2Fcspan_data","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mkearney%2Fcspan_data/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mkearney%2Fcspan_data/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mkearney%2Fcspan_data/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mkearney","downloa
d_url":"https://codeload.github.com/mkearney/cspan_data/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248557844,"owners_count":21124165,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["congress","dataset","governors","mkearney-dataset","r","rtweet","the-cabinet","twitter-api","twitter-data"],"created_at":"2024-11-15T03:13:06.258Z","updated_at":"2025-04-12T10:51:17.087Z","avatar_url":"https://github.com/mkearney.png","language":"R","funding_links":[],"categories":[],"sub_categories":[],"readme":"---\noutput: github_document\n---\n\n```{r setup, include=FALSE}\nknitr::opts_chunk$set(echo = TRUE, collapse = TRUE, comment = \"#\u003e\")\n```\n\n## cspan_data\n\nTracking users-level data of (a) [Members of Congress](https://twitter.com/cspan/lists/members-of-congress), \n(b) [The Cabinet](https://twitter.com/cspan/lists/the-cabinet/), and (c) [Governors](https://twitter.com/cspan/lists/governors) \nusing CSPAN Twitter lists and the [rtweet package](http://rtweet.info).\n\n## \\#dataviz\n\n### Members of Congress\n\n\u003cp align=\"center\"\u003e\u003cimg width=\"75%\" height=\"auto\" src=\"plots/members-of-congress.png\" /\u003e\u003c/p\u003e\n\n\u0026nbsp;\n\n### The Cabinet\n\n\u003cp align=\"center\"\u003e\u003cimg width=\"75%\" height=\"auto\" src=\"plots/the-cabinet.png\" /\u003e\u003c/p\u003e\n\n\u0026nbsp;\n\n### Governors\n\n\u003cp align=\"center\"\u003e\u003cimg width=\"75%\" height=\"auto\" src=\"plots/governors.png\" /\u003e\u003c/p\u003e\n\n\u0026nbsp;\n\n## Data collection script\n\nData collected 
using [rtweet](http://rtweet.info)\n\n```{r, eval=FALSE}\n## load rtweet (purrr and dplyr functions are called via ::)\nlibrary(rtweet)\n\n## define function for getting CSPAN Twitter lists data\nget_cspan_list \u003c- function(slug) {\n  ## get users data of list members\n  x \u003c- lists_members(slug = slug, owner_user = \"CSPAN\")\n  ## document slug\n  x$cspan_list \u003c- slug\n  ## timestamp observations\n  x$timestamp \u003c- Sys.time()\n  ## return data\n  x\n}\n\n## cspan lists\ncspan_lists \u003c- c(\"members-of-congress\", \"the-cabinet\", \"governors\")\n\n## get users data for each list\ncspan_data \u003c- purrr::map(cspan_lists, get_cspan_list)\n\n## merge into single data frame\ncspan_data \u003c- dplyr::bind_rows(cspan_data)\n```\n\n## Data visualization script\n\nPlots created using [ggplot2](http://ggplot2.org/) and [ggrepel](https://github.com/slowkow/ggrepel)\n\n```{r, eval=FALSE}\n## load tidyverse\nsuppressPackageStartupMessages(library(tidyverse))\n\n## read all files\ndata_files \u003c- list.files(\"data\", full.names = TRUE)\ncspan_data \u003c- map(data_files, readRDS)\n\n## merge into single data set\ncspan_data \u003c- bind_rows(cspan_data)\n\n## shortcuts for subsetting into data sets\ncongress_data \u003c- function(cspan_data) filter(\n  cspan_data, cspan_list == \"members-of-congress\")\ncabinet_data \u003c- function(cspan_data) filter(\n  cspan_data, cspan_list == \"the-cabinet\")\ngovernors_data \u003c- function(cspan_data) filter(\n  cspan_data, cspan_list == \"governors\")\n\n## load ggrepel for non-overlapping plot labels\nlibrary(ggrepel)\n\n## hacky function for labels\ntimestamp_range \u003c- function(timestamp) {\n  n \u003c- length(unique(timestamp))\n  x \u003c- seq(min(timestamp), max(timestamp), length.out = (length(timestamp) / n))\n  nas \u003c- rep(as.POSIXct(NA_character_), length(x))\n  c(x, rep(nas, n - 1L))\n}\n\n## members of congress (accounts with more than 300,000 followers)\ncspan_data %\u003e%\n  filter(followers_count \u003e 3e5) %\u003e%\n  congress_data() %\u003e%\n  mutate(followers_count = 
log10(followers_count)) %\u003e%\n  arrange(timestamp) %\u003e%\n  mutate(x = timestamp_range(timestamp)) %\u003e%\n  group_by(screen_name) %\u003e%\n  mutate(mean = mean(followers_count)) %\u003e%\n  ungroup() %\u003e%\n  ggplot(aes(x = timestamp, y = followers_count, colour = screen_name, label = screen_name)) +\n  theme_mwk(base_family = \"Roboto Condensed\") +\n  theme(legend.position = \"none\") +\n  geom_line() +\n#  geom_point() +\n  geom_label_repel(aes(x = x, y = mean), family = \"Roboto Condensed\") +\n  labs(title = \"Tracking follower counts for members of Congress on Twitter\",\n    subtitle = \"Tracking the number of Twitter followers of members of Congress over time\",\n    x = NULL, y = \"Number of followers (logged)\",\n    caption = \"\\nSource: Data collected via Twitter's REST API using rtweet (http://rtweet.info)\")\n\n## save the last plot (adding ggsave() with + fails in newer ggplot2 versions)\nggsave(\"plots/members-of-congress.png\", width = 7, height = 13, units = \"in\")\n\n## cabinet members\ncspan_data %\u003e%\n  cabinet_data() %\u003e%\n  mutate(followers_count = log10(followers_count)) %\u003e%\n  arrange(timestamp) %\u003e%\n  mutate(x = timestamp_range(timestamp)) %\u003e%\n  group_by(screen_name) %\u003e%\n  mutate(mean = mean(followers_count)) %\u003e%\n  ungroup() %\u003e%\n  ggplot(aes(x = timestamp, y = followers_count, colour = screen_name, label = screen_name)) +\n  theme_mwk(base_family = \"Roboto Condensed\") +\n  theme(legend.position = \"none\") +\n  geom_line() +\n#  geom_point() +\n  geom_label_repel(aes(x = x, y = mean), family = \"Roboto Condensed\") +\n  labs(title = \"Tracking follower counts for Cabinet members on Twitter\",\n    subtitle = \"Tracking the number of Twitter followers of members of the Cabinet over time\",\n    x = NULL, y = \"Number of followers (logged)\",\n    caption = \"\\nSource: Data collected via Twitter's REST API using rtweet (http://rtweet.info)\")\n\n## save the last plot (adding ggsave() with + fails in newer ggplot2 versions)\nggsave(\"plots/the-cabinet.png\", width = 7, height = 13, units = \"in\")\n\n## governors\ncspan_data 
%\u003e%\n  governors_data() %\u003e%\n  mutate(followers_count = log10(followers_count)) %\u003e%\n  arrange(timestamp) %\u003e%\n  mutate(x = timestamp_range(timestamp)) %\u003e%\n  group_by(screen_name) %\u003e%\n  mutate(mean = mean(followers_count)) %\u003e%\n  ungroup() %\u003e%\n  ggplot(aes(x = timestamp, y = followers_count, colour = screen_name, label = screen_name)) +\n  theme_mwk(base_family = \"Roboto Condensed\") +\n  theme(legend.position = \"none\") +\n  geom_line() +\n#  geom_point() +\n  geom_label_repel(aes(x = x, y = mean), family = \"Roboto Condensed\") +\n  labs(title = \"Tracking follower counts for U.S. Governors on Twitter\",\n    subtitle = \"Tracking the number of Twitter followers of Governors over time\",\n    x = NULL, y = \"Number of followers (logged)\",\n    caption = \"\\nSource: Data collected via Twitter's REST API using rtweet (http://rtweet.info)\")\n\n## save the last plot (adding ggsave() with + fails in newer ggplot2 versions)\nggsave(\"plots/governors.png\", width = 7, height = 9, units = \"in\")\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmkearney%2Fcspan_data","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmkearney%2Fcspan_data","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmkearney%2Fcspan_data/lists"}