{"id":19487150,"url":"https://github.com/tylerlittlefield/startrek","last_synced_at":"2025-04-25T18:32:23.019Z","repository":{"id":182545130,"uuid":"189504098","full_name":"tylerlittlefield/startrek","owner":"tylerlittlefield","description":"Tidy Star Trek Transcripts (TNG \u0026 DS9)","archived":false,"fork":false,"pushed_at":"2019-06-02T01:53:03.000Z","size":28802,"stargazers_count":4,"open_issues_count":0,"forks_count":1,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-04-04T01:32:09.530Z","etag":null,"topics":["r","rstats","startrek","transcripts"],"latest_commit_sha":null,"homepage":"","language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/tylerlittlefield.png","metadata":{"files":{"readme":"README.Rmd","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2019-05-31T01:07:27.000Z","updated_at":"2023-05-18T16:18:56.000Z","dependencies_parsed_at":"2023-07-20T12:30:52.854Z","dependency_job_id":null,"html_url":"https://github.com/tylerlittlefield/startrek","commit_stats":null,"previous_names":["tylerlittlefield/startrek"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tylerlittlefield%2Fstartrek","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tylerlittlefield%2Fstartrek/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tylerlittlefield%2Fstartrek/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tylerlittlefield%2Fstartrek/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/tylerlittlefield","download_url":"https://codeload
.github.com/tylerlittlefield/startrek/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":250872313,"owners_count":21500798,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["r","rstats","startrek","transcripts"],"created_at":"2024-11-10T20:44:13.498Z","updated_at":"2025-04-25T18:32:20.548Z","avatar_url":"https://github.com/tylerlittlefield.png","language":"R","funding_links":[],"categories":[],"sub_categories":[],"readme":"---\noutput: github_document\n---\n\n\u003c!-- README.md is generated from README.Rmd. Please edit that file --\u003e\n\n```{r, include = FALSE}\nknitr::opts_chunk$set(\n  collapse = TRUE,\n  comment = \"#\u003e\",\n  fig.path = \"man/figures/README-\",\n  out.width = \"100%\"\n)\n\npkg_size \u003c- function(package) {\n  root \u003c- find.package(package)\n  rel_paths \u003c- list.files(root, all.files = TRUE, recursive = TRUE)\n  abs_paths \u003c- file.path(root, rel_paths)\n  paste0(round(sum(file.info(abs_paths)$size) / 1e6, 2), \" MB\")\n}\n```\n# startrek \u003cimg src=\"man/figures/logo.png\" align=\"right\" height=150/\u003e\n\u003c!-- badges: start --\u003e\n[![Travis build status](https://travis-ci.org/tyluRp/startrek.svg?branch=master)](https://travis-ci.org/tyluRp/startrek)\n[![AppVeyor build status](https://ci.appveyor.com/api/projects/status/github/tyluRp/startrek?branch=master\u0026svg=true)](https://ci.appveyor.com/project/tyluRp/startrek)\n\u003c!-- badges: end --\u003e\n\nThe goal of startrek is to access Star Trek transcripts in a 
[`data.frame`](https://stat.ethz.ch/R-manual/R-devel/library/base/html/data.frame.html) for easy analysis. All transcripts have been parsed from text files to a [tidy data](http://vita.had.co.nz/papers/tidy-data.html) format. \n\n```{r, echo=FALSE, dpi=300, message=FALSE, warning=FALSE}\nlibrary(startrek)\nlibrary(tibble)\nlibrary(dplyr)\nlibrary(tidyr)\nlibrary(tidytext)\nlibrary(ggplot2)\n\nset.seed(42)\n\nbind_rows(sample(tng, 4), .id = \"episode\") %\u003e% \n  unnest_tokens(word, line) %\u003e% \n  anti_join(get_stopwords()) %\u003e% \n  inner_join(get_sentiments(\"bing\"), by = \"word\") %\u003e% \n  count(episode, index = id %/% 40, sentiment) %\u003e% \n  spread(sentiment, n, fill = 0) %\u003e% \n  mutate(\n    sentiment = positive - negative,\n    color = ifelse(sentiment \u003c= 0, \"a\", \"b\")\n    ) %\u003e% \n  ggplot(aes(index, sentiment, fill = color)) +\n  geom_col(show.legend = FALSE) +\n  geom_hline(yintercept = 0) +\n  facet_wrap(~episode, ncol = 2, scales = \"free_x\") +\n  theme_bw() +\n  theme(\n    text = element_text(family = \"SFProText-Regular\"),\n    panel.grid = element_blank()\n  )\n```\n\n\n## Installation\n\nKeep in mind that this is a data package which stores the data locally. There aren't any functions which scrape data from a reliable source. As of now, the size of this package is ~`r pkg_size(\"startrek\")`. \n\nIf the size isn't a concern, you can install the development version from GitHub:\n\n``` r\ndevtools::install_github(\"tylurp/startrek\")\n```\n\nOr, download the data to disk from the data folder in this repository.\n\n## Example\n\nTo access an episode transcript from The Next Generation series, see the `tng` list:\n\n```{r example, message=FALSE}\nlibrary(startrek)\nlibrary(tibble)\nlibrary(dplyr)\nlibrary(tidyr)\n\ntng$`The Inner Light`\n```\n\nOr access the entire series and play with the data in creative ways. 
For example, we might infer character-specific episodes by counting the number of lines each character has in each episode:\n\n```{r}\ntng %\u003e% \n  bind_rows(.id = \"episode\") %\u003e% \n  select(episode, everything()) %\u003e% \n  group_by(episode) %\u003e% \n  count(character, sort = TRUE)\n```\n\nThe Deep Space Nine series is also available:\n\n```{r}\nds9$Chimera\n```\n\nIf you want both datasets together, one approach might be to create a nested data frame:\n\n```{r}\nall_episodes \u003c- function(.data, series_name) {\n  .data %\u003e% \n    bind_rows(.id = \"episode\") %\u003e% \n    mutate(series = series_name) %\u003e% \n    select(series, everything())\n}\n\ntng_all \u003c- all_episodes(tng, \"TNG\")\nds9_all \u003c- all_episodes(ds9, \"DS9\")\n\nbind_rows(tng_all, ds9_all) %\u003e% \n  group_by(series, episode) %\u003e% \n  nest() \n```\n\nThe columns are arranged to be read from left to right or, when using `glimpse()`, from top to bottom. For example:\n\n```{r}\nds9$Chimera %\u003e% \n  .[5, ] %\u003e% \n  glimpse()\n```\n\nThe raw text files were parsed using the scripts found in the data-raw folder of this repository. Below is a visual explanation:\n\n```{r parse_visual}\nds9$Emissary %\u003e% \n  .[26, ] %\u003e% \n  glimpse()\n```\n\n```{r echo=FALSE, out.width=\"550px\"}\nknitr::include_graphics(\"man/figures/parse-diagram.png\")\n```\n\n## Acknowledgements\n\n* Transcripts were taken from [Star Trek Minutiae](http://www.st-minutiae.com/resources/scripts/)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftylerlittlefield%2Fstartrek","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftylerlittlefield%2Fstartrek","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftylerlittlefield%2Fstartrek/lists"}