{"id":13857803,"url":"https://github.com/kgjerde/corporaexplorer","last_synced_at":"2025-10-22T05:50:44.563Z","repository":{"id":35104671,"uuid":"176155039","full_name":"kgjerde/corporaexplorer","owner":"kgjerde","description":"An R package for dynamic exploration of text collections","archived":false,"fork":false,"pushed_at":"2024-09-11T11:27:17.000Z","size":6256,"stargazers_count":65,"open_issues_count":0,"forks_count":4,"subscribers_count":7,"default_branch":"master","last_synced_at":"2025-10-22T05:50:43.936Z","etag":null,"topics":["corpora","corpus","r","shiny","text-analysis"],"latest_commit_sha":null,"homepage":"https://kgjerde.github.io/corporaexplorer","language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/kgjerde.png","metadata":{"files":{"readme":"README.Rmd","changelog":"NEWS.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2019-03-17T20:13:43.000Z","updated_at":"2025-10-06T09:27:12.000Z","dependencies_parsed_at":"2024-09-11T15:11:39.045Z","dependency_job_id":"2e19eb4e-0aa0-4200-a74e-ecb026431366","html_url":"https://github.com/kgjerde/corporaexplorer","commit_stats":{"total_commits":525,"total_committers":1,"mean_commits":525.0,"dds":0.0,"last_synced_commit":"8e9d10b7ac923a56bd1e62757fda5373a7bfc0b9"},"previous_names":[],"tags_count":16,"template":false,"template_full_name":null,"purl":"pkg:github/kgjerde/corporaexplorer","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kgjerde%2Fcorporaexplorer","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kgjerde%2Fcorporaexplorer/tags","releases_url":"https://repos.e
cosyste.ms/api/v1/hosts/GitHub/repositories/kgjerde%2Fcorporaexplorer/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kgjerde%2Fcorporaexplorer/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/kgjerde","download_url":"https://codeload.github.com/kgjerde/corporaexplorer/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kgjerde%2Fcorporaexplorer/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":280389295,"owners_count":26322507,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-22T02:00:06.515Z","response_time":63,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["corpora","corpus","r","shiny","text-analysis"],"created_at":"2024-08-05T03:01:47.452Z","updated_at":"2025-10-22T05:50:44.535Z","avatar_url":"https://github.com/kgjerde.png","language":"R","funding_links":[],"categories":["R"],"sub_categories":[],"readme":"---\noutput:\n    github_document\n---\n\n\u003c!-- README.md is generated from README.Rmd. 
Please edit that file --\u003e\n\n```{r setup, include = FALSE}\nknitr::opts_chunk$set(\n  collapse = TRUE,\n  comment = \"#\u003e\",\n  fig.path = \"man/figures/README-\",\n  out.width = \"100%\"\n)\n```\n# corporaexplorer: An R package for dynamic exploration of text collections\n\n\u003c!-- badges: start --\u003e\n[![CRAN status](https://www.r-pkg.org/badges/version/corporaexplorer)](https://cran.r-project.org/package=corporaexplorer)\n[![License: GPL v3](https://img.shields.io/badge/License-GPLv3-blue.svg)](https://www.gnu.org/licenses/gpl-3.0)\n[![R build\nstatus](https://github.com/kgjerde/corporaexplorer/actions/workflows/check-standard.yaml/badge.svg)](https://github.com/kgjerde/corporaexplorer/actions)\n[![DOI](http://joss.theoj.org/papers/10.21105/joss.01342/status.svg)](https://doi.org/10.21105/joss.01342)\n[![Mentioned in Awesome R](https://awesome.re/mentioned-badge.svg)](https://github.com/qinwf/awesome-R)\n\u003c!-- badges: end --\u003e\n\n\n\u003e **\"I really like the application and its simplicity. It looks great and is very functional. ... 
a nice addition to text analysis tools.\"**  \n\u003e *--[Kenneth Benoit](https://github.com/kbenoit), creator of [quanteda](https://quanteda.io/), professor of computational social science at [LSE](https://www.lse.ac.uk/Methodology/People/Academic-Staff/Kenneth-Benoit/Kenneth-Benoit)*\n\n\u003e **\"I really enjoyed interacting with corporaexplorer.**\n\u003e **This is exciting work that opens up doors for non-technical users.\"**   \n\u003e *--[Tyler Rinker](https://github.com/trinker), creator of [sentimentr](https://github.com/trinker/sentimentr) and [qdap](https://github.com/trinker/qdap)*\n\n\u003c!-- HTML here, in order to add custom font colour in Github Pages--\u003e\n\u003cblockquote\u003e\n\u003cp style=\"color:green\"\u003e\u003cstrong\u003e– Featured in RStudio’s “R Views” blog’s \u003ca href=\"https://rviews.rstudio.com/2019/10/29/sept-2019-top-40-new-r-packages/\"\u003e\u003cstrong\u003e\u003ci\u003e\"Top 40 New R Packages”\u003c/i\u003e\u003c/strong\u003e\u003c/a\u003e\u003c/strong\u003e\u003c/p\u003e\n\u003c/blockquote\u003e\n\n\u003cblockquote\u003e\n\u003cp style=\"color:green\"\u003e\u003cstrong\u003e– Included in \u003ca href=\"https://CRAN.R-project.org/view=NaturalLanguageProcessing\"\u003e\u003ci\u003eCRAN Task View: Natural Language Processing\u003c/i\u003e\u003c/a\u003e\u003c/strong\u003e\u003c/p\u003e\n\u003c/blockquote\u003e\n\n\u003chr\u003e\n\u003cbr\u003e\n\n```{r, out.width = \"100%\", echo = FALSE}\nknitr::include_graphics(\"https://github.com/kgjerde/corporaexplorer/raw/master/man/figures/readme_illustration.png\")\n```\n\n*^Illustration^ ^screenshots^*\n\n\n## What is corporaexplorer?\n\n**corporaexplorer** is an R package that uses the `Shiny` graphical user interface framework for dynamic exploration of text collections.\n\n**corporaexplorer** is designed for use with a wide range of text collections; one example could be a collection of tens of thousands of documents scraped from a governmental website; another example could be 
the collected works of a novelist; a third example could be the chapters of a single book.\n\n**corporaexplorer**'s intended primary audience is qualitatively oriented researchers who\nrely on close reading of textual documents as part of their academic activity,\nbut the package should also be a useful supplement for those doing quantitative textual research and wishing to visit the texts under study.\nFinally, by offering a convenient way to explore any character vector, it can also be useful for a wide range of other R users.\n\nWhile collecting and preparing the text collections to be explored requires some familiarity with R programming, using the Shiny apps for exploring and extracting documents from the corpus should be fairly intuitive even for those with no programming knowledge, once the apps have been set up by a collaborator. Thus, the aim is for the package to be useful for anyone with a rudimentary knowledge of R -- or with collaborators who have such knowledge.\n\n\n## Installation\n\nTo install the released version from CRAN,\nsimply run the following from an R console:\n\n``` r\ninstall.packages(\"corporaexplorer\")\n```\n\nAlternatively,\nto install the development version from GitHub,\nrun the following from an R console:\n\n``` r\ninstall.packages(\"devtools\")\ndevtools::install_github(\"kgjerde/corporaexplorer\")\n```\n\n**corporaexplorer** works on Mac OS, Windows and Linux.\n(The Shiny apps look much clunkier on Windows than on the other platforms,\nbut the apps are fully functional.)\n\n\n## How to cite\n\nPlease cite the following paper if you use **corporaexplorer** in your research.\n\n\u003e Gjerde, Kristian Lundby. 2019. \"corporaexplorer: An R package for dynamic exploration of text collections.\" _Journal of Open Source Software_ 4 (38): 1342. 
[https://doi.org/10.21105/joss.01342](https://doi.org/10.21105/joss.01342).\n\nFor a BibTeX entry, use the output from `citation(package = \"corporaexplorer\")`.\n\n\n## Usage\n\nFor usage instructions and example corpora,\nsee the [package web page](https://kgjerde.github.io/corporaexplorer/).\n\n\n## Demo apps\n\nThe package includes two demo apps.\n\nTo explore Jane Austen's novels\n(data accessed through the\n[**janeaustenr**](https://github.com/juliasilge/janeaustenr) package):\n\n``` r\nlibrary(corporaexplorer)\nrun_janeausten_app()\n```\n\nTo explore the US presidents' State of the Union addresses\n(data accessed through the\n[**sotu**](https://CRAN.R-project.org/package=sotu) package):\n\n``` r\nlibrary(corporaexplorer)\nrun_sotu_app()\n```\n\nFor more info, see\nhttps://kgjerde.github.io/corporaexplorer/articles/jane_austen.html and\nhttps://kgjerde.github.io/corporaexplorer/articles/sotu.html,\nand also the [function references](https://kgjerde.github.io/corporaexplorer/reference/index.html).\n\n## A note on platforms and encoding\n\n**corporaexplorer** works on Mac OS, Windows and Linux,\nand there are some important differences in how R handles text\non the different platforms.\nIf you are working with plain English text,\nthere will most likely be no issues with encoding on any platform.\nUnfortunately, working with\nnon-[ASCII](https://en.wikipedia.org/wiki/ASCII) encoded text in R\n(e.g. non-English characters) *can* be complicated -- in particular on Windows. \n\n**On Mac OS or Linux**, problems with encoding will likely not arise at all.\nIf problems do arise, they can typically be solved by\nmaking the R \"locale\" Unicode-friendly (e.g. `Sys.setlocale(\"LC_ALL\", \"en_US.UTF-8\")`).\nNB! 
This assumes that the text is UTF-8 encoded,\nso if changing the locale in this way does not help,\nmake sure that the text is encoded as UTF-8 characters.\nAlternatively, if you can ascertain the character encoding, set the locale correspondingly.\n\n**On Windows**, things can be much more complicated.\nThe most important thing is to check carefully that\nthe texts appear as expected in `corporaexplorer`'s apps,\nand that the searches function as expected.\nIf there are problems, a good place to start is a\nblog post with the telling title\n[\"Escaping from character encoding hell in R on Windows\"](https://www.r-bloggers.com/2016/06/escaping-from-character-encoding-hell-in-r-on-windows/).\n\nFor (a lot) more information about encoding, see\n[this informative article](https://kunststube.net/encoding/)\nby David C. Zentgraf.\n\n\n## Contributing\n\nContributions in the form of feedback, bug reports and code are most welcome. Ways to contribute:\n\n*  Contact [me](mailto:klg@nupi.no) by email.\n*  Issues and bug reports: [File a GitHub issue](https://github.com/kgjerde/corporaexplorer/issues).\n*  Fork the source code, modify, and issue a [pull request](https://docs.github.com/articles/creating-a-pull-request-from-a-fork/) through the [project GitHub page](https://github.com/kgjerde/corporaexplorer).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkgjerde%2Fcorporaexplorer","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkgjerde%2Fcorporaexplorer","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkgjerde%2Fcorporaexplorer/lists"}