{"id":13857767,"url":"https://github.com/ropensci/jstor","last_synced_at":"2025-10-22T04:46:29.909Z","repository":{"id":48276520,"uuid":"114109118","full_name":"ropensci/jstor","owner":"ropensci","description":"Import journal data from DfR (JSTOR)","archived":false,"fork":false,"pushed_at":"2025-04-29T13:31:17.000Z","size":6437,"stargazers_count":47,"open_issues_count":17,"forks_count":10,"subscribers_count":4,"default_branch":"main","last_synced_at":"2025-10-17T23:30:52.173Z","etag":null,"topics":["jstor","peer-reviewed","r","r-package","rstats","text-analysis","text-mining"],"latest_commit_sha":null,"homepage":"https://docs.ropensci.org/jstor","language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ropensci.png","metadata":{"files":{"readme":"README.Rmd","changelog":"NEWS.md","contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2017-12-13T10:48:30.000Z","updated_at":"2025-10-06T09:23:13.000Z","dependencies_parsed_at":"2025-06-02T19:33:06.530Z","dependency_job_id":null,"html_url":"https://github.com/ropensci/jstor","commit_stats":{"total_commits":743,"total_committers":6,"mean_commits":"123.83333333333333","dds":"0.018842530282637937","last_synced_commit":"4bc5be455a9482f921c2b6172660f759bd8a3d31"},"previous_names":[],"tags_count":9,"template":false,"template_full_name":null,"purl":"pkg:github/ropensci/jstor","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ropensci%2Fjstor","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ropensci%2Fjstor/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ropensci%2Fjsto
r/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ropensci%2Fjstor/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ropensci","download_url":"https://codeload.github.com/ropensci/jstor/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ropensci%2Fjstor/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":280360116,"owners_count":26317437,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-21T02:00:06.614Z","response_time":58,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["jstor","peer-reviewed","r","r-package","rstats","text-analysis","text-mining"],"created_at":"2024-08-05T03:01:46.390Z","updated_at":"2025-10-22T04:46:29.585Z","avatar_url":"https://github.com/ropensci.png","language":"R","funding_links":[],"categories":["R"],"sub_categories":[],"readme":"---\noutput: github_document\n---\n\n\u003c!-- README.md is generated from README.Rmd. 
Please edit that file --\u003e\n\n```{r setup, include = FALSE}\nknitr::opts_chunk$set(\n  collapse = TRUE,\n  comment = \"#\u003e\",\n  fig.path = \"man/figures/README-\",\n  out.width = \"100%\"\n)\n```\n# jstor: Import and Analyse Data from Scientific Articles\n\n**Author:** [Thomas Klebel](https://thomasklebel.eu) \u003cbr\u003e\n**License:** [GPL v3.0](https://www.gnu.org/licenses/gpl-3.0.en.html)\n\n[![R-CMD-check](https://github.com/ropensci/jstor/actions/workflows/check-standard.yaml/badge.svg)](https://github.com/ropensci/jstor/actions/workflows/check-standard.yaml)\n[![AppVeyorBuild status](https://ci.appveyor.com/api/projects/status/sry2gtwam7qyfw6l?svg=true)](https://ci.appveyor.com/project/tklebel/jstor)\n[![Coverage status](https://codecov.io/gh/ropensci/jstor/branch/master/graph/badge.svg)](https://codecov.io/github/ropensci/jstor?branch=master)\n[![lifecycle](https://img.shields.io/badge/lifecycle-maturing-blue.svg)](https://www.tidyverse.org/lifecycle/#maturing)\n[![CRAN status](http://www.r-pkg.org/badges/version/jstor)](https://cran.r-project.org/package=jstor)\n[![CRAN\\_Download\\_Badge](http://cranlogs.r-pkg.org/badges/grand-total/jstor)](https://CRAN.R-project.org/package=jstor)\n[![rOpenSci badge](https://badges.ropensci.org/189_status.svg)](https://github.com/ropensci/onboarding/issues/189)\n[![JOSS badge](http://joss.theoj.org/papers/ba29665c4bff35c37c0ef68cfe356e44/status.svg)](http://joss.theoj.org/papers/ba29665c4bff35c37c0ef68cfe356e44)\n[![Zenodo DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.1169861.svg)](https://doi.org/10.5281/zenodo.1169861)\n  \nThe tool [Data for Research (DfR)](http://www.jstor.org/dfr/) by JSTOR is a\nvaluable source for citation analysis and text mining. `jstor`\nprovides functions and suggests workflows for importing\ndatasets from DfR. 
It was developed to deal with very large datasets which\nrequire an agreement, but can be used with smaller ones as well.\n\n**Note**: As of 2021, JSTOR has changed the way it provides data, moving to a new \nplatform called [Constellate](https://constellate.org/). The package `jstor` has\nnot been adapted to this change, and can therefore only be used for legacy \ndata that was obtained from the old DfR platform.\n\nThe most important set of functions is a group of `jst_get_*` functions:\n\n- `jst_get_article`\n- `jst_get_authors`\n- `jst_get_references`\n- `jst_get_footnotes`\n- `jst_get_book`\n- `jst_get_chapters`\n- `jst_get_full_text`\n- `jst_get_ngram`\n\n\nAll functions which are concerned with metadata (therefore excluding\n`jst_get_full_text` and `jst_get_ngram`) operate along the same lines:\n\n1. The file is read with `xml2::read_xml()`.\n2. Content of the file is extracted via XPath or CSS expressions.\n3. The resulting data is returned in a `tibble`.\n\n\n## Installation\n\nTo install the package use: \n\n```{r, eval=FALSE}\ninstall.packages(\"jstor\")\n```\n\n\nYou can install the development version from GitHub with:\n\n```{r gh-installation, eval = FALSE}\n# install.packages(\"remotes\")\nremotes::install_github(\"ropensci/jstor\")\n```\n\n## Usage\nIn order to use `jstor`, you first need to load it:\n```{r}\nlibrary(jstor)\nlibrary(magrittr)\n```\n\nThe basic usage is simple: supply one of the `jst_get_*`-functions with a path\nand it will return a tibble with the extracted information.\n```{r, results='asis'}\njst_get_article(jst_example(\"article_with_references.xml\")) %\u003e% knitr::kable()\n\njst_get_authors(jst_example(\"article_with_references.xml\")) %\u003e% knitr::kable()\n```\n\nFurther explanations, especially on how to use jstor's functions for importing\nmany files, can be found in the vignettes.\n\n## Getting started\nIn order to use `jstor`, you need some data from DfR. 
From the\n[main page](http://www.jstor.org/dfr/) you can create a dataset by searching for\nterms and restricting the search by time, subject and content type. After\nyou have created an account, you can download your selection. Alternatively,\nyou can download \n[sample datasets](http://www.jstor.org/dfr/about/sample-datasets) with documents\nfrom before 1923 for the US, and before 1870 for all other countries. \n\n## Supported Elements\nIn their [technical specifications](http://www.jstor.org/dfr/about/technical-specifications), \nDfR lists fields which should be reliably present in all articles and books.\n\nThe following tables give an overview of which elements are supported by `jstor`.\n\n### Articles\n|`xml`-field                       |reliably present |supported in `jstor`|\n|:---------------------------------|:----------------|:-------------------|\n|journal-id (type=\"jstor\")         |x                |x                   |\n|journal-id (type=\"publisher-id\")  |x                |x                   |\n|journal-id (type=\"doi\")           |                 |x                   |\n|issn                              |x                |                    |\n|journal-title                     |x                |x                   |\n|publisher-name                    |x                |                    |\n|article-id (type=\"doi\")           |x                |x                   |\n|article-id (type=\"jstor\")         |x                |x                   |\n|article-id (type=\"publisher-id\")  |                 |x                   |\n|article-type                      |                 |x                   |\n|volume                            |                 |x                   |\n|issue                             |                 |x                   |\n|article-categories                |x                |                    |\n|article-title                     |x                |x                   |\n|contrib-group                     |x 
               |x                   |\n|pub-date                          |x                |x                   |\n|fpage                             |x                |x                   |\n|lpage                             |                 |x                   |\n|page-range                        |                 |x                   |\n|product                           |x                |                    |\n|self-uri                          |x                |                    |\n|kwd-group                         |x                |                    |\n|custom-meta-group                 |x                |x                   |\n|fn-group (footnotes)              |                 |x                   |\n|ref-list (references)             |                 |x                   |\n\n\n\n### Books\n|`xml`-field                       |reliably present |supported in `jstor`|\n|:---------------------------------|:----------------|:-------------------|\n|book-id (type=\"jstor\")            |x                |x                   |\n|discipline                        |x                |x                   |\n|call-number                       |x                |                    |\n|lcsh                              |x                |                    |\n|book-title                        |x                |x                   |\n|book-subtitle                     |                 |x                   |\n|contrib-group                     |x                |x                   |\n|pub-date                          |x                |x                   |\n|isbn                              |x                |x                   |\n|publisher-name                    |x                |x                   |\n|publisher-loc                     |x                |x                   |\n|permissions                       |x                |                    |\n|self-uri                          |x                |                    |\n|counts          
                  |x                |x                   |\n|custom-meta-group                 |x                |x                   |\n\n\n### Book Chapters\n|`xml`-field                       |reliably present |supported in `jstor`|\n|:---------------------------------|:----------------|:-------------------|\n|book-id (type=\"jstor\")            |x                |x                   |\n|part_id                           |x                |x                   |\n|part_label                        |x                |x                   |\n|part-title                        |x                |x                   |\n|part-subtitle                     |                 |x                   |\n|contrib-group                     |x                |x                   |\n|fpage                             |x                |x                   |\n|abstract                          |x                |x                   |\n\n\n## Code of conduct\nPlease note that this project is released with a \n[Contributor Code of Conduct](CONDUCT.md).\nBy participating in this project you agree to abide by its terms.\n\n## Citation\nTo cite `jstor`, please refer to `citation(package = \"jstor\")`:\n\n```\nKlebel (2018). jstor: Import and Analyse Data from Scientific Texts. 
Journal of \nOpen Source Software, 3(28), 883, https://doi.org/10.21105/joss.00883\n```\n\n## Acknowledgements\nWork on `jstor` benefited from financial support for the project \"Academic\nSuper-Elites in Sociology and Economics\" by the Austrian Science Fund (FWF), \nproject number \"P 29211 Einzelprojekte\".\n\nSome internal functions regarding file paths and example files were adapted from\nthe package `readr`.\n\n[![ropensci_footer](https://ropensci.org/public_images/ropensci_footer.png)](https://ropensci.org)\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fropensci%2Fjstor","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fropensci%2Fjstor","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fropensci%2Fjstor/lists"}