Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/KTH-Library/openalex
R package to provide data access to OpenAlex by way of REST API
https://github.com/KTH-Library/openalex
Last synced: 9 days ago
JSON representation
R package to provide data access to OpenAlex by way of REST API
- Host: GitHub
- URL: https://github.com/KTH-Library/openalex
- Owner: KTH-Library
- License: other
- Created: 2022-01-25T09:46:11.000Z (almost 3 years ago)
- Default Branch: main
- Last Pushed: 2024-11-27T12:28:34.000Z (16 days ago)
- Last Synced: 2024-11-27T13:27:51.781Z (16 days ago)
- Language: R
- Homepage: https://kth-library.github.io/openalex/
- Size: 3.94 MB
- Stars: 11
- Watchers: 4
- Forks: 1
- Open Issues: 1
-
Metadata Files:
- Readme: README.Rmd
- License: LICENSE
Awesome Lists containing this project
- jimsghstars - KTH-Library/openalex - R package to provide data access to OpenAlex by way of REST API (R)
README
---
output: github_document
---```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>"
)
```# openalex
[![Lifecycle: experimental](https://img.shields.io/badge/lifecycle-experimental-orange.svg)](https://lifecycle.r-lib.org/articles/stages.html#experimental)
[![R-CMD-check](https://github.com/KTH-Library/openalex/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/KTH-Library/openalex/actions/workflows/R-CMD-check.yaml)The goal of `openalex` is to provide access to data from [OpenAlex](https://openalex.org) - an open and comprehensive catalog of scholarly papers, authors, institutions and more ... - to R through the [Open Alex REST API](https://docs.openalex.org/api)...
## Installation
You can install the current version of `openalex` from [GitHub](https://github.com/kth-library/openalex) with:
``` r
#install.packages("devtools")
devtools::install_github("kth-library/openalex", dependencies = TRUE)
```## Example
This is a basic example which shows you how to get information for papers and authors:
```{r example, eval=TRUE}
library(openalex)
library(dplyr)
suppressPackageStartupMessages(library(purrr))
library(knitr)iid <-
openalex:::openalex_autocomplete(
query = "Royal Institute of Technology",
entity_type = "institution",
format = "table") |>
head(1) |>
pull("id")data <-
openalex_crawl(entity = "works", verbose = TRUE, fmt = "tables",
query = openalex:::openalex_query(filter =
sprintf("institutions.id:%s,publication_year:2024", iid)))res <- data |> map(head) # return only first six rows from each table
res
```## Rate limits and using an API key
By providing an email address you enter the "polite pool" which provides even less of rate limiting for API requests.
You can provide it in `~/.Renviron` by setting `OPENALEX_USERAGENT=http://github.com/hadley/httr (mailto:your_email@your_institution.org)`.
You can also set it just for the session by using a helper fcn `openalex_polite()` to temporarily set or unset the email used in the user agent string when making API requests:
```{r polite}
library(openalex)# set an email to use for the session
openalex_polite("[email protected]")
# unset, and use default user agent string...
openalex_polite("")
```
A premium subscription API key can be used by setting `OPENALEX_KEY=secret_premium_api_key` in your `.Renviron`, or temporarily in a session using:
```{r premium, eval = FALSE}
library(openalex)# temporarily use a premium subscription API key
openalex_key("secret_premium_api_key")# unset to not use the premium subscription API key
openalex_key("")```
This will make it possible to make API calls that return the latest available records, for example based on recent creation dates or recent last modification timestamps.
```{r updates, eval = TRUE}
# we do not require an API key for the publish date
published_since_ <- openalex_works_published_since(since_days = 7)# but an API key is needed when using "from_created_date" and "from_updated_date" fields.
created_since_7d <- openalex_works_created_since(since_days = 7)
updated_since_1h <- openalex_works_updated_since(since_minutes = 60)# first few rows of each of these retrievals
created_since_7d |> _$work_ids |> head() |> knitr::kable()
updated_since_1h |> _$work_ids |> head() |> knitr::kable()```
## Data source attribution
When data from `openalex` is displayed publicly, this attribution also needs to be displayed:
```{r attribution}
library(openalex)
openalex_attribution()
```