An open API service indexing awesome lists of open source software.

https://github.com/hrbrmstr/cdx

🕸 Query Web Archive Crawl Indexes ('CDX')
https://github.com/hrbrmstr/cdx

cdx r r-cyber rstats web-archives

Last synced: about 1 month ago
JSON representation

🕸 Query Web Archive Crawl Indexes ('CDX')

Awesome Lists containing this project

README

        

---
output: rmarkdown::github_document
---

# cdx

Query Web Archive Crawl Indexes ('CDX')

## Description

Methods are provided to retrieve web archive crawl index ('CDX') metadata and directly query the 'CDX' 'API' endpoint to retrieve mementos for a given set of parameters.

## What's Inside The Tin

The following functions are implemented:

- `cdx_query`: Query a CDX index endpoint
- `fetch_collections_index`: Fetch collections index

## Installation

```{r eval=FALSE}
devtools::install_github("hrbrmstr/cdx")
```

```{r message=FALSE, warning=FALSE, error=FALSE, include=FALSE}
options(width=120)
```

## Usage

```{r message=FALSE, warning=FALSE, error=FALSE}
library(cdx)
library(tidyverse)

# current verison
packageVersion("cdx")

```

### Example

```{r message=FALSE, warning=FALSE, error=FALSE}
cidx <- fetch_collections_index()

rprj <- cdx_query(cidx$cdx_api[1], "*.r-project.org")

rprj
```