Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/r-hub/pkgsearch

Search R packages on CRAN
https://github.com/r-hub/pkgsearch

cran package r r-package ranking rstats search-engine

Last synced: about 2 months ago
JSON representation

Search R packages on CRAN

Awesome Lists containing this project

README

        

---
output:
md_document:
variant: markdown_github
toc: true
toc_depth: 3
includes:
before_body: header.md
---

```{r, setup, echo = FALSE, message = FALSE}
knitr::opts_chunk$set(
comment = "#>",
tidy = FALSE,
error = FALSE,
fig.width = 8,
fig.height = 8)
options(width = 90)
options(max.print = 200)
```

## Installation

Install the latest pkgsearch release from CRAN:

```r
install.packages("pkgsearch")
```

The development version is on GitHub:

```r
pak::pak("r-hub/pkgsearch")
```

## Usage

### Search relevant packages

Do you need to find packages solving a particular problem, e.g.
"permutation test"?

```{r search}
library("pkgsearch")
library("pillar") # nicer data frame printing
pkg_search("permutation test")
```

pkgsearch uses an [R-hub](https://docs.r-hub.io) web service and a careful
ranking that puts popular packages before less frequently used ones.

### Do it all *clicking*

For the search mentioned above, and other points of entry to CRAN metadata,
you can use pkgsearch RStudio add-in!

[![Addin screencast](https://raw.githubusercontent.com/r-hub/pkgsearch/main/gifs/addin.gif)](https://vimeo.com/375618736)

Select the "CRAN package search" addin from the menu, or start it with
`pkg_search_addin()`.

### Get package metadata

Do you want to find the dependencies the first versions of `testthat` had
and when each of these versions was released?

```{r history}
cran_package_history("testthat")
```

### Discover packages

Do you want to know what packages are trending on CRAN these days?
`pkgsearch` can help!

```{r trend}
cran_trending()
cran_top_downloaded()
```

### Keep up with CRAN

Are you curious about the latest releases or archivals?

```{r}
cran_events()
```

## Search features

### More details

By default it returns a short summary of the ten best search hits. Their
details can be printed by using the `format = "long"` option of `pkg_search()`,
or just calling `pkg_search()` again, without any arguments, after a search:

```{r}
library(pkgsearch)
pkg_search("C++")
```

```{r}
pkg_search()
```

### Pagination

The `more()` function can be used to display the next batch of search hits,
batches contain ten packages by default. `ps()` is a shorter alias to
`pkg_search()`:

```{r}
ps("google")
```

```{r}
more()
```

### Stemming

The search server uses the stems of the words in the indexed metadata, and
the search phrase. This means that "colour" and "colours" deliver the
exact same result. So do "coloring", "colored", etc. (Unless one is happen
to be an exact package name or match another non-stemmed field.)

```{r}
ps("colour", size = 3)
ps("colours", size = 3)
```

### Ranking

The most important feature of a search engine is the ranking of the results.
The best results should be listed first. pkgsearch uses weighted scoring,
where a match in the package title gets a higher score than a match in the
package description. It also uses the number of reverse dependencies and
the number of downloads to weight the scores:

```{r}
ps("colour")[, c("score", "package", "revdeps", "downloads_last_month")]
```

### Preferring Phrases

The search engine prefers matching whole phrases over single words.
E.g. the search phrase "permutation test" will rank coin higher than
testthat, even though testthat is a much better result for the single word
"test". (In fact, at the time of writing testthat is not even on the first
page of results.)

```{r}
ps("permutation test")
```

If the whole phrase does not match, pkgsearch falls back to individual
matching words. For example, a match from either words is enough here,
to get on the first page of results:

```{r}
ps("test http")
```

### British vs American English

The search engine uses a dictionary to make sure that package metadata
and queries given in British and American English yield the same results.
E.g. note the spelling of colour/color in the results:

```{r}
ps("colour")
ps("color")
```

### Ascii Folding

Especially when searching for package maintainer names, it is convenient
to use the corresponding ASCII letters for non-ASCII characters in search
phrases. E.g. the following two queries yield the same results. Note that
case is also ignored.

```{r}
ps("gabor", size = 5)
ps("Gábor", size = 5)
```

## Configuration

### Options

* `timeout`: pkgsearch follows the `timeout` options for HTTP requests
(i.e. for `pkg_search()` and `advanced_search()`. `timeout` is the limit
for the total time of the HTTP request, and it is in seconds. See
`?options` for details.

## More info

See the [complete documentation](https://r-hub.github.io/pkgsearch/).

## License

MIT @ [Gábor Csárdi](https://github.com/gaborcsardi),
[RStudio](https://github.com/rstudio),
[R Consortium](https://www.r-consortium.org/).