https://github.com/gustavobio/rb
Search specimens and download images from the JBRJ (RB) herbarium.
herbaria herbarium jbrj rb
Last synced: 8 months ago
- Host: GitHub
- URL: https://github.com/gustavobio/rb
- Owner: gustavobio
- Created: 2017-06-30T18:20:19.000Z (almost 9 years ago)
- Default Branch: master
- Last Pushed: 2017-07-06T17:59:01.000Z (almost 9 years ago)
- Last Synced: 2024-03-15T13:21:02.833Z (over 2 years ago)
- Topics: herbaria, herbarium, jbrj, rb
- Language: R
- Size: 38.1 KB
- Stars: 2
- Watchers: 2
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
## RB: R package to search for records and download images from the RB (Jardim Botânico do Rio de Janeiro) herbarium
### Installation:
RB is not on CRAN, so please install it from GitHub:
```r
# Install devtools
install.packages("devtools")
# Install finch (YOU MUST INSTALL THE VERSION FROM GITHUB IF USING R ON WINDOWS!)
devtools::install_github("gustavobio/finch")
# Install RB:
devtools::install_github("gustavobio/RB")
```
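Before running the installation commands, it can help to see which of the required packages are already present. The sketch below relies on base R's `requireNamespace`; the helper name `missing_packages` is hypothetical and not part of RB.

```r
# Hypothetical helper (not part of RB): return the packages from `pkgs`
# that cannot be loaded from the current R library.
missing_packages <- function(pkgs) {
  installed <- vapply(
    pkgs,
    function(p) requireNamespace(p, quietly = TRUE),
    logical(1)
  )
  pkgs[!installed]
}

# Packages named in the installation steps above
needed <- c("devtools", "finch", "RB")
to_install <- missing_packages(needed)
print(to_install)  # character(0) means everything is already installed
```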
### Usage:
RB tools have three main usage scenarios:
1. Searching for records in the RB database.
2. Navigating images.
3. Downloading images.
#### 1. Searching for records
The main function here is `search_rb`. When you call it for the first time,
the package will try to download the DwC-A (Darwin Core Archive) dataset provided by the Jardim Botânico
do Rio de Janeiro. This dataset is a little over 100 MB, so the download may take a while. Memory usage
is also high and will likely affect performance on computers with lower specs:
```r
> library(RB)
> download_rb_data(encoding = "UTF-8")
trying URL 'http://ipt.jbrj.gov.br/jbrj/archive.do?r=jbrj_rb&v=84.109'
downloaded 103.9 MB
Read 691354 rows and 45 (of 45) columns from 0.352 GB file in 00:00:13
Read 625625 rows and 7 (of 7) columns from 0.153 GB file in 00:00:07
> miconias <- search_rb("Miconia albicans")
```
This call to `search_rb` returns a data frame with 477 rows and 45 columns, including scientific name, family,
collector, collector notes, dates, determiner, county, state, and so on:
```r
> str(miconias)
'data.frame': 477 obs. of 45 variables:
$ id : chr "urn:catalog:JBRJ:RB:741399" "urn:catalog:JBRJ:RB:1045255" "urn:catalog:JBRJ:RB:1045306" "urn:catalog:JBRJ:RB:1045294" ...
$ type : chr "Collection" "Collection" "Collection" "Collection" ...
$ modified : chr "2014-07-08 00:45:06.721283" "2015-11-30 17:52:29.673752" "2015-11-30 17:52:57.868959" "2015-11-30 19:04:47.934176" ...
$ rightsHolder : chr "" "RB" "RB" "RB" ...
$ institutionCode : chr "RB" "RB" "RB" "RB" ...
$ collectionCode : chr "" "RB" "RB" "RB" ...
$ basisOfRecord : chr "PreservedSpecimen" "PreservedSpecimen" "PreservedSpecimen" "PreservedSpecimen" ...
$ occurrenceID : chr "urn:catalog:JBRJ:RB:741399" "urn:catalog:JBRJ:RB:1045255" "urn:catalog:JBRJ:RB:1045306" "urn:catalog:JBRJ:RB:1045294" ...
$ catalogNumber : chr "RB00741399" "RB01045255" "RB01045306" "RB01045294" ...
$ recordNumber : chr "431" "488" "551" "538" ...
$ recordedBy : chr "N.L. Britton; H.H. Rusby" "P. Rosa; T.S.Pereira; A. Pintor & JF.A. Baumgratz" "P. Rosa; T.S.Pereira; A. Pintor & JF.A. Baumgratz" "P. Rosa; T.S.Pereira; A. Pintor & JF.A. Baumgratz" ...
...
```
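Because the result is an ordinary data frame, base R tools apply directly. The sketch below uses a small stand-in data frame built from the first four rows shown above (only three of the 45 columns), so it runs without downloading the dataset; with real data you would use `miconias` instead.

```r
# Stand-in for a `search_rb` result: values copied from the `str()` output
# above, limited to three columns so the example runs offline.
records <- data.frame(
  catalogNumber = c("RB00741399", "RB01045255", "RB01045306", "RB01045294"),
  recordNumber = c("431", "488", "551", "538"),
  recordedBy = c(
    "N.L. Britton; H.H. Rusby",
    "P. Rosa; T.S.Pereira; A. Pintor & JF.A. Baumgratz",
    "P. Rosa; T.S.Pereira; A. Pintor & JF.A. Baumgratz",
    "P. Rosa; T.S.Pereira; A. Pintor & JF.A. Baumgratz"
  ),
  stringsAsFactors = FALSE
)

# Count records per collecting team, most frequent first
team_counts <- sort(table(records$recordedBy), decreasing = TRUE)
print(team_counts)

# Keep only the records whose collector string mentions "Rosa"
rosa_records <- records[grepl("Rosa", records$recordedBy, fixed = TRUE), ]
print(rosa_records$catalogNumber)
```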
The first argument in `search_rb` is a scientific name. Please see the help file (`?search_rb`) for a list of all possible arguments. You can combine them to refine your search:
```r
# By scientific name and year
> miconias_2015 <- search_rb("Miconia albicans", year = 2015)
> dim(miconias_2015)
[1] 13 45
```
```r
# By genus and county
> miconias_itirapina <- search_rb(genus = "Miconia", county = "Itirapina")
> dim(miconias_itirapina)
[1] 24 45
```
```r
# By scientific name and collector
> myrcias_van <- search_rb("Myrcia guianensis", collector = "Staggemeier")
> dim(myrcias_van)
[1] 3 45
```
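To run the same search for several species, one option is to loop over the names and stack the results. The helper `search_many` below is hypothetical. It takes the search function as an argument so the pattern can be tried with a stub before the RB dataset is loaded, and it assumes every call returns a data frame with the same columns.

```r
# Hypothetical helper (not part of RB): call `search_fun` once per name and
# row-bind the resulting data frames. Extra arguments are passed through.
search_many <- function(scientific_names, search_fun, ...) {
  results <- lapply(scientific_names, function(name) search_fun(name, ...))
  do.call(rbind, results)
}

# Stub standing in for `search_rb`, used only to demonstrate the pattern
stub_search <- function(name, ...) {
  data.frame(scientificName = name, stringsAsFactors = FALSE)
}

combined <- search_many(c("Miconia albicans", "Myrcia guianensis"), stub_search)
print(combined)
```

With the package loaded, the equivalent call would be `search_many(c("Miconia albicans", "Myrcia guianensis"), search_rb, year = 2015)`.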
#### 2. Navigating images
You can also open images in the default browser using `open_rb_images` and passing the results from `search_rb`:
```r
> open_rb_images(myrcias_van)
```

This opens the images in new browser tabs. The maximum number of images opened is set by the argument `max`, which defaults to 5.
```r
> open_rb_images(myrcias_van, max = 15)
```
You can also tweak image resolution using the argument `width` (the default width in pixels is 600):
```r
> open_rb_images(myrcias_van, width = 3000)
```
If you are just browsing images (for instance, to check plant characteristics) you can do the search directly in `open_rb_images`:
```r
> open_rb_images(scientific_name = "Miconia albicans")
```
This will open the first 5 images in the database. If you want 5 random images instead, use the argument `random`:
```r
> open_rb_images(scientific_name = "Miconia albicans", random = TRUE, width = 3000) # files here will be large
```
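The difference between the first 5 and a random 5 can be pictured with plain data frame indexing. This is only an illustration of the idea using base R's `head` and `sample`. It is not taken from the package source, and the `records` data frame is a made-up stand-in.

```r
# Made-up stand-in for a search result with 20 records
records <- data.frame(catalogNumber = sprintf("RB%08d", 1:20),
                      stringsAsFactors = FALSE)

n <- 5

# "First n": the top rows in whatever order the search returned them
first_n <- head(records, n)

# "Random n": sample row indices without replacement
set.seed(42)  # only to make the illustration reproducible
random_n <- records[sample(nrow(records), n), , drop = FALSE]

print(first_n$catalogNumber)
print(random_n$catalogNumber)
```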
#### 3. Downloading images
Images can also be downloaded and stored locally, in folders under your current working directory. Use `getwd()` to see where that is and `setwd()` to change it.
```r
> getwd()
[1] "/Users/gustavo"
```
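If you prefer to keep downloads out of your home directory, you can create a dedicated folder and make it the working directory first. A minimal sketch with base R: the folder name `rb_downloads` is arbitrary, and `tempdir()` is used here only so the example leaves no files behind.

```r
# Arbitrary target folder; replace tempdir() with a path of your choice
target <- file.path(tempdir(), "rb_downloads")
dir.create(target, showWarnings = FALSE, recursive = TRUE)

# setwd() returns the previous working directory, so it can be restored later
previous <- setwd(target)
print(getwd())

# ... download images here ...

# Go back to where you started
setwd(previous)
```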
The function `download_rb_images` is the workhorse here. For instance, you can download images from a previous search:
```r
> download_rb_images(myrcias_van)
3 images found. Continue? (y/n): y
|===================================================================| 100%
>
```
This call downloads the images at the default resolution (in this case, width = 3000) to a local directory.

Please be aware that if your search yields many results, downloading all the images can put a heavy load on the RB servers. Use discretion here. You could, for instance,
download all images for a given family:
```r
> download_rb_images(family = "Melastomataceae")
26890 images found. Continue? (y/n): y
Downloading 26890 images to /Users/gustavo/RB_images__30_Jun_17_22_37/:
| | 0%
```
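One way to exercise that discretion is to check how many records a search returned and work through them in small batches, pausing between batches. The sketch below shows only the batching logic in base R. The batch size of 50 is arbitrary, `split_into_batches` is a hypothetical helper, and the stand-in data frame replaces a real `search_rb` result.

```r
# Hypothetical helper (not part of RB): split a data frame into a list of
# smaller data frames with at most `batch_size` rows each.
split_into_batches <- function(records, batch_size = 50) {
  if (nrow(records) == 0) return(list())
  batch_id <- ceiling(seq_len(nrow(records)) / batch_size)
  split(records, batch_id)
}

# Stand-in for a large search result
records <- data.frame(catalogNumber = sprintf("RB%08d", 1:120),
                      stringsAsFactors = FALSE)

batches <- split_into_batches(records, batch_size = 50)
print(vapply(batches, nrow, integer(1)))  # rows per batch: 50, 50, 20
```

Each batch could then be passed to `download_rb_images` in turn, for example `download_rb_images(batches[[1]])`, with a `Sys.sleep()` call between batches. This assumes the function accepts any subset of a `search_rb` result, which the examples above suggest but do not confirm.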