Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/sckott/pdfimager

Extract images from pdfs using the pdfimages tool from poppler https://poppler.freedesktop.org/
https://github.com/sckott/pdfimager

pdf pdfimages poppler r rstats

Last synced: 3 months ago
JSON representation

Extract images from pdfs using the pdfimages tool from poppler https://poppler.freedesktop.org/

Awesome Lists containing this project

README

        

pdfimager
=========

```{r echo=FALSE}
knitr::opts_chunk$set(
warning = FALSE,
message = FALSE,
collapse = TRUE,
comment = "#>"
)
```

[![R-check](https://github.com/sckott/pdfimager/workflows/R-check/badge.svg)](https://github.com/sckott/pdfimager/actions/)

`pdfimager` - Extract images from pdfs

Docs:

This packages uses `sys` R package to "shell out" to pdfimages. Apparently pdfimages is not in poppler cpp, so is not in pdftools R pkg

## Install pdfimages

pdfimages is installed when you install poppler

Installation instructions can be found at

## Install pdfimager

```{r eval=FALSE}
# install.packages("pak")
pak::pak("sckott/pdfimager")
```

```{r}
library("pdfimager")
```

## Set the path

Some users may need to manually set the path to `pdfimages`.

You can do so with a function in this package like

```r
pdimg_set_path()
```

or set the path for pdfimages before starting R with an env var like:

```
PDFIMAGER_PATH=C:/some/path/to/poppler/24/bin/pdfimages.exe R
```

Or set within R like:

```r
Sys.setenv(PDFIMAGER_PATH="C:/some/path/to/poppler/24/bin/pdfimages.exe")
```

## help info

```{r}
pdimg_help()
```

## pdf image metadata

```{r}
x <- system.file("examples/BachmanEtal2020.pdf", package="pdfimager")
pdimg_meta(x)
```

## pdf images

```{r}
x <- system.file("examples/BachmanEtal2020.pdf", package="pdfimager")
pdimg_images(x)
```

## filter images

does a variety of thing to filter images by their metadata, some are configureable

```{r}
x1 <- system.file("examples/Tierney2017JOSS.pdf", package="pdfimager")
x2 <- system.file("examples/vanGemert2018.pdf", package="pdfimager")
res <- pdimg_images(c(x1, x2))
vapply(res, NROW, 1)
out <- pdimg_filter(res)
vapply(out, NROW, 1)
```

## Meta

* Please [report any issues or bugs](https://github.com/sckott/pdfimager/issues)
* License: MIT
* Please note that the pdfimager project is released with a [Contributor Code of Conduct](https://contributor-covenant.org/version/2/0/CODE_OF_CONDUCT.html). By contributing to this project, you agree to abide by its terms.