Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/flyte/chromium-pdf-api
A server which uses headless Chromium to visit any URL and create a PDF from it. Uses a simple JSON API to set the URL and options for PDF creation.
- Host: GitHub
- URL: https://github.com/flyte/chromium-pdf-api
- Owner: flyte
- Created: 2019-05-20T21:57:20.000Z (over 5 years ago)
- Default Branch: develop
- Last Pushed: 2022-08-23T17:48:22.000Z (over 2 years ago)
- Last Synced: 2024-10-27T23:22:36.325Z (3 months ago)
- Topics: cdp, chrome, chromium, json-api, pdf, pdf-converter, pdf-generation, python, python3
- Language: Python
- Homepage: https://hub.docker.com/r/flyte/chromium-pdf-api
- Size: 253 KB
- Stars: 1
- Watchers: 3
- Forks: 0
- Open Issues: 5
Metadata Files:
- Readme: README.md
Chromium PDF API
================

A server which uses headless Chromium to visit any URL and create a PDF from it. Uses a simple JSON API to set the URL and options for PDF creation.

Usage
-----

Run the container:
```
docker run -ti --rm -p 8080:8080 flyte/chromium-pdf-api
```

Make a request:
```
curl -X POST \
--header "Content-Type: application/json" \
--data '{"url": "https://www.google.com"}' \
localhost:8080
```

Example request:

```json
{
    "url": "https://www.google.com"
}
```

Example response:

```json
{
    "url": "https://www.google.com",
    "pdf": ""
}
```

## API
Everything except for the `url` parameter is optional.
```json
{
    "url": "",
    "max_size": "",
    "timeout": "",
    "load_timeout": "",
    "status_timeout": "",
    "print_timeout": "",
    "options": {
        # This is passed directly through to Chromium as options to the Page.printToPDF
        # function. You may omit this entirely, or use any of the options from this URL:
        # https://chromedevtools.github.io/devtools-protocol/tot/Page/#method-printToPDF
        "landscape": true,
        "scale": 0.8,
        ... etc ...
    }
}
```

### Errors
The PDF (or other CDP response) exceeded the `max_size` set:

```json
{
    "url": "https://www.google.com",
    "max_size": "",
    "error": "PDF exceeded maximum size"
}
```

Timed out waiting for Chromium to reply with the 'printed' PDF:

```json
{
    "url": "https://www.google.com",
    "print_timeout": "",
    "error": "Timeout printing PDF"
}
```

## Healthcheck
You may send an HTTP GET to the `/healthcheck/` endpoint to have the server perform a cursory healthcheck, ensuring it can communicate with Chromium's DevTools API.
It returns a status code of `200` if communication is successful and `500` if not.
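
A probe along these lines could be called from a monitoring script. It is a minimal sketch, assuming the server is reachable at `http://localhost:8080` as in the `docker run` example above. `is_healthy` is a hypothetical helper name and is not part of this project.

```python
import urllib.error
import urllib.request


def is_healthy(base_url="http://localhost:8080", timeout=5.0):
    """Return True if GET /healthcheck/ answers 200, False otherwise."""
    try:
        with urllib.request.urlopen(f"{base_url}/healthcheck/", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # urlopen raises HTTPError (a URLError subclass) for the documented 500;
        # refused connections and timeouts are treated as unhealthy as well.
        return False
```

Treating connection failures the same as a `500` is a design choice for this sketch: either way the service cannot currently produce PDFs.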
## Memory and concurrency
Chrom(e|ium) has a tendency to guzzle as much memory as it can get its hands on. You may find that this docker image crashes with an error along the lines of:
```
FATAL:memory.cc(22)] Out of memory. size=262144
```

In this case, you will need to increase the shared memory size by passing an option such as `--shm-size=512M` to `docker run`. The default is only `64M`, so you may want to experiment with what size suits you, based on how many tabs you're likely to have open at once.
Another potential issue is how many tabs you really want to have open at once. This is by default limited to 10, but you can set this to whatever you like, using the `PDF_CONCURRENCY` environment variable:
```
docker run -ti --rm -p 8080:8080 -e PDF_CONCURRENCY=2 flyte/chromium-pdf-api
```

This can help you plan for the amount of memory your container is going to use, although it also depends on how much memory the site you're PDFing uses.
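
On the client side, one way to stay within the tab limit is to cap the number of requests in flight. The following is a rough sketch, not part of this project: `API_URL`, `MAX_IN_FLIGHT` and all function names are hypothetical. It also assumes that the `pdf` field of the response is base64-encoded and that error replies arrive as a JSON body with an `error` key. Neither assumption is confirmed here, and an error delivered with a non-2xx status would surface as `urllib.error.HTTPError` instead.

```python
import base64
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor

API_URL = "http://localhost:8080"  # assumption: port published as in the docker run example
MAX_IN_FLIGHT = 2                  # keep at or below the server's PDF_CONCURRENCY


def build_request(url, **fields):
    """Encode the JSON request body; only `url` is required by the API."""
    return json.dumps({"url": url, **fields}).encode("utf-8")


def extract_pdf(reply):
    """Return the PDF bytes from a decoded reply, or raise if it carries `error`."""
    if "error" in reply:
        raise RuntimeError(reply["error"])
    # Assumption: the `pdf` field holds the document base64-encoded.
    return base64.b64decode(reply["pdf"])


def render(url, **fields):
    """POST one URL to the API and return the resulting PDF bytes."""
    req = urllib.request.Request(
        API_URL,
        data=build_request(url, **fields),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return extract_pdf(json.load(resp))


def render_many(urls, **fields):
    """Render several URLs with at most MAX_IN_FLIGHT requests open at once."""
    with ThreadPoolExecutor(max_workers=MAX_IN_FLIGHT) as pool:
        return list(pool.map(lambda u: render(u, **fields), urls))
```

For example, `render_many(["https://www.google.com"], options={"landscape": True})` would return a list of PDF byte strings in the same order as the input URLs.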
## Cooperative Loading
Sometimes you'll want to make sure that any asynchronous content has completed loading before creating your PDF. You may do this by adding an HTML input element to the page with class `pdfloading`, then changing its value to `loaded` using JavaScript once your content is fully ready.
For example:
```html
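<title>My PDF</title>
<!-- Each input with class "pdfloading" is set to "loaded" once its content is ready -->
<input type="hidden" class="pdfloading" id="loading1">
<input type="hidden" class="pdfloading" id="loading2">
<script>
  setTimeout(function() {
    document.getElementById("loading1").value = "loaded"
  }, 5000)
  setTimeout(function() {
    document.getElementById("loading2").value = "loaded"
  }, 8000)
</script>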
```
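
The waiting rule described above can be modelled as a simple polling loop. This is only an illustration of the idea, not the server's actual code. `wait_until_loaded` and `read_values` are hypothetical names, and the sketch assumes that every `pdfloading` input must read `loaded` before printing proceeds.

```python
import time


def wait_until_loaded(read_values, timeout=10.0, interval=0.1):
    """Poll until every marker value equals "loaded"; return False on timeout.

    `read_values` is a hypothetical callable that returns the current values
    of all inputs with class `pdfloading` on the page.
    """
    deadline = time.monotonic() + timeout
    while True:
        if all(value == "loaded" for value in read_values()):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A page with no `pdfloading` inputs passes immediately in this model, since there is nothing to wait for.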