{"id":13393329,"url":"https://github.com/alvarcarto/url-to-pdf-api","last_synced_at":"2025-05-14T05:10:27.350Z","repository":{"id":38272612,"uuid":"105162117","full_name":"alvarcarto/url-to-pdf-api","owner":"alvarcarto","description":"Web page PDF/PNG rendering done right. Self-hosted service for rendering receipts, invoices, or any content.","archived":false,"fork":false,"pushed_at":"2024-01-18T12:49:11.000Z","size":5134,"stargazers_count":7054,"open_issues_count":58,"forks_count":784,"subscribers_count":123,"default_branch":"master","last_synced_at":"2025-04-11T00:01:56.233Z","etag":null,"topics":["chrome","headless","headless-chrome","heroku","heroku-button","html","invoice","pdf","puppeteer","receipt"],"latest_commit_sha":null,"homepage":"","language":"HTML","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/alvarcarto.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2017-09-28T14:56:57.000Z","updated_at":"2025-04-09T08:29:09.000Z","dependencies_parsed_at":"2024-01-06T01:15:23.507Z","dependency_job_id":"c815d44c-5f78-4e70-8a6b-46281c211c1d","html_url":"https://github.com/alvarcarto/url-to-pdf-api","commit_stats":{"total_commits":141,"total_committers":31,"mean_commits":4.548387096774194,"dds":0.3971631205673759,"last_synced_commit":"2fa83fd9f88613b76ba452fe25830e98d844f3b0"},"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/alvarcarto%2Furl-to-pdf-api","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositor
ies/alvarcarto%2Furl-to-pdf-api/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/alvarcarto%2Furl-to-pdf-api/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/alvarcarto%2Furl-to-pdf-api/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/alvarcarto","download_url":"https://codeload.github.com/alvarcarto/url-to-pdf-api/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254076849,"owners_count":22010611,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["chrome","headless","headless-chrome","heroku","heroku-button","html","invoice","pdf","puppeteer","receipt"],"created_at":"2024-07-30T17:00:50.422Z","updated_at":"2025-05-14T05:10:27.305Z","avatar_url":"https://github.com/alvarcarto.png","language":"HTML","funding_links":[],"categories":["Opensource projects","HTML","JAVASCRIPT","📦 Legacy \u0026 Inactive Projects","服务","APIs","chrome","Services"],"sub_categories":["贡献","Misc/Multi-language"],"readme":"[![Deploy](https://www.herokucdn.com/deploy/button.svg)](https://heroku.com/deploy?template=https://github.com/alvarcarto/url-to-pdf-api)\n\n[![Build Status](https://travis-ci.org/alvarcarto/url-to-pdf-api.svg?branch=master)](https://travis-ci.org/alvarcarto/url-to-pdf-api)\n\n# URL to PDF Microservice\n\n\u003e Web page PDF rendering done right. Microservice for rendering receipts, invoices, or any content. 
Packaged as an easy API.\n\n![Logo](docs/logo.png)\n\n**⚠️ WARNING ⚠️** *Don't serve this API publicly to the internet unless you are aware of the\nrisks. It allows API users to run any JavaScript code inside a Chrome session on the server.\nIt's fairly easy to expose the contents of files on the server. You have been warned! See https://github.com/alvarcarto/url-to-pdf-api/issues/12 for background.*\n\n**⭐️ Features:**\n\n* Converts any URL or HTML content to a PDF file or an image (PNG/JPEG)\n* Rendered with Headless Chrome, using [Puppeteer](https://github.com/GoogleChrome/puppeteer). The PDFs should match the ones generated with desktop Chrome.\n* Sensible defaults but everything is configurable.\n* Single-page app (SPA) support. Waits until all network requests are finished before rendering.\n* Easy deployment to Heroku. We love Lambda but...Deploy to Heroku button.\n* Renders lazy loaded elements. *(scrollPage option)*\n* Supports optional `x-api-key` authentication. *(`API_TOKENS` env var)*\n\nUsage is as simple as https://url-to-pdf-api.herokuapp.com/api/render?url=http://google.com. There's also a `POST /api/render` if you prefer to send options in the body.\n\n**🔍 Why?**\n\nThis microservice is useful when you need to automatically produce PDF files\nfor whatever reason. The files could be receipts, weekly reports, invoices,\nor any content.\n\nPDFs can be generated in many ways, but one of them is to convert HTML+CSS\ncontent to a PDF. This API does just that.\n\n**🚀 Shortcuts:**\n\n* [Examples](#examples)\n* [API](#api)\n* [I want to run this myself](#development)\n\n## How it works\n\n![](docs/heroku.png)\n\nLocal setup is identical except that the Express API runs on your machine\nand requests connect to it directly.\n\n### Good to know\n\n* **By default, the page's `@media print` CSS rules are ignored**. We set Chrome to emulate `@media screen` to make the default PDFs look more like actual sites. 
To get results closer to desktop Chrome, add `\u0026emulateScreenMedia=false` query parameter. See more at [Puppeteer API docs](https://github.com/GoogleChrome/puppeteer/blob/master/docs/api.md#pagepdfoptions).\n\n* Chrome is launched with `--no-sandbox --disable-setuid-sandbox` flags to fix usage in Heroku. See [this issue](https://github.com/GoogleChrome/puppeteer/issues/290).\n\n* Heavy pages may cause Chrome to crash if the server doesn't have enough RAM.\n\n* Docker image for this can be found here: https://github.com/restorecommerce/pdf-rendering-srv\n\n\n## Examples\n\n**⚠️ Restrictions ⚠️:**\n\n* For security reasons the urls have been restricted and HTML rendering is disabled. For full demo, run this app locally or deploy to Heroku.\n* The demo Heroku app runs on a free dyno which sleep after idle. A request to sleeping dyno may take even 30 seconds.\n\n\n\n**The most minimal example, render google.com**\n\nhttps://url-to-pdf-api.herokuapp.com/api/render?url=http://google.com\n\n**The most minimal example, render google.com as PNG image**\n\nhttps://url-to-pdf-api.herokuapp.com/api/render?output=screenshot\u0026url=http://google.com\n\n\n**Use the default @media print instead of @media screen.**\n\nhttps://url-to-pdf-api.herokuapp.com/api/render?url=http://google.com\u0026emulateScreenMedia=false\n\n**Use scrollPage=true which tries to reveal all lazy loaded elements. 
Not perfect but better than without.**\n\nhttps://url-to-pdf-api.herokuapp.com/api/render?url=http://www.andreaverlicchi.eu/lazyload/demos/lazily_load_lazyLoad.html\u0026scrollPage=true\n\n**Render only the first page.**\n\nhttps://url-to-pdf-api.herokuapp.com/api/render?url=https://en.wikipedia.org/wiki/Portable_Document_Format\u0026pdf.pageRanges=1\n\n**Render A5-sized PDF in landscape.**\n\nhttps://url-to-pdf-api.herokuapp.com/api/render?url=http://google.com\u0026pdf.format=A5\u0026pdf.landscape=true\n\n**Add 2cm margins to the PDF.**\n\nhttps://url-to-pdf-api.herokuapp.com/api/render?url=http://google.com\u0026pdf.margin.top=2cm\u0026pdf.margin.right=2cm\u0026pdf.margin.bottom=2cm\u0026pdf.margin.left=2cm\n\n**Wait for extra 1000ms before render.**\n\nhttps://url-to-pdf-api.herokuapp.com/api/render?url=http://google.com\u0026waitFor=1000\n\n\n**Download the PDF with a given attachment name**\n\nhttps://url-to-pdf-api.herokuapp.com/api/render?url=http://google.com\u0026attachmentName=google.pdf\n\n**Wait for an element matching the selector `input` appears.**\n\nhttps://url-to-pdf-api.herokuapp.com/api/render?url=http://google.com\u0026waitFor=input\n\n**Render HTML sent in JSON body**\n\n*NOTE: Demo app has disabled html rendering for security reasons.*\n\n```bash\ncurl -o html.pdf -XPOST -d'{\"html\": \"\u003cbody\u003etest\u003c/body\u003e\"}' -H\"content-type: application/json\" http://localhost:9000/api/render\n```\n\n**Render HTML sent as text body**\n\n*NOTE: Demo app has disabled html rendering for security reasons.*\n\n```bash\ncurl -o html.pdf -XPOST -d@test/resources/large.html -H\"content-type: text/html\" http://localhost:9000/api/render\n```\n\n## API\n\nTo understand the API options, it's useful to know how [Puppeteer](https://github.com/GoogleChrome/puppeteer/blob/master/docs/api.md)\nis internally used by this API. The [render code](https://github.com/alvarcarto/url-to-pdf-api/blob/master/src/core/render-core.js)\nis quite simple, check it out. 
Render flow:\n\n1. **`page.setViewport(options)`** where options matches `viewport.*`.\n2. *Possibly* **`page.emulateMedia('screen')`** if `emulateScreenMedia=true` is set.\n3. Render url **or** html.\n\n    If `url` is defined, **`page.goto(url, options)`** is called and options match `goto.*`.\n    Otherwise **`page.setContent(html, options)`** is called where html is taken from request body, and options match `goto.*`.\n\n4. *Possibly* **`page.waitFor(numOrStr)`** if e.g. `waitFor=1000` is set.\n5. *Possibly* **Scroll the whole page** to the end before rendering if e.g. `scrollPage=true` is set.\n\n    Useful if you want to render a page which lazy loads elements.\n\n6. Render the output\n\n  * If output is `pdf` rendering is done with **`page.pdf(options)`** where options matches `pdf.*`.\n  * Else if output is `screenshot` rendering is done with **`page.screenshot(options)`** where options matches `screenshot.*`.\n\n\n### GET /api/render\n\nAll options are passed as query parameters.\nParameter names match [Puppeteer options](https://github.com/GoogleChrome/puppeteer/blob/master/docs/api.md).\n\nThese options are exactly the same as its `POST` counterpart, but options are\nexpressed with the dot notation. E.g. `?pdf.scale=2` instead of `{ pdf: { scale: 2 }}`.\n\nThe only required parameter is `url`.\n\nParameter | Type | Default | Description\n----------|------|---------|------------\nurl | string | - | URL to render as PDF. (required)\noutput | string | pdf | Specify the output format. Possible values: `pdf` , `screenshot` or `html`.\nemulateScreenMedia | boolean | `true` | Emulates `@media screen` when rendering the PDF.\nenableGPU | boolean | `false` | When set, enables chrome GPU. For windows user, this will always return false. 
See https://developers.google.com/web/updates/2017/04/headless-chrome\nignoreHttpsErrors | boolean | `false` | Ignores possible HTTPS errors when navigating to a page.\nscrollPage | boolean | `false` | Scroll page down before rendering to trigger lazy loading elements.\nwaitFor | number or string | - | Number in ms to wait before render or selector element to wait before render.\nattachmentName | string | - | When set, the `content-disposition` headers are set and browser will download the PDF instead of showing inline. The given string will be used as the name for the file.\nviewport.width | number | `1600` | Viewport width.\nviewport.height | number | `1200` | Viewport height.\nviewport.deviceScaleFactor | number | `1` | Device scale factor (could be thought of as dpr).\nviewport.isMobile | boolean | `false` | Whether the meta viewport tag is taken into account.\nviewport.hasTouch | boolean | `false` | Specifies if viewport supports touch events.\nviewport.isLandscape | boolean | `false` | Specifies if viewport is in landscape mode.\ncookies[0][name] | string | - | Cookie name (required)\ncookies[0][value] | string | - | Cookie value (required)\ncookies[0][url] | string | - | Cookie url\ncookies[0][domain] | string | - | Cookie domain\ncookies[0][path] | string | - | Cookie path\ncookies[0][expires] | number | - | Cookie expiry in unix time\ncookies[0][httpOnly] | boolean | - | Cookie httpOnly\ncookies[0][secure] | boolean | - | Cookie secure\ncookies[0][sameSite] | string | - | `Strict` or `Lax`\ngoto.timeout | number | `30000` |  Maximum navigation time in milliseconds, defaults to 30 seconds, pass 0 to disable timeout.\ngoto.waitUntil | string | `networkidle0` | When to consider navigation succeeded. Options: `load`, `domcontentloaded`, `networkidle0`, `networkidle2`. `load` - consider navigation to be finished when the load event is fired. `domcontentloaded` - consider navigation to be finished when the `DOMContentLoaded` event is fired. 
`networkidle0` - consider navigation to be finished when there are no more than 0 network connections for at least `500` ms. `networkidle2` - consider navigation to be finished when there are no more than 2 network connections for at least `500` ms.\npdf.scale | number | `1` | Scale of the webpage rendering.\npdf.printBackground | boolean | `false`| Print background graphics.\npdf.displayHeaderFooter | boolean | `false` | Display header and footer.\npdf.headerTemplate | string | - | HTML template to use as the header of each page in the PDF. **Currently Puppeteer basically only supports a single line of text and you must use pdf.margins+CSS to make the header appear!** See https://github.com/alvarcarto/url-to-pdf-api/issues/77.\npdf.footerTemplate | string | - | HTML template to use as the footer of each page in the PDF. **Currently Puppeteer basically only supports a single line of text and you must use pdf.margins+CSS to make the footer appear!** See https://github.com/alvarcarto/url-to-pdf-api/issues/77.\npdf.landscape | boolean | `false` | Paper orientation.\npdf.pageRanges | string | - | Paper ranges to print, e.g., '1-5, 8, 11-13'. Defaults to the empty string, which means print all pages.\npdf.format | string | `A4` | Paper format. 
If set, takes priority over width or height options.\npdf.width | string | - | Paper width, accepts values labeled with units.\npdf.height | string | - | Paper height, accepts values labeled with units.\npdf.fullPage | boolean | - | Create PDF in a single page\npdf.margin.top | string | - | Top margin, accepts values labeled with units.\npdf.margin.right | string | - | Right margin, accepts values labeled with units.\npdf.margin.bottom | string | - | Bottom margin, accepts values labeled with units.\npdf.margin.left | string | - | Left margin, accepts values labeled with units.\nscreenshot.fullPage | boolean | `true` | When true, takes a screenshot of the full scrollable page.\nscreenshot.type | string | `png` | Screenshot image type. Possible values: `png`, `jpeg`\nscreenshot.quality | number | - | The quality of the JPEG image, between 0-100. Only applies when `screenshot.type` is `jpeg`.\nscreenshot.omitBackground | boolean | `false` | Hides default white background and allows capturing screenshots with transparency.\nscreenshot.clip.x | number | - | Specifies x-coordinate of top-left corner of clipping region of the page.\nscreenshot.clip.y | number | - | Specifies y-coordinate of top-left corner of clipping region of the page.\nscreenshot.clip.width | number | - | Specifies width of clipping region of the page.\nscreenshot.clip.height | number | - | Specifies height of clipping region of the page.\nscreenshot.selector | string | - | Specifies css selector to clip the screenshot to.\n\n\n**Example:**\n\n```bash\ncurl -o google.pdf https://url-to-pdf-api.herokuapp.com/api/render?url=http://google.com\n```\n\n\n### POST /api/render - (JSON)\n\nAll options are passed in a JSON body object.\nParameter names match [Puppeteer options](https://github.com/GoogleChrome/puppeteer/blob/master/docs/api.md).\n\nThese options are exactly the same as its `GET` counterpart.\n\n**Body**\n\nThe only required parameter is `url`.\n\n```js\n{\n  // Url to render. 
Either url or html is required\n  url: \"https://google.com\",\n\n  // Either \"pdf\" or \"screenshot\"\n  output: \"pdf\",\n\n  // HTML content to render. Either url or html is required\n  html: \"\u003chtml\u003e\u003chead\u003e\u003c/head\u003e\u003cbody\u003eYour content\u003c/body\u003e\u003c/html\u003e\",\n\n  // If we should emulate @media screen instead of print\n  emulateScreenMedia: true,\n\n  // If we should ignore HTTPS errors\n  ignoreHttpsErrors: false,\n\n  // If true, page is scrolled to the end before rendering\n  // Note: this makes rendering a bit slower\n  scrollPage: false,\n\n  // Passed to Puppeteer page.waitFor()\n  waitFor: null,\n\n  // Passed to Puppeteer page.setCookie()\n  cookies: [{ ... }],\n\n  // Passed to Puppeteer page.setViewport()\n  viewport: { ... },\n\n  // Passed to Puppeteer page.goto() as the second argument after url\n  goto: { ... },\n\n  // Passed to Puppeteer page.pdf()\n  pdf: { ... },\n\n  // Passed to Puppeteer page.screenshot()\n  screenshot: { ... },\n}\n```\n\n**Example:**\n\n```bash\ncurl -o google.pdf -XPOST -d'{\"url\": \"http://google.com\"}' -H\"content-type: application/json\" http://localhost:9000/api/render\n```\n\n```bash\ncurl -o html.pdf -XPOST -d'{\"html\": \"\u003cbody\u003etest\u003c/body\u003e\"}' -H\"content-type: application/json\" http://localhost:9000/api/render\n```\n\n### POST /api/render - (HTML)\n\nThe HTML to render is sent in the body. 
All options are passed in query parameters.\nSupports exactly the same query parameters as `GET /api/render`, except the `url`\nparameter.\n\n*Remember that relative links do not work.*\n\n**Example:**\n\n```bash\ncurl -o receipt.html https://rawgit.com/wildbit/postmark-templates/master/templates_inlined/receipt.html\ncurl -o html.pdf -XPOST -d@receipt.html -H\"content-type: text/html\" http://localhost:9000/api/render?pdf.scale=1\n```\n\n### GET /healthcheck\n\nHealth check endpoint used for monitoring whether the service is still up and running.\n\n```bash\ncurl -XGET http://localhost:9000/healthcheck\n```\n\n## Development\n\nTo get this thing running, you have two options: run it in Heroku, or locally.\n\nThe code requires Node 8+ (async, await).\n\n#### 1. Heroku deployment\n\nScroll this readme up to the Deploy to Heroku button. Click it and follow\nthe instructions.\n\n**WARNING:** *Heroku dynos have a very low amount of RAM. Rendering heavy pages\nmay cause the Chrome instance to crash inside the Heroku dyno. 512MB should be\nenough for most real-life use cases such as receipts. Some news sites may need\nas much as 2GB of RAM.*\n\n\n#### 2. 
Local development\n\nFirst, clone the repository and cd into it.\n\n* `cp .env.sample .env`\n* Fill in the blanks in `.env`\n\n* `npm install`\n* `npm start` starts the Express server locally\n* The server runs at http://localhost:9000 or on the port defined by the `$PORT` env var\n\n\n### Techstack\n\n* Node 8+ (async, await), written in ES7\n* [Express.js](https://expressjs.com/) app with a nice internal architecture, based on [these conventions](https://github.com/kimmobrunfeldt/express-example).\n* Hapi-style Joi validation with [express-validation](https://github.com/andrewkeig/express-validation)\n* Heroku + [Puppeteer buildpack](https://github.com/jontewks/puppeteer-heroku-buildpack)\n* [Puppeteer](https://github.com/GoogleChrome/puppeteer) to control Chrome\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Falvarcarto%2Furl-to-pdf-api","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Falvarcarto%2Furl-to-pdf-api","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Falvarcarto%2Furl-to-pdf-api/lists"}