https://github.com/b4dnewz/webpage-capture
:camera: Capture the web using headless browser with many options
- Host: GitHub
- URL: https://github.com/b4dnewz/webpage-capture
- Owner: b4dnewz
- License: mit
- Created: 2017-09-22T23:31:44.000Z (over 7 years ago)
- Default Branch: master
- Last Pushed: 2019-05-13T06:45:44.000Z (over 5 years ago)
- Last Synced: 2024-10-28T13:11:23.181Z (3 months ago)
- Topics: automation, puppeteer, screenshot, screenshot-utility, webpage-capture
- Language: JavaScript
- Homepage:
- Size: 889 KB
- Stars: 7
- Watchers: 2
- Forks: 3
- Open Issues: 3
Metadata Files:
- Readme: README.md
- License: LICENSE
# webpage-capture
> Capture the web in many ways using headless chrome
[![NPM version][npm-image]][npm-url] [![Build Status][travis-image]][travis-url] [![Dependency Status][daviddm-image]][daviddm-url] [![Coverage percentage][coveralls-image]][coveralls-url]
This module is a wrapper around [puppeteer](https://github.com/GoogleChrome/puppeteer) designed to make it easy to capture single or multiple pages or sections, in multiple formats, as quickly as possible.
## Installation as module
If you want to use it inside your scripts, save and install it in your project dependencies.
```
npm install --save webpage-capture
```

Once it is installed, you can import **webpage-capture** in your scripts and start using it; refer to the [usage](#usage) section.
## Installation as CLI
You can also use it from the command line using the [cli module](https://github.com/b4dnewz/webpage-capture-cli), installing it globally:
```
npm install -g webpage-capture-cli
```

Then you can start playing around with the __webcapture__ command and explore its options through the built-in help by passing `--help`.
---
## Usage
First, import __webpage-capture__ and initialize a new capturer with default or custom options.
```js
import WebCapture from 'webpage-capture'

const capturer = new WebCapture()

;(async () => {
  // Single input
  await capturer.capture('https://google.it')

  // Multiple inputs
  const res = await capturer.capture([
    'https://github.com/b4dnewz',
    'https://github.com/b4dnewz/webpage-capture'
  ])
  console.log(res)
})()
  .catch(console.log)
  // Pass a function so close() runs after the captures, not immediately
  .then(() => capturer.close())
```

Don't forget to __close__ the capturer once you are done, otherwise the headless browser instance will not disconnect correctly.
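
Because the browser should be closed even when a capture fails, one option is to wrap the work in `try`/`finally`. The following is a minimal sketch, not part of the library: `withCapturer` is a hypothetical helper, and it relies only on the `capture` and `close` methods shown above.

```js
// Hypothetical helper (not part of webpage-capture): runs `work` with a
// capturer and always calls close(), even if `work` throws.
async function withCapturer(createCapturer, work) {
  const capturer = createCapturer()
  try {
    return await work(capturer)
  } finally {
    // Assumption: close() may return a promise; awaiting a plain value is harmless.
    await capturer.close()
  }
}

// Usage with the real module (requires `webpage-capture` to be installed):
//   import WebCapture from 'webpage-capture'
//   const res = await withCapturer(
//     () => new WebCapture(),
//     capturer => capturer.capture('https://github.com/b4dnewz/webpage-capture')
//   )
```

Taking a factory function instead of an instance keeps the helper responsible for the whole lifetime of the capturer, so callers cannot forget the `close` step.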
### Instance Options
The constructor can also take options to set default values for all the subsequent captures:
```js
const capturer = new WebCapture({
  // default options
})
```

#### outputDir
Type: `String`
Default value: `process.cwd()`

Specify a custom directory in which to place the captured files.
#### timeout
Type: `Number`
Default value: `30000`

Specify the default timeout for loading a page (in ms).
#### headers
Type: `Object`
Set the headers to be used during all the requests.
#### viewport
Type: `String, Object`
Set the default viewport that is used when not specified.
#### debug
Type: `Boolean`
Default value: `false`

Run the script in __headful__ mode with a delay of 1 second so you can see what it is doing.
#### launchArgs
Type: `Array`
Default value: `[]`

Custom launch arguments to be passed to the Chromium instance. If you are a Linux user and you are experiencing issues, you may want to [disable the sandbox](https://github.com/GoogleChrome/puppeteer/blob/master/docs/troubleshooting.md#setting-up-chrome-linux-sandbox) or set it up differently.

---
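
The instance options described above can be combined in a single object. The sketch below is illustrative only: `instanceOptions` is a hypothetical helper, the default values are the ones documented in this section, and the output path, header, and timeout values are made up for the example. `--no-sandbox` is a Chromium flag that reduces browser isolation, so treat it as a troubleshooting measure.

```js
// Defaults as documented in this section; `headers` and `viewport` have no
// documented default, so they are left out unless provided.
const documentedDefaults = {
  outputDir: process.cwd(),
  timeout: 30000,
  debug: false,
  launchArgs: []
}

// Hypothetical helper: merges overrides onto the documented defaults.
function instanceOptions(overrides = {}) {
  return { ...documentedDefaults, ...overrides }
}

const options = instanceOptions({
  outputDir: './captures',               // illustrative path
  timeout: 60000,                        // allow slow pages up to 60 s
  headers: { 'accept-language': 'en' },  // illustrative header
  launchArgs: ['--no-sandbox']           // Chromium flag, see the sandbox note
})

// With the real module: const capturer = new WebCapture(options)
console.log(options)
```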
## Methods
### file(input, outputFilePath, [options])
Capture a screenshot of the given `input` and save it to the given `outputFilePath`.
Returns a `Promise` that resolves when the screenshot is written.
### buffer(input, [options])
Capture a screenshot of the given `input`.
Returns a `Promise` with the screenshot as binary.
### base64(input, [options])
Capture a screenshot of the given `input`.
Returns a `Promise` with the screenshot as [Base64](https://developer.mozilla.org/en-US/docs/Web/API/WindowBase64/Base64_encoding_and_decoding).
### capture(inputs, [options])
Capture one or more `inputs` using the given options.
For a complete list of options see below.

Returns a `Promise` that resolves when the screenshot is written.
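
As an illustration of how the single-input methods might be used, the sketch below turns the result of `base64` into a data URI. `captureAsDataUri` is a hypothetical helper, and it assumes that `base64()` resolves to a plain Base64 string and that the image is a PNG (the documented default type).

```js
// Hypothetical helper: captures `url` via the documented base64() method and
// wraps the result in a data URI.
// Assumptions: base64() resolves to a plain Base64 string, and the image is a
// PNG (the documented default type).
async function captureAsDataUri(capturer, url, options = {}) {
  const encoded = await capturer.base64(url, options)
  return `data:image/png;base64,${encoded}`
}

// Usage with a capturer created as in the Usage section:
//   const uri = await captureAsDataUri(capturer, 'https://github.com/b4dnewz')
//   // e.g. embed it in HTML: `<img src="${uri}">`
```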
---
## Capture Options
#### type
Type: `String`
Default value: `png`
Possible values: `png, jpeg, pdf, html, buffer, base64`

Select the capture output type.
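
To capture the same page in several output types, the `type` option can be set per call. The sketch below is one possible approach under stated assumptions: `captureInTypes` and `CAPTURE_TYPES` are hypothetical names, the list of types is copied from the values above, and the capturer is expected to expose `capture(input, options)` as documented.

```js
// Output types listed in the documentation above.
const CAPTURE_TYPES = ['png', 'jpeg', 'pdf', 'html', 'buffer', 'base64']

// Hypothetical helper: captures `url` once per requested type, sequentially,
// rejecting unknown type names before any capture starts.
async function captureInTypes(capturer, url, types) {
  const unknown = types.filter(type => !CAPTURE_TYPES.includes(type))
  if (unknown.length > 0) {
    throw new TypeError(`Unsupported capture type(s): ${unknown.join(', ')}`)
  }
  const results = []
  for (const type of types) {
    results.push(await capturer.capture(url, { type }))
  }
  return results
}

// Usage with a capturer created as in the Usage section:
//   await captureInTypes(capturer, 'https://github.com/b4dnewz', ['png', 'pdf'])
```

Running the captures one after another keeps the example simple; whether concurrent captures on one instance are safe is not stated here, so the sketch does not assume it.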
#### timeout
Type: `Number`
The timeout to use when capturing input.
#### headers
Type: `Object`
An object of headers to use when capturing input.
```js
await capturer.capture('https://github.com/b4dnewz/webpage-capture', {
  headers: {
    'x-powered-by': 'webpage-capture'
  }
})
```

#### waitFor
Type: `Number`
Wait for the specified time (in milliseconds) before capturing the page.
#### waitUntil
Type: `String`
Wait until the given selector is visible in the page.
#### fullPage
Type: `Boolean`
Capture the full scrollable page, not just the visible viewport.
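
The `waitFor`, `waitUntil` and `fullPage` options have no example above, so here is a minimal sketch of how they might be combined. `readyPageOptions` is a hypothetical helper, the `#readme` selector is illustrative, and it is an assumption that the library accepts `waitFor` and `waitUntil` in the same call.

```js
// Hypothetical helper: builds capture options that wait for a selector (plus
// an optional fixed delay in ms) and then capture the whole scrollable page.
function readyPageOptions(selector, delayMs = 0) {
  const options = { waitUntil: selector, fullPage: true }
  if (delayMs > 0) {
    options.waitFor = delayMs
  }
  return options
}

// Usage with a capturer created as in the Usage section:
//   await capturer.capture(
//     'https://github.com/b4dnewz/webpage-capture',
//     readyPageOptions('#readme', 500)
//   )
console.log(readyPageOptions('#readme', 500))
```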
#### scripts
Type: `String[]`
Inject scripts into the page.
```js
await capturer.capture('https://github.com/b4dnewz/webpage-capture', {
  scripts: [
    'http://example.com/some/remote/file.js',
    './local-file.js',
    'document.body.style.backgroundColor = "hotpink";'
  ]
})
```

#### styles
Type: `String[]`
Inject styles into the page.
```js
await capturer.capture('https://github.com/b4dnewz/webpage-capture', {
  styles: [
    'http://example.com/some/remote/file.css',
    './local-file.css',
    // CSS color keywords are unquoted; a quoted "hotpink" would be ignored
    'body { background-color: hotpink; }'
  ]
})
```

#### viewport
Type: `String, String[]`
One or more viewports to capture.
```js
// capture single viewport
await capturer.capture('https://github.com/b4dnewz/webpage-capture', {
  viewport: 'nexus-5'
})

// capture multiple viewports
await capturer.capture('https://github.com/b4dnewz/webpage-capture', {
  viewport: ['nexus-5', 'nexus-10']
})
```

#### viewportCategory
Type: `String`
Default value: `false`

Capture all viewports that match the category name.
```js
// capture all mobile viewports
await capturer.capture('https://github.com/b4dnewz/webpage-capture', {
  viewportCategory: 'mobile'
})
```

---
## License
MIT © [Filippo Conti](LICENSE)
## Contributing
1. Create an issue and describe your idea
2. Fork the project
3. Create your feature branch (`git checkout -b my-new-feature`)
4. Commit your changes (`git commit -am 'Add some feature'`)
5. Write some tests and run them (`npm run test`)
6. Publish the branch (`git push origin my-new-feature`)
7. Create a new Pull Request

[npm-image]: https://badge.fury.io/js/webpage-capture.svg
[npm-url]: https://npmjs.org/package/webpage-capture
[travis-image]: https://travis-ci.org/b4dnewz/webpage-capture.svg?branch=master
[travis-url]: https://travis-ci.org/b4dnewz/webpage-capture
[daviddm-image]: https://david-dm.org/b4dnewz/webpage-capture.svg?theme=shields.io
[daviddm-url]: https://david-dm.org/b4dnewz/webpage-capture
[coveralls-image]: https://coveralls.io/repos/b4dnewz/webpage-capture/badge.svg
[coveralls-url]: https://coveralls.io/r/b4dnewz/webpage-capture