https://github.com/rickypc/puppeteer-page-pool

A Page resource pool for Puppeteer.

[![Version](https://img.shields.io/npm/v/puppeteer-page-pool)](https://bit.ly/2KHg9rB)
[![Downloads](https://img.shields.io/npm/dt/puppeteer-page-pool)](https://bit.ly/2KHg9rB)
[![Dependency Status](https://img.shields.io/david/rickypc/puppeteer-page-pool)](https://bit.ly/2Tjx5YT)
[![Dev Dependency Status](https://img.shields.io/david/dev/rickypc/puppeteer-page-pool)](https://bit.ly/2YBkjev)
[![Code Style](https://img.shields.io/badge/code%20style-Airbnb-red)](https://bit.ly/2JYN1gk)
[![Build](https://img.shields.io/travis/rickypc/puppeteer-page-pool)](https://bit.ly/2YSR7il)
[![Coverage](https://img.shields.io/codecov/c/github/rickypc/puppeteer-page-pool)](https://bit.ly/2TmdMhN)
[![Vulnerability](https://img.shields.io/snyk/vulnerabilities/github/rickypc/puppeteer-page-pool)](https://bit.ly/2YLUoRq)
[![Dependabot](https://api.dependabot.com/badges/status?host=github&repo=rickypc/puppeteer-page-pool)](https://bit.ly/2KIM5vs)
[![License](https://img.shields.io/npm/l/puppeteer-page-pool)](https://bit.ly/2yi7gyO)

Puppeteer Page Pool
===================

A [Page](https://bit.ly/2Z2NKFK) resource [pool](https://bit.ly/2ZNrNav) for [Puppeteer](https://bit.ly/2KqtMwd). It can be used to reuse or throttle usage of the Puppeteer Page resource.

Installation
-

```bash
$ npm install --save puppeteer-page-pool
```

API Reference
-
Provides a Puppeteer Page resource pool.

**See**

- [Pool Options](https://bit.ly/2GXZbUR)
- [Puppeteer Options](https://bit.ly/2M6kVCd)

**Example**
```js
// use PagePool directly.
const PagePool = require('puppeteer-page-pool');

// Instantiate PagePool with default options.
const pagePool = new PagePool();
// Launch the browser and proceed with pool creation.
await pagePool.launch();
// Acquire and release the page seamlessly.
await pagePool.process(async (page) => {
  // Any page actions...
  await page.goto('https://angular.io');
});
// All done.
await pagePool.destroy();
```
**Example**
```js
// create subclass as a child of PagePool.
class MyPagePool extends PagePool {
  constructor (options) {
    super(options);
    this.mine = true;
  }

  async takeOff () {
    // Launch the browser and proceed with pool creation.
    await this.launch();
    // Acquire and release the page seamlessly.
    await this.process(async (page) => {
      // Any page actions...
      await page.goto('https://angular.io');
    });
    // All done.
    await this.destroy();
  }
}

// Instantiate MyPagePool with default options.
const myPagePool = new MyPagePool();
// Custom action.
await myPagePool.takeOff();
```
**Example**
```js
// use a different puppeteer library.
const puppeteer = require('puppeteer-extra');
// See https://bit.ly/32X27uf
puppeteer.use(require('puppeteer-extra-plugin-angular')());
const customPagePool = new MyPagePool({
  puppeteer,
});
// Custom action.
await customPagePool.takeOff();
```
**Example**
```js
// instantiate with customized options.
const optionsPagePool = new MyPagePool({
  // See factory section of https://github.com/coopernurse/node-pool#createPool
  async onPageCreated (page) {
    // Bound function that will be called after page is created.
  },
  async onPageDestroy (page) {
    // Bound function that will be called right before page is destroyed.
  },
  async onValidate (page) {
    // Bound function that will be called to validate the validity of the page.
  },
  // See opts section of https://bit.ly/2GXZbUR
  poolOptions: {
    log: true,
  },
  puppeteer,
  // See https://bit.ly/2M6kVCd
  puppeteerOptions: {
    // I want to see all the actions :)
    headless: false,
  },
});
// Custom action.
await optionsPagePool.takeOff();
```
**Example**
```js
// parallel processes.
const parallelPagePool = new PagePool({
  // See opts section of https://bit.ly/2GXZbUR
  poolOptions: {
    max: 3,
  },
  puppeteer,
  // See https://bit.ly/2M6kVCd
  puppeteerOptions: {
    headless: false,
  },
});
// Launch the browser and proceed with pool creation.
await parallelPagePool.launch();

const promises = [
  'https://angular.io',
  'https://www.chromium.org',
  'https://santatracker.google.com',
].map((url) => {
  // Acquire and release the page seamlessly.
  return parallelPagePool.process(async (page, data) => {
    // Navigate to given Url and wait until Angular is ready
    // if it's an angular page.
    await page.navigateUntilReady(data.url);
    await page.screenshot({
      fullPage: true,
      path: `${data.url.replace(/https?:|\//g, '')}-screenshot.png`,
    });
  }, { url });
});

// Wait until it's all done.
await Promise.all(promises);

// All done.
await parallelPagePool.destroy();
```
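To illustrate what a limit such as `max: 3` implies for the parallel example above, here is a minimal, self-contained sketch of the throttling idea with no Puppeteer involved. `runThrottled` and `demo` are hypothetical names written for this illustration; they are not part of puppeteer-page-pool and do not reflect how the library implements pooling internally.

```js
// Hypothetical helper: run `tasks` (functions returning promises) with at
// most `max` of them in flight at any time. Illustration only.
async function runThrottled(tasks, max) {
  const results = new Array(tasks.length);
  let next = 0;

  // Each worker repeatedly claims the next unclaimed task index.
  async function worker() {
    while (next < tasks.length) {
      const index = next;
      next += 1;
      results[index] = await tasks[index]();
    }
  }

  const workers = Array.from({ length: Math.min(max, tasks.length) }, worker);
  await Promise.all(workers);
  return results;
}

// Usage: count overlapping tasks to observe the concurrency limit.
async function demo() {
  let active = 0;
  let peak = 0;
  const tasks = [1, 2, 3, 4, 5, 6].map((n) => async () => {
    active += 1;
    peak = Math.max(peak, active);
    await new Promise((resolve) => setTimeout(resolve, 10));
    active -= 1;
    return n * 2;
  });
  const results = await runThrottled(tasks, 3);
  return { peak, results };
}
```

With six tasks and a limit of three, no more than three tasks overlap; the remaining ones wait for a free slot, which is the behavior a bounded page pool is meant to provide.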

* [puppeteer-page-pool](#module_puppeteer-page-pool)
    * [PagePool](#exp_module_puppeteer-page-pool--PagePool) ⏏
        * [new PagePool(options)](#new_module_puppeteer-page-pool--PagePool_new)
        * _instance_
            * [.destroy()](#module_puppeteer-page-pool--PagePool+destroy) ⇒ null
            * [.launch()](#module_puppeteer-page-pool--PagePool+launch)
            * [.process(handler, ...args)](#module_puppeteer-page-pool--PagePool+process)
        * _inner_
            * [~PoolEventHandler](#module_puppeteer-page-pool--PagePool..PoolEventHandler) : function
            * [~Options](#module_puppeteer-page-pool--PagePool..Options) : Object
            * [~ActionHandler](#module_puppeteer-page-pool--PagePool..ActionHandler) : function

### PagePool ⏏
**Kind**: Exported class
**See**

- [Pool Options](https://bit.ly/2GXZbUR)
- [Puppeteer Options](https://bit.ly/2M6kVCd)

#### new PagePool(options)
Create a new PagePool instance.

| Param | Type | Description |
| --- | --- | --- |
| options | Options | PagePool options. |

**Example**
```js
const PagePool = require('puppeteer-page-pool');
const pagePool = new PagePool({});
```

#### pagePool.destroy() ⇒ null
Close and release all page resources, as well as clean up after itself.

**Kind**: instance method of [PagePool](#exp_module_puppeteer-page-pool--PagePool)
**Returns**: null - Null value.
**Example**
```js
let pagePool = new PagePool();
pagePool = await pagePool.destroy();
```

#### pagePool.launch()
Launch the browser and create all page resources.

**Kind**: instance method of [PagePool](#exp_module_puppeteer-page-pool--PagePool)
**Example**
```js
const pagePool = new PagePool();
await pagePool.launch();
```
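Because `launch()` and `destroy()` bracket the pool's lifetime, it can help to pair them so that cleanup runs even when the work in between fails. The sketch below shows one way to do that; `withPagePool` is a hypothetical helper, not part of this module, and it only assumes an object exposing the `launch()` and `destroy()` methods documented here.

```js
// Hypothetical helper: launch the pool, run `work`, and always destroy it.
async function withPagePool(pagePool, work) {
  await pagePool.launch();
  try {
    return await work(pagePool);
  } finally {
    // Runs on success and on failure, so the browser is not left open.
    await pagePool.destroy();
  }
}

// Usage (assumes puppeteer-page-pool is installed and an async context):
// const PagePool = require('puppeteer-page-pool');
// await withPagePool(new PagePool(), (pool) => pool.process(async (page) => {
//   await page.goto('https://angular.io');
// }));
```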

#### pagePool.process(handler, ...args)
Process the given arguments using the provided handler, which is executed with a page resource from the pool.

**Kind**: instance method of [PagePool](#exp_module_puppeteer-page-pool--PagePool)

| Param | Type | Description |
| --- | --- | --- |
| handler | ActionHandler | Action handler. |
| ...args | \* | Action handler arguments. |

**Example**
```js
const args = { key: 'value' };
const pagePool = new PagePool();
await pagePool.process((page, data) => {}, args);
```
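The examples earlier in this README describe `process` as acquiring and releasing the page seamlessly. A minimal sketch of that acquire/run/release pattern follows. `processWith` and `createStubPool` are hypothetical, and the stub only assumes `acquire()` and `release()` methods in the style of generic-pool; this is not the library's actual implementation.

```js
// Hypothetical sketch: borrow a resource, run the handler, always give it back.
async function processWith(pool, handler, ...args) {
  const page = await pool.acquire();
  try {
    return await handler(page, ...args);
  } finally {
    // Release even if the handler throws, so the pool is not drained.
    await pool.release(page);
  }
}

// Stub pool standing in for a real page pool (assumption for illustration).
function createStubPool() {
  const log = [];
  return {
    log,
    async acquire() {
      log.push('acquire');
      return { id: 1 };
    },
    async release() {
      log.push('release');
    },
  };
}
```

Releasing in a `finally` block is the design choice that matters here: a handler that throws would otherwise leave a page checked out and reduce the pool's capacity.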

#### PagePool~PoolEventHandler : function
Pool factory event handler.

**Kind**: inner typedef of [PagePool](#exp_module_puppeteer-page-pool--PagePool)

| Param | Type | Description |
| --- | --- | --- |
| page | Object | The page resource. |

**Example**
```js
const poolEventHandler = (page) => {
  // Do something...
};
```

#### PagePool~Options : Object
PagePool instantiation options.

**Kind**: inner typedef of [PagePool](#exp_module_puppeteer-page-pool--PagePool)
**See**

- [Pool Options](https://bit.ly/2GXZbUR)
- [Puppeteer Options](https://bit.ly/2M6kVCd)

| Param | Type | Default | Description |
| --- | --- | --- | --- |
| [onPageDestroy] | PoolEventHandler | | The function that is called right before a page is destroyed. |
| [onPageCreated] | PoolEventHandler | | The function that is called after a page is created. |
| [onValidate] | PoolEventHandler | | The function that is called to validate a page resource. |
| [poolOptions] | Object | {} | The pool instantiation options. See https://bit.ly/2GXZbUR |
| [puppeteer] | Object | require('puppeteer') | The Puppeteer library to use. |
| [puppeteerOptions] | Object | {} | Puppeteer launch options. See https://bit.ly/2M6kVCd |

**Example**
```js
const options = {
  async onPageDestroy (page) {},
  async onPageCreated (page) {},
  async onValidate (page) {},
  poolOptions: {},
  puppeteer,
  puppeteerOptions: {},
};
```
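As an illustration of what an `onValidate` handler might check, the sketch below treats a page as valid while it is still open. It uses Puppeteer's `page.isClosed()` and assumes the handler should resolve to a boolean, as the `validate` factory hook in generic-pool does. Whether PagePool forwards the return value in exactly this way is an assumption that this README does not confirm; `validatingOptions` is a name chosen for this example.

```js
// Assumption: resolving to `false` marks the page as unusable.
const validatingOptions = {
  async onValidate (page) {
    // `page.isClosed()` is part of Puppeteer's Page API.
    return !page.isClosed();
  },
  poolOptions: {
    // generic-pool only validates resources when a test* option is enabled.
    testOnBorrow: true,
  },
};
```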

#### PagePool~ActionHandler : function
Action handler function that is executed with page resource from the pool.

**Kind**: inner typedef of [PagePool](#exp_module_puppeteer-page-pool--PagePool)

| Param | Type | Description |
| --- | --- | --- |
| page | Object | The page resource. |
| ...args | \* | Action handler arguments. |

**Example**
```js
const actionHandler = (page, ...args) => {
  // Do something...
};
```

Development Dependencies
-
You will need to install [Node.js](https://bit.ly/2SMCGXK) as a local development dependency. The `npm` package manager comes bundled with all recent releases of `Node.js`.

`npm install` will attempt to resolve any `npm` module dependencies that have been declared in the project’s `package.json` file, installing them into the `node_modules` folder.

```bash
$ npm install
```

Run Linter
-
To make sure the code follows the project's code style best practices, run:

```bash
$ npm run lint
```

Run Unit Tests
-
To make sure we did not break anything, let's run:

```bash
$ npm test
```

Contributing
-
If you would like to contribute code to the Puppeteer Page Pool repository, you can do so
through GitHub by forking the repository and sending a pull request.

If you do not agree to the [Contribution Agreement](CONTRIBUTING.md), do not
contribute any code to the Puppeteer Page Pool repository.

When submitting code, please make every effort to follow existing conventions
and style in order to keep the code as readable as possible. Please also include
appropriate test cases.

That's it! Thank you for your contribution!

License
-
Copyright (c) 2018 - 2020 Richard Huang.

This module is free software, licensed under: [GNU Affero General Public License (AGPL-3.0)](https://bit.ly/2yi7gyO).

Documentation and other similar content are provided under [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License](https://bit.ly/2SMCRlS).