Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/johansatge/psi-report
Crawls a website, gets PageSpeed Insights data for each page, and exports an HTML report.
- Host: GitHub
- URL: https://github.com/johansatge/psi-report
- Owner: johansatge
- License: mit
- Created: 2015-11-14T11:55:19.000Z (about 9 years ago)
- Default Branch: master
- Last Pushed: 2018-01-23T12:16:59.000Z (almost 7 years ago)
- Last Synced: 2024-10-22T13:28:58.338Z (2 months ago)
- Topics: cli, crawler, html-report, pagespeed-insights
- Language: JavaScript
- Homepage:
- Size: 603 KB
- Stars: 11
- Watchers: 4
- Forks: 4
- Open Issues: 3
Metadata Files:
- Readme: readme.md
- License: license.md
Awesome Lists containing this project
README
![Version](https://img.shields.io/npm/v/psi-report.svg)
![Downloads](https://img.shields.io/npm/dm/psi-report.svg)
![Dependencies](https://img.shields.io/david/johansatge/psi-report.svg)
![devDependencies](https://img.shields.io/david/dev/johansatge/psi-report.svg)

# psi-report
Crawls a website (or gets URLs from a sitemap.xml or a file), fetches [PageSpeed Insights](https://developers.google.com/speed/pagespeed/insights/) data for each page, and exports an HTML report.
![](screenshot.png)
---
* [Installation](#installation)
* [CLI usage](#cli-usage)
* [Programmatic usage](#programmatic-usage)
* [Crawler behavior](#crawler-behavior)
* [Changelog](#changelog)
* [License](#license)
* [Credits](#credits)

## Installation
Install with [npm](https://www.npmjs.com/):
```bash
$ npm install psi-report --global
# --global isn't required if you plan to use the node module
```

## CLI usage
```bash
$ psi-report [options]
```

Options:
```bash
-V, --version output the version number
--urls-from-sitemap [name] Get the list of URLs from sitemap.xml (don't crawl)
--urls-from-file [name] Get the list of URLs from a file, one URL per line (don't crawl)
-h, --help output usage information
```

Example:
```bash
$ psi-report daringfireball.net/projects/markdown /Users/johan/Desktop/report.html
```

## Programmatic usage
```javascript
// Basic usage
var PSIReport = require('psi-report');
var psi_report = new PSIReport({baseurl: 'http://domain.org'}, onComplete);
psi_report.start();

function onComplete(baseurl, data, html)
{
console.log('Report for: ' + baseurl);
console.log(data); // An array of pages with their PSI results
console.log(html); // The HTML report (as a string)
}

// The "fetch_url" and "fetch_psi" events allow you to monitor the crawling process
psi_report.on('fetch_url', onFetchURL);
function onFetchURL(error, url)
{
console.log((error ? 'Error with URL: ' : 'Fetched URL: ') + url);
}

psi_report.on('fetch_psi', onFetchPSI);
function onFetchPSI(error, url, strategy)
{
console.log((error ? 'Error with PSI for ' : 'PSI data (' + strategy + ') fetched for ') + url);
}
```

## Crawler behavior
The base URL is used as a root when crawling the pages.
For instance, using the URL `https://daringfireball.net/` will crawl the entire website.
However, `https://daringfireball.net/projects/markdown/` will crawl only:
* `https://daringfireball.net/projects/markdown/`
* `https://daringfireball.net/projects/markdown/basics`
* `https://daringfireball.net/projects/markdown/syntax`
* `https://daringfireball.net/projects/markdown/license`
* And so on

*This may be useful to crawl only one part of a website: everything starting with `/en`, for instance.*
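The prefix rule above can be sketched in a few lines (a hypothetical `shouldCrawl` helper for illustration only, not the module's actual internals):

```javascript
// Hypothetical helper: a discovered link is crawled only if it
// starts with the base URL the report was launched with.
function shouldCrawl(baseurl, url) {
  return url.indexOf(baseurl) === 0;
}

var base = 'https://daringfireball.net/projects/markdown/';
console.log(shouldCrawl(base, 'https://daringfireball.net/projects/markdown/syntax')); // true
console.log(shouldCrawl(base, 'https://daringfireball.net/linked/')); // false
```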
## URLs from a sitemap.xml or a file
Instead of crawling the website, you can set the URL list with a sitemap.xml or a file.
* `--urls-from-sitemap https://example.com/sitemap.xml`
* `--urls-from-file /path/to/urls.txt`

Only the URLs from the given sitemap or file will be processed.
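Per the `2.2.0` changelog entry below, URLs read from the file get the base URL prepended when it is missing. A minimal sketch of that normalization (a hypothetical `normalizeUrls` helper illustrating the assumed behavior, not the module's actual code):

```javascript
// Hypothetical normalization: entries from urls.txt that don't
// already start with the base URL get it prepended.
function normalizeUrls(baseurl, lines) {
  return lines.map(function (line) {
    return line.indexOf(baseurl) === 0 ? line : baseurl + line;
  });
}

var urls = normalizeUrls('https://example.com', [
  '/about',
  'https://example.com/contact'
]);
console.log(urls); // ['https://example.com/about', 'https://example.com/contact']
```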
## Changelog
This project uses [semver](http://semver.org/).
| Version | Date | Notes |
| --- | --- | --- |
| `2.2.1` | 2018-01-19 | Fix missing source files on NPM (@blaryjp) |
| `2.2.0` | 2017-11-27 | Prepend baseurl if not present, for each URL in file (@blaryjp) |
| `2.1.0` | 2017-11-19 | Add `--urls-from-sitemap` and `--urls-from-file` (@blaryjp) |
| `2.0.0` | 2016-04-02 | Deep module rewrite (New module API, updated CLI usage) |
| `1.0.1` | 2016-01-15 | Fix call on obsolete package |
| `1.0.0` | 2015-12-01 | Initial version |

## License
This project is released under the [MIT License](license.md).
## Credits
* [async](https://github.com/caolan/async)
* [colors](https://github.com/Marak/colors.js)
* [request](https://github.com/request/request)
* [crawler](https://github.com/sylvinus/node-crawler)
* [commander](https://github.com/tj/commander.js)
* [sitemapper](https://github.com/hawaiianchimp/sitemapper)