{"id":20499982,"url":"https://github.com/devopsgroup-io/siteshooter","last_synced_at":"2025-04-13T18:52:05.691Z","repository":{"id":57362045,"uuid":"54218489","full_name":"devopsgroup-io/siteshooter","owner":"devopsgroup-io","description":":camera: Automate full website screenshots and PDF generation with multiple viewport support.","archived":false,"fork":false,"pushed_at":"2019-05-15T19:46:11.000Z","size":508,"stargazers_count":67,"open_issues_count":8,"forks_count":13,"subscribers_count":7,"default_branch":"master","last_synced_at":"2024-04-14T10:03:01.774Z","etag":null,"topics":["pdf-generation","phantomjs","salesforce","screenshot","seo","sitemap","web-crawler"],"latest_commit_sha":null,"homepage":"https://devopsgroup.io","language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mpl-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/devopsgroup-io.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2016-03-18T17:10:28.000Z","updated_at":"2024-04-12T18:09:22.000Z","dependencies_parsed_at":"2022-09-26T16:40:33.903Z","dependency_job_id":null,"html_url":"https://github.com/devopsgroup-io/siteshooter","commit_stats":null,"previous_names":[],"tags_count":73,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/devopsgroup-io%2Fsiteshooter","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/devopsgroup-io%2Fsiteshooter/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/devopsgroup-io%2Fsiteshooter/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/devopsgroup-io%2Fsiteshooter/ma
nifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/devopsgroup-io","download_url":"https://codeload.github.com/devopsgroup-io/siteshooter/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248765984,"owners_count":21158296,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["pdf-generation","phantomjs","salesforce","screenshot","seo","sitemap","web-crawler"],"created_at":"2024-11-15T18:19:24.750Z","updated_at":"2025-04-13T18:52:05.672Z","avatar_url":"https://github.com/devopsgroup-io.png","language":"JavaScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Siteshooter \n\u003cimg src=\"https://cdn.rawgit.com/devopsgroup-io/siteshooter/master/siteshooter.svg\" alt=\"Siteshooter\" width=\"100\" \u003e\n\n[![NPM version](https://img.shields.io/npm/v/siteshooter.svg)](https://www.npmjs.com/package/siteshooter) [![Build Status](https://img.shields.io/travis/devopsgroup-io/siteshooter.svg?branch=master)](https://travis-ci.org/devopsgroup-io/siteshooter)\n[![dependencies](https://david-dm.org/devopsgroup-io/siteshooter.svg)](https://david-dm.org/devopsgroup-io/siteshooter#info=dependencies\u0026view=tables)\n[![](https://img.shields.io/twitter/follow/devopsgroup_io.svg?style=social\u0026label=@devopsgroup_io)](https://twitter.com/devopsgroup_io)\n\n\u003e Automate full website screen shots and PDF generation with multiple view port support\n\n### Features\n\n* Crawls specified host and generates a `sitemap.xml` on the fly\n* Generates entire website screen shots based on 
`sitemap.xml`\n* Define multiple view ports\n* Automated PDF generation\n* Includes crawled metadata in generated PDF\n* Reports on broken website links (HTTP 404 response)\n* Supports [HTTP basic authentication](https://en.wikipedia.org/wiki/Basic_access_authentication)\n* Supports Microsoft Online 3-step authentication\n* Supports [Salesforce Visualforce](https://developer.salesforce.com/page/Visualforce) 3-step authentication\n* Supports site maps with HTTP, HTTPS, and FTP protocol URLs\n* Follows HTTP 301 redirects\n* [Custom JavaScript inject file](#custom-javascript-inject-file) - injects into page prior to screen shooting\n* Trigger page events by passing querystring values to custom inject.js file\n\n---\n\u003e##### Do you need a website and workflow management platform?\n\u003e \u003cimg src=\"https://cdn.rawgit.com/devopsgroup-io/catapult/master/repositories/apache/_default_/svg/catapult.svg\" alt=\"Catapult website and workflow management platform\" width=\"30\"\u003e **[Give Catapult a shot](https://github.com/devopsgroup-io/catapult)**\n---\n\n**In This Documentation**\n\n1. [Getting Started](#getting-started)\n2. [Siteshooter Configuration File](#create-a-siteshooter-configuration-file)\n3. [CLI Options](#cli-options)\n4. [Tests](#tests)\n5. [Troubleshooting](#troubleshooting)\n\n## Getting Started ##\n\n#### Dependencies\n\nInstall the following prerequisite on your development machine:\n\n* [Node.js - **version \u003e= 6.0.0**](http://nodejs.org)\n\n#### Notable npm Modules\n\n* [PDFKit](https://github.com/devongovett/pdfkit)\n* [PhantomJS](https://github.com/ariya/phantomjs)\n* [Simple Web Crawler](https://github.com/cgiffard/node-simplecrawler)\n\n\n### Quick Start\n```\n$ npm install siteshooter --global\n```\nIf siteshooter is installed, make sure you have the latest version by running:\n```\n$ npm update siteshooter --global\n```\n* You may need to run these commands with elevated privileges, e.g. 
`sudo`, you will be prompted to do so if needed.\n* Installing with the `--global` flag affords you the `siteshooter` command on your machine's command line at any path.\n* Read more about the `--global` flag [here](https://docs.npmjs.com/files/folders).\n\n### Create a Siteshooter Configuration File ###\n```\n$ siteshooter --init\n```\n\n### Update Siteshooter Configuration File\n\n[View the full siteshooter.yml example](https://github.com/devopsgroup-io/siteshooter/tree/master/siteshooter.yml)\n\nInside `siteshooter.yml`, add additional options. \n\n* All [Simple Web Crawler options](https://github.com/cgiffard/node-simplecrawler#configuration) can be added to `sitecrawler_options` and will pass through to the crawler process\n* Generated screenshot image files are optimized using [imagemin](https://www.npmjs.com/package/imagemin) and [imagemin-pngquant](https://www.npmjs.com/package/imagemin-pngquant) modules, which reduce the overall size of generated PDFs. To adjust the [image quality](https://www.npmjs.com/package/imagemin-pngquant#quality), update the **image_quality** option in your siteshooter.yml file.\n\n\n\n```yml\ndomain:\n  name: https://www.devopsgroup.io\n  auth:\n    user:\n    pwd:\n\npdf_options:\n excludeMeta: true\n\nscreenshot_options:\n  delay: 2000\n  image_quality: '60-80'\n  transparent_background: false\n\nsitecrawler_options:\n  exclude:\n   - \"pdf\"\n  stripQuerystring: false\n  ignoreInvalidSSL: true\n\nviewports:\n - viewport: desktop-large\n   width: 1600\n   height: 1200\n - viewport: tablet-landscape\n   width: 1024\n   height: 768\n - viewport: iPhone5\n   width: 320\n   height: 568\n - viewport: iPhone6\n   width: 375\n   height: 667\n\n```\n\n## CLI Options\n\n```bash\n\n$ siteshooter --help\n\nUsage: siteshooter [options]\n\nOPTIONS\n_______________________________________________________________________________________\n-c --config            Show configuration\n-C --cwd               Set working directory, which will load a 
siteshooter.yml file in the specified path\n-e --debug             Output exceptions\n-h --help              Print this help\n-i --init              Create siteshooter.yml template file in working directory\n-p --pdf               Generate PDFs, by defined view ports, based on screen shots created via Siteshooter\n-q --quiet             Only return final output\n-s --screenshots       Generate screen shots, by view ports, based on sitemap.xml file\n-S --sitemap           Crawl domain name specified in siteshooter.yml file and generate a local sitemap.xml file\n-v --version           Print version number\n-V --verbose           Verbose output\n-w --website           Report on website information based on Siteshooter crawled results\n```\n\nWhen `siteshooter` is run without any options, the following options run in order by default:\n\n* `--sitemap`\n* `--screenshots`\n* `--pdf`\n\n\n\n### Custom JavaScript Inject File\n\nTo manipulate the DOM before the screen shot process, add an `inject.js` file in the same working directory as `siteshooter.yml`.\n\n**Example:** inject.js file\n\n```javascript\n\n/**\n * @file:            inject.js\n * @description:     used to inject custom JavaScript into a web page prior to a screen shot. 
\n */\n\nconsole.log('JavaScript injected into page.');\n\nif ( typeof(jQuery) !== \"undefined\" ) {\n\n    jQuery(document).ready(function() {\n        console.log('jQuery loaded.');\n    });\n}\n```\n\n#### Trigger JavaScript Events\n\nWhen using the optional `inject.js` file, events can be triggered through the **pevent** querystring parameter.\n\n```xml\n\n\u003c!-- Add URL with pevent querystring parameter in the generated sitemap.xml --\u003e\n\u003curl\u003e\n    \u003cloc\u003ehttps://www.devopsgroup.io?pevent=open-privacy-overlay\u003c/loc\u003e\n    \u003cchangefreq\u003eweekly\u003c/changefreq\u003e\n\u003c/url\u003e\n```\n\n**Example:** Event detection \u0026 triggering\n\n```javascript\n/**\n * @file:            inject.js\n * @description:     used to inject custom JavaScript into a web page prior to a screen shot. \n */\n\n\nfunction getQueryVariable(variable) {\n    var query = window.location.search.substring(1);\n    var vars = query.split('\u0026');\n    for (var i = 0; i \u003c vars.length; i++) {\n        var pair = vars[i].split('=');\n        if (decodeURIComponent(pair[0]) === variable) {\n            return decodeURIComponent(pair[1]);\n        }\n    }\n}\n\nif ( typeof(jQuery) !== \"undefined\" ) {\n\n    jQuery(document).ready(function() {\n        var pageName = window.location.pathname.replace('/', ''),\n            pageEvent = getQueryVariable('pevent');\n\n        console.log('document ready.');\n        console.log('userAgent', navigator.userAgent);\n        console.log('Page: ', pageName);\n        console.log('Event: ', pageEvent);\n\n        switch (pageName) {\n\n            // home\n            case '':\n\n                switch (pageEvent) {\n                    case 'open-privacy-overlay':\n\n                        jQuery('a[data-target~=\"#modal-privacy\"]').trigger('click');\n                        break;\n                }\n\n                break;\n        }\n\n    });\n}\n```\n\n## Tests\n\nTests are written with 
[Mocha](https://github.com/mochajs/mocha) and can be run with `npm test`.\n\n## Troubleshooting\n\nIf you're having issues with Siteshooter, [submit a GitHub Issue](https://github.com/devopsgroup-io/siteshooter/issues/new).\n\n* Make sure you have a `siteshooter.yml` file in your working directory and the [YAML file is well formatted](http://www.yamllint.com/)\n* Experiencing font-loading issues? Try increasing the `delay` setting in your `siteshooter.yml` file\n\n```yml\nscreenshot_options:\n  delay: 2000\n```\n\n* Trying to take a screenshot of a page with a video? Unfortunately, [PhantomJS does not support videos](http://phantomjs.org/supported-web-standards.html). As such, here's one approach to showing a video's poster image. \n\n```javascript\n\n/**\n * @file:            inject.js\n * @description:     used to display a video's poster image\n */\n\nif( jQuery('video').length \u003e0 ){\n    jQuery('video').parent().prepend('\u003cimg src=\"'+jQuery('video').attr('poster')+'\"/\u003e');\n    jQuery('video').remove();\n}\n```\n\n* SimpleCrawler TypeError: The header content contains invalid characters\n    * Try setting the `acceptCookies` option to false\n\n```yml\nsitecrawler_options:\n  acceptCookies: false\n```\n\n## Code of Conduct\n\nTake a moment to read our [Code of Conduct](CODE_OF_CONDUCT.md).\n\n## Contributing to the project\n\nWe are always looking for quality contributions! Please check the [CONTRIBUTING.md](CONTRIBUTING.md) for contribution guidelines.\n\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdevopsgroup-io%2Fsiteshooter","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdevopsgroup-io%2Fsiteshooter","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdevopsgroup-io%2Fsiteshooter/lists"}