https://github.com/cburgdorf/grunt-html-snapshot

A grunt task that takes html snapshots from websites. Useful to make ajax sites crawlable

Host: GitHub
URL: https://github.com/cburgdorf/grunt-html-snapshot
Owner: cburgdorf
License: mit
Archived: true
Created: 2013-04-30T13:11:16.000Z (almost 12 years ago)
Default Branch: master
Last Pushed: 2016-10-10T16:24:23.000Z (over 8 years ago)
Last Synced: 2024-08-19T11:32:54.316Z (5 months ago)
Language: JavaScript
Size: 330 KB
Stars: 230
Watchers: 13
Forks: 45
Open Issues: 20
Metadata Files:
- Readme: README.md
- License: LICENSE.md


# grunt-html-snapshot

> Makes it easy to provide html snapshots for client side applications so that they can be indexed by web crawlers

## Getting Started

This plugin requires Grunt `~0.4.0`.

If you haven't used [Grunt](http://gruntjs.com/) before, be sure to check out the [Getting Started](http://gruntjs.com/getting-started) guide, as it explains how to create a [Gruntfile](http://gruntjs.com/sample-gruntfile) as well as install and use Grunt plugins. Once you're familiar with that process, you may install this plugin with this command:

```shell
npm install grunt-html-snapshot --save-dev
```

Once the plugin has been installed, it may be enabled inside your Gruntfile with this line of JavaScript:

```js
grunt.loadNpmTasks('grunt-html-snapshot');
```
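
For context, here is a minimal sketch of where that line usually lives. It assumes the standard Grunt convention of a `Gruntfile.js` that exports a function receiving the `grunt` object; the function name `gruntfile` is only illustrative.

```js
// Gruntfile.js: minimal wrapper around the loadNpmTasks call.
// The function name `gruntfile` is illustrative; Grunt only needs the export.
function gruntfile(grunt) {
  // Load the plugin so that the `htmlSnapshot` task becomes available.
  grunt.loadNpmTasks('grunt-html-snapshot');
}

module.exports = gruntfile;
```

Task configuration is added to the same function through `grunt.initConfig`.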

## htmlSnapshot task

_Run this task with the `grunt htmlSnapshot` command._
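
If you prefer a shorter command, Grunt lets you register an alias task. The sketch below assumes that `htmlSnapshot` behaves like a regular Grunt multi-task, so a single target can be addressed as `htmlSnapshot:all`, where `all` is the target name used in this README's configuration. The alias name `snapshots` and the function name are hypothetical.

```js
// Sketch: register a hypothetical `snapshots` alias for the `all` target.
function registerSnapshotAlias(grunt) {
  grunt.loadNpmTasks('grunt-html-snapshot');

  // `grunt snapshots` would then run only the `all` target.
  // Assumption: `htmlSnapshot` is a multi-task, so `task:target` works.
  grunt.registerTask('snapshots', ['htmlSnapshot:all']);
}

module.exports = registerSnapshotAlias;
```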

## Configuring the htmlSnapshot task

```js
grunt.initConfig({
  htmlSnapshot: {
    all: {
      options: {
        // Path where the snapshots should be placed.
        // Empty by default, which means they go into the directory
        // where your Gruntfile.js is located.
        snapshotPath: 'snapshots/',

        // Either the base path to your index.html file or your base URL.
        // Currently the task does not use its own web server, so if your
        // site needs a web server to be fully functional, configure it here.
        sitePath: 'http://localhost:8888/my-website/',

        // Prefix for the snapshot file names; 'snapshot_' by default.
        fileNamePrefix: 'sp_',

        // By default the task waits 500 ms before fetching the HTML,
        // to give the page enough time to assemble itself.
        // If your page needs more time, increase this value.
        msWaitForPages: 1000,

        // Sanitize function used for file names. The default converts '#!/' to '_'.
        // Receives the request URI and must return a sanitized string.
        sanitize: function (requestUri) {
          // Returns 'index.html' if the URI ends with '/';
          // otherwise replaces every '/' with 'prefix-'.
          if (/\/$/.test(requestUri)) {
            return 'index.html';
          } else {
            return requestUri.replace(/\//g, 'prefix-');
          }
        },

        // If you would rather not keep the script tags in the HTML snapshots,
        // set `removeScripts` to true. It is false by default.
        removeScripts: true,

        // To strip link tags, set `removeLinkTags` to true. It is false by default.
        removeLinkTags: true,

        // To strip meta tags, set `removeMetaTags` to true. It is false by default.
        removeMetaTags: true,

        // Replace arbitrary parts of the HTML.
        replaceStrings: [
          {'this': 'will get replaced by this'},
          {'/old/path/': '/new/path'}
        ],

        // Adds a custom attribute to the body element.
        bodyAttr: 'data-prerendered',

        // The list of all URLs that should be fetched.
        urls: [
          '',
          '#!/en-gb/showcase'
        ],

        // A list of cookies to put into the PhantomJS cookie jar for the visited page.
        cookies: [
          {"path": "/", "domain": "localhost", "name": "lang", "value": "en-gb"}
        ],

        // Options for the PhantomJS page object.
        // See http://phantomjs.org/api/webpage/ for available options.
        pageOptions: {
          viewportSize: {
            width: 1200,
            height: 800
          }
        }
      }
    }
  }
});
```
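
To see how `snapshotPath`, `fileNamePrefix`, and `sanitize` might interact, the following sketch builds file names for the two sample URLs. It is an illustration under stated assumptions, not the plugin's actual code: the helper `snapshotFileName`, the `.html` suffix, and the concatenation order are assumptions, and `underscoreSanitize` is a hypothetical function that merely imitates the documented default of turning `#!/` into `_`.

```js
// Hypothetical sanitize function imitating the documented default:
// '#!/' becomes '_', and any remaining '/' also becomes '_'.
function underscoreSanitize(requestUri) {
  return requestUri.replace(/#!\//g, '_').replace(/\//g, '_');
}

// Assumption: the file name is path + prefix + sanitized URL + '.html'.
// The plugin's real naming scheme may differ.
function snapshotFileName(options, url) {
  return options.snapshotPath + options.fileNamePrefix + options.sanitize(url) + '.html';
}

const exampleOptions = {
  snapshotPath: 'snapshots/',
  fileNamePrefix: 'sp_',
  sanitize: underscoreSanitize
};

// The two URLs from the `urls` option in this README.
const fileNames = ['', '#!/en-gb/showcase'].map(function (url) {
  return snapshotFileName(exampleOptions, url);
});

console.log(fileNames);
```

Whatever function you pass as `sanitize`, it is worth checking that it never returns a string containing characters your file system rejects.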

## Release History

- 0.6.1 - Trigger warnings with `grunt.warn(msg, 6)` instead of `grunt.log(msg)`
- 0.6.0 - Provide a function hook for the file name sanitization (by @mrgamer)
- 0.5.0 - Add option to set cookies. Also fixed a bug in scenarios where multiple instances of the task are run in parallel.
- 0.4.0 - Add more sophisticated replace functionality to transform the HTML output (thanks to @okcoker)
- 0.3.0 - Escape tabs and introduce new option `bodyAttr` to place a custom attribute on the body
- 0.2.1 - Fixed a bug where quotes were missing from the HTML
- 0.2.0 - Added option to remove script tags from the output
- 0.1.0 - Initial release