https://github.com/pluralsight/htmlTagValidator

HTML Tag validator that does not rely on the DOM

# html-tag-validator

---

## This Repository is Archived

Originally developed for the Code School interactive learning platform, this project served as the backbone for numerous in-browser coding challenges. Since the integration and sunset of Code School in 2018, it has been unmaintained.

As 2021 comes to a close, I've decided to archive this repository rather than leave it in an unmaintained state. I want to personally thank all of the talented developers from Code School who supported this library and the learning experiences it powered. Thank you as well to everyone who created issues, PRs, and support over the years.

This project has been licensed under the MIT license since 2014, and we hope the community will use it to do great things.

– AJ Foster, Software Engineer at Pluralsight

---

This library takes HTML source code, provided as a string, and generates an
AST. If the AST cannot be generated, the library instead produces an error
describing what is malformed in the source document.

Note: this project is work-in-progress and is **not fully spec-compliant**.

See `todo.md` for plans for current and future releases.

The parser implements the basic components of the HTML 5 spec, such as:
- `doctype` definition
- HTML 5 elements
- HTML 5 attributes
- Enhanced validation for `script`, `style`, `link` and `meta` elements
- Basic support for `iframe` elements
- HTML comments

``` html
<!-- This is a comment -->
```

- Conditional comments

``` html
<!--[if IE]><p>Shown only in Internet Explorer</p><![endif]-->
```

- Allowed `input` element `type` values and the attributes supported by each `type`
- Hierarchical rules, such as: a properly-formed HTML 5 document should have a `title` element with contents within the `head` tag
- Void *elements*

``` html
<img src="cat.gif" alt="A cat">
```

- Void *attributes*

``` html
<script src="app.js" async></script>
```

- Normal *attributes*

``` html
<input type="checkbox" checked="checked">
<a href="index.html"></a>
```

## Install

```
npm install html-tag-validator
```

## Usage

The library exposes a single function that accepts a string containing HTML, an
optional options object, and a callback function. The callback should be in the form:

``` javascript
function (err, ast) {
  if (err) {
    // View the error generated by the parser
    console.log(err.message);
  } else {
    // View the AST generated by the parser
    console.log(ast);
  }
}
```
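
As a minimal sketch of the error-first pattern above, the callback arguments can be folded into a single result object. `summarizeResult` is a hypothetical helper, not part of the library, and it assumes the error exposes `message` and `line` properties as described in the testing section of this README.

``` javascript
// Hypothetical helper: turn the (err, ast) callback arguments into one object.
// Assumes err carries `message` and `line` properties.
function summarizeResult(err, ast) {
  if (err) {
    return { valid: false, message: err.message, line: err.line, ast: null };
  }
  return { valid: true, message: null, line: null, ast: ast };
}

// Intended use with the validator (requires the html-tag-validator package):
// htmlTagValidator(html, function (err, ast) {
//   console.log(summarizeResult(err, ast));
// });
```

Keeping the helper separate from the validator call makes it easy to exercise with hand-built errors and ASTs.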

### Default syntax

``` javascript
var htmlTagValidator = require('html-tag-validator'),
  sampleHtml = "<html>" +
    "<head><title>hello world</title></head>" +
    "<body><p style='color: pink;'>my cool page</p></body>" +
    "</html>";

// Turn an HTML string into an AST
htmlTagValidator(sampleHtml, function (err, ast) {
  if (err) {
    throw err;
  } else {
    console.log(ast);
  }
});
```

Produces the following AST:

```
doctype: null
document:
  -
    type: element
    void: false
    name: html
    attributes: {}
    children:
      -
        type: element
        void: false
        name: head
        attributes: {}
        children:
          -
            type: title
            attributes: {}
            contents: hello world
      -
        type: element
        void: false
        name: body
        attributes: {}
        children:
          -
            type: element
            void: false
            name: p
            attributes:
              style: color: pink;
            children:
              -
                type: text
                contents: my cool page
```
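
The AST above is a plain tree, so it can be walked with a short recursive function. The following is a sketch that assumes only the node shape shown in the output (`type`, `name`, `children`); `collectElementNames` and the hand-written `sampleAst` are illustrative and not part of the library.

``` javascript
// Hypothetical helper: list the names of all element nodes, depth-first.
function collectElementNames(ast) {
  var names = [];
  function visit(node) {
    if (!node || typeof node !== 'object') {
      return;
    }
    if (node.type === 'element' && node.name) {
      names.push(node.name);
    }
    (node.children || []).forEach(visit);
  }
  (ast.document || []).forEach(visit);
  return names;
}

// A hand-written object mirroring the AST printed above.
var sampleAst = {
  doctype: null,
  document: [{
    type: 'element', void: false, name: 'html', attributes: {},
    children: [
      {
        type: 'element', void: false, name: 'head', attributes: {},
        children: [{ type: 'title', attributes: {}, contents: 'hello world' }]
      },
      {
        type: 'element', void: false, name: 'body', attributes: {},
        children: [{
          type: 'element', void: false, name: 'p',
          attributes: { style: 'color: pink;' },
          children: [{ type: 'text', contents: 'my cool page' }]
        }]
      }
    ]
  }]
};

console.log(collectElementNames(sampleAst)); // [ 'html', 'head', 'body', 'p' ]
```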

### Passing in options

Currently, you can provide custom attribute names to merge with the default
values, custom validation rules, and global settings such as the output format
for the validation messages.

``` javascript
var htmlTagValidator = require('html-tag-validator'),
  // Illustrative markup using Angular 2 style attribute names
  sampleHtml = "<html>" +
    "<head><title>hello world</title></head>" +
    "<body>" +
    "<div *ngIf='isVisible' [hidden]='isHidden'>" +
    "my cool page" +
    "</div>" +
    "</body>" +
    "</html>";

/*
 * Allow Angular 2 style attributes on all elements. The key '_' means match
 * on ANY tag, but you could also specify specific tag names (e.g.:
 * 'my-custom-tag'). Custom attributes for existing HTML 5 tags will be merged
 * with the official list of allowed attributes. The key 'mixed' means normal
 * or void attributes for the given tag name. You can also target only
 * 'normal' and/or 'void' attributes.
 *
 * This options object says the following:
 * for all existing HTML 5 tag names '_' ...
 *   allow the following types of attribute names
 *     1) *ngSomething
 *     2) (something)
 *     3) [something]
 *     4) [(something)]
 *   for void (e.g.: async) attributes OR
 *   normal attributes (e.g.: checked="checked") ...
 * in addition to the standard HTML 5 attributes for the element.
 * Also, this adds a new normal (not self-closing) tag named
 * template to support Angular 2 tags.
 */
htmlTagValidator(sampleHtml, {
  settings: {
    // Set output format for validation error messages
    format: 'plain', // 'plain', 'html', or 'markdown'
    /* Setting verbose to true will generate an AST with additional
     * details such as whether tag attributes are unquoted */
    verbose: false, // default: false
    /* Set preserveCase to true to preserve the original case of tag and
     * attribute names so that you can support case-sensitive Angular 2
     * attribute names such as *ngFor and [ngModel] */
    preserveCase: true // default: false
  },
  tags: {
    normal: [ 'template' ]
  },
  attributes: {
    '_': {
      mixed: /^((\*ng)|(^\[[\S]+\]$)|(^\([\S]+\)$))|(^\[\([\S]+\)\]$)/
    }
  }
}, function (err, ast) {
  if (err) {
    throw err;
  } else {
    console.log(ast);
  }
});
```
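
The `mixed` pattern in the options above can be checked on its own before handing it to the validator. This sketch assumes the library tests each attribute name against the pattern; the sample names are chosen for illustration only.

``` javascript
// The pattern from the options object above, copied verbatim.
var angularAttribute = /^((\*ng)|(^\[[\S]+\]$)|(^\([\S]+\)$))|(^\[\([\S]+\)\]$)/;

// Illustrative attribute names: four Angular 2 forms and two that should fail.
var samples = ['*ngFor', '(click)', '[ngModel]', '[(ngModel)]', 'onclick', 'ng-model'];

var results = {};
samples.forEach(function (name) {
  results[name] = angularAttribute.test(name);
});

// The four Angular 2 forms match; 'onclick' and 'ng-model' do not.
console.log(results);
```

Note that `*ngSomething` is matched by prefix only, while the bracket and parenthesis forms must match the whole attribute name.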

``` javascript
var htmlTagValidator = require('html-tag-validator'),
  // Illustrative markup using old-style table attributes
  sampleHtml = "<html>" +
    "<head><title>hello world</title></head>" +
    "<body><table border='1'><tr><td width='100'>my cool page</td></tr></table></body>" +
    "</html>";

/*
 * Allow old-style HTML table attributes on specific elements.
 *
 * This options object adds some old HTML attributes for tables, to
 * the 'table' and 'td' elements, in addition to the standard HTML 5
 * attributes. Because the key is 'normal', these attributes are
 * validated as normal attributes that should have a defined value.
 * For example:
 *   <table border="1"><tr><td width="50">One</td><td width="50">Two</td></tr></table>
 */
htmlTagValidator(sampleHtml, {
  'settings': {
    'format': 'plain'
  },
  'attributes': {
    'table': {
      'normal': [
        'align', 'bgcolor', 'border', 'cellpadding', 'cellspacing',
        'frame', 'rules', 'summary', 'width'
      ]
    },
    'td': {
      'normal': [
        'height', 'width', 'bgcolor'
      ]
    }
  }
}, function (err, ast) {
  if (err) {
    throw err;
  } else {
    console.log(ast);
  }
});
```

## Contributing

Once the dependencies are installed, start development with one of the following commands:

`grunt test` - Automatically compile the parser and run the tests in `test/index-spec.js`.

`grunt debug` - Run the tests with the `--inspect` flag and extended output.

`grunt watch debug` - Get extended output and start a file watcher.

## Publishing to npm
Publishing master as normal works for pure HTML implementations, but sometimes
a variation is needed, for example a PHP flavor that supports inline PHP tags.

Each variation should live on its own, appropriately named branch and be
published separately using npm tags. First, change the version number in
`package.json` to include the language suffix; for PHP that would be something
like `1.5.0-php`. Then publish to npm with `npm publish --tag php`. Doing this
allows you to reference the variation in your `package.json` like:
`"html-tag-validator": "1.5.0-php"`

## Note on validator variations
Anything that pertains to vanilla HTML should be implemented on master and merged
into variation branches.

### Writing tests

Tests refer to an HTML test file in `test/html/` and the test name is a
reference to the filename of the test file. For example `super test 2`
as a test name points to the file `test/html/superTest2.html`.
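
The name-to-file convention can be sketched as a small camel-casing function. `testFilePath` is a hypothetical illustration of the mapping described above; the helper the test suite actually uses may differ.

``` javascript
// Hypothetical sketch: map a test name to its HTML fixture path by
// camel-casing the space-separated words.
function testFilePath(testName) {
  var camel = testName.trim().split(/\s+/).map(function (word, index) {
    if (index === 0) {
      return word.toLowerCase();
    }
    return word.charAt(0).toUpperCase() + word.slice(1);
  }).join('');
  return 'test/html/' + camel + '.html';
}

console.log(testFilePath('super test 2'));       // test/html/superTest2.html
console.log(testFilePath('basic self closing')); // test/html/basicSelfClosing.html
```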

There are three options for the test helpers exposed by `tree`:
- `tree.ok(this, done)` to assert that the test file successfully generates an AST
- `tree.equals(ast, this, done)` to assert that the test file generates an AST that exactly matches `ast`
- `tree.error()` to assert that a test throws an error
  - `tree.error("This is the error message", this, done)` to assert an error `message`
  - `tree.error({'line': 2}, this, done)` to assert an object of properties that each exist in the error
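
To make the two `tree.error` forms concrete, here is a sketch of how such matching could work. `errorMatches` is hypothetical and is not the implementation behind `tree.error`; it assumes a string is compared against `message` and an object is compared property by property.

``` javascript
// Hypothetical matcher, for illustration only.
function errorMatches(err, expected) {
  if (typeof expected === 'string') {
    // String form: compare against the error message.
    return err.message === expected;
  }
  // Object form: every expected property must exist in the error with the same value.
  return Object.keys(expected).every(function (key) {
    return err[key] === expected[key];
  });
}

// A hand-built error object standing in for one produced by the parser.
var sampleError = { message: 'li is not a valid self closing tag', line: 5 };

console.log(errorMatches(sampleError, { line: 5 }));                          // true
console.log(errorMatches(sampleError, { line: 2 }));                          // false
console.log(errorMatches(sampleError, 'li is not a valid self closing tag')); // true
```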

You can pass in an `options` object as the _2nd-to-last argument_ in each method:

``` javascript
var options = {
  'settings': {
    'format': 'html'
  }
};
tree.ok(this, options, done);
```

``` javascript
// test/html/basicSelfClosing.html
it('basic self closing', function(done) {
  tree.ok(this, done);
});

// test/html/basicListItems.html
it('basic list items', function(done) {
  tree.error({
    'message': 'li is not a valid self closing tag',
    'line': 5
  }, this, done);
});
```