
https://github.com/remram44/bleached

Validate HTML against a small subset (for example generated by bleach)

bleach html html-sanitization sanitizer


# bleached

This is a small HTML checker. It can validate that HTML code is safe.

It does not aim to support the entire HTML spec; rather, it focuses on checking HTML that has already been run through a sanitizer (such as [bleach](https://github.com/mozilla/bleach)).

## How to use?

```
$ pip install bleached
$ python3
>>> import bleached
>>> bleached.is_html_bleached('<p>Hello world</p>')
True
>>> bleached.is_html_bleached('<script>alert("Hello world");</script>')
False
>>> bleached.check_html('<p>Hello world</p>')
>>> bleached.check_html('<script>alert("Hello world");</script>')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
bleached.UnsafeInput: Line 1 character 8 (input index 7): Found forbidden opening tag 'script'
```

## Why use this?

[bleach](https://github.com/mozilla/bleach) is a great library for sanitizing untrusted HTML. You should use it instead of this where possible.

However, bleach offers no way to check that a piece of HTML has already been sanitized. Running the HTML through bleach again and comparing the result only works if you have the exact same version, since bleach does not guarantee that its output is stable across versions. This is where bleached is useful.

## Warnings

* No validation of attributes is performed. If you choose to allow an attribute, it is up to you to validate the values.
* This accepts a much smaller subset of HTML than web browsers do. Be ready for false negatives (safe HTML being rejected) if you use this to validate arbitrary HTML documents.
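
Since attribute values are left to you, here is a minimal sketch of one way to check them yourself, using only the Python standard library. It is not part of bleached: the names `ALLOWED_SCHEMES`, `HrefCollector` and `find_unsafe_hrefs` are hypothetical, and the choice of accepted URL schemes is an assumption you should adapt. It only looks at `href` on `<a>` tags and is not a complete URL sanitizer.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit

# Assumption: only these URL schemes are acceptable in links.
ALLOWED_SCHEMES = {"http", "https", "mailto"}


class HrefCollector(HTMLParser):
    """Collect every href value found on <a> tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value is not None:
                self.hrefs.append(value)


def find_unsafe_hrefs(html):
    """Return the href values whose scheme is not in ALLOWED_SCHEMES."""
    parser = HrefCollector()
    parser.feed(html)
    parser.close()
    unsafe = []
    for href in parser.hrefs:
        scheme = urlsplit(href.strip()).scheme.lower()
        # An empty scheme means a relative URL, which is accepted here.
        if scheme and scheme not in ALLOWED_SCHEMES:
            unsafe.append(href)
    return unsafe
```

A check like this could run alongside `bleached.check_html`: treat the input as acceptable only if bleached raises no `UnsafeInput` and `find_unsafe_hrefs(html)` returns an empty list.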