Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/remram44/bleached
Validate HTML against a small subset (for example generated by bleach)
- Host: GitHub
- URL: https://github.com/remram44/bleached
- Owner: remram44
- Created: 2023-04-04T22:46:36.000Z (over 1 year ago)
- Default Branch: master
- Last Pushed: 2023-04-05T00:02:16.000Z (over 1 year ago)
- Last Synced: 2024-12-07T02:44:57.186Z (27 days ago)
- Topics: bleach, html, html-sanitization, sanitizer
- Language: Python
- Homepage:
- Size: 9.77 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# bleached
This is a small HTML checker. It can validate that HTML code is safe.
It does not aim to support the entire HTML spec; rather, it focuses on checking HTML that has already been run through a sanitizer (such as [bleach](https://github.com/mozilla/bleach)).
## How to use?
```
$ pip install bleached
$ python3
>>> import bleached
>>> bleached.is_html_bleached('<p>Hello world</p>')
True
>>> bleached.is_html_bleached('<script>alert("Hello world");</script>')
False
>>> bleached.check_html('<p>Hello world</p>')
>>> bleached.check_html('<script>alert("Hello world");</script>')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
bleached.UnsafeInput: Line 1 character 8 (input index 7): Found forbidden opening tag 'script'
```
## Why use this?
[bleach](https://github.com/mozilla/bleach) is a great library for sanitizing untrusted HTML. You should use it instead of this where possible.
However, it offers no way to check that a piece of HTML has already been sanitized. Running the HTML through bleach again and comparing the result only works if you have the exact same version, as bleach makes no guarantee about the stability of its output. This is where bleached is useful.
## Warnings
* No validation of attributes is performed. If you choose to allow an attribute, it is up to you to validate the values.
* This accepts a much smaller subset of HTML than web browsers do. Be ready for false negatives (safe HTML that is rejected anyway) if you use this to validate arbitrary HTML documents.
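Since attribute values are left to the caller, a separate pass over the HTML is one way to cover the first warning. The following is a minimal sketch using only the Python standard library. It is not part of bleached: `ALLOWED_SCHEMES`, `HrefCollector`, and `hrefs_are_safe` are hypothetical names, and the policy (absolute `http`, `https`, or `mailto` links only) is an assumption chosen for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit

# Assumed policy for this sketch: schemes an href is allowed to use.
# Relative links have no scheme, so this sketch rejects them as well.
ALLOWED_SCHEMES = {"http", "https", "mailto"}


class HrefCollector(HTMLParser):
    """Collect the value of every href attribute found on <a> tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.hrefs.append(value)


def hrefs_are_safe(html):
    """Return True if every <a href> value uses an allowed scheme."""
    parser = HrefCollector()
    parser.feed(html)
    parser.close()
    for href in parser.hrefs:
        scheme = urlsplit(href.strip()).scheme.lower()
        if scheme not in ALLOWED_SCHEMES:
            return False
    return True
```

A caller that allows `href` could then require both checks to pass, for example `bleached.is_html_bleached(html) and hrefs_are_safe(html)`. An allowlist is used rather than a blocklist so that unknown or malformed schemes are rejected by default.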