{"id":21262799,"url":"https://github.com/michaelfranzl/sanitize-dom","last_synced_at":"2025-07-11T04:31:07.603Z","repository":{"id":26889312,"uuid":"111403286","full_name":"michaelfranzl/sanitize-dom","owner":"michaelfranzl","description":"Isomorphic library for recursive manipulation of live WHATWG DOMs.","archived":false,"fork":false,"pushed_at":"2023-03-04T05:52:42.000Z","size":408,"stargazers_count":5,"open_issues_count":8,"forks_count":1,"subscribers_count":2,"default_branch":"master","last_synced_at":"2024-08-09T03:53:49.160Z","etag":null,"topics":["dom","html","recursive-algorithm","sanitization","sanitize-html","sanitizer","whatwg-dom"],"latest_commit_sha":null,"homepage":"","language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/michaelfranzl.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2017-11-20T11:44:10.000Z","updated_at":"2023-03-12T19:59:30.000Z","dependencies_parsed_at":"2023-01-14T05:29:57.398Z","dependency_job_id":null,"html_url":"https://github.com/michaelfranzl/sanitize-dom","commit_stats":null,"previous_names":[],"tags_count":9,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/michaelfranzl%2Fsanitize-dom","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/michaelfranzl%2Fsanitize-dom/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/michaelfranzl%2Fsanitize-dom/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/michaelfranzl%2Fsanitize-dom/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/micha
elfranzl","download_url":"https://codeload.github.com/michaelfranzl/sanitize-dom/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":225674898,"owners_count":17506272,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["dom","html","recursive-algorithm","sanitization","sanitize-html","sanitizer","whatwg-dom"],"created_at":"2024-11-21T04:59:25.108Z","updated_at":"2024-11-21T04:59:25.644Z","avatar_url":"https://github.com/michaelfranzl.png","language":"JavaScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# sanitize-dom\n\n![Test](https://github.com/michaelfranzl/sanitize-dom/workflows/Test/badge.svg?branch=master)\n\nRecursive sanitizer/filter to manipulate live [WHATWG DOM](https://dom.spec.whatwg.org)s rather than HTML, for the browser and Node.js.\n\n## Rationale\n\nDirect DOM manipulation has gotten a bad reputation in the last decade of web development. From Ruby on Rails to React, the DOM was seen as something to gloriously destroy and re-render from the server or even from the browser. Never mind that the browser already exerted a lot of effort parsing HTML and constructing this tree! Mind-numbingly complex HTML string regular expression tests and manipulations had to deal with low-level details of the HTML syntax to insert, delete and change elements, sometimes on every keystroke! 
In contrast, functions like `createElement`, `remove` and `insertBefore` from the DOM world were largely unknown and unused, except perhaps in jQuery.\n\nProcessing of HTML is **destructive**: The original DOM is destroyed and garbage collected with a certain time delay. Attached event handlers are detached and garbage collected. A completely new DOM is created from parsing new HTML set via `.innerHTML =`. Event listeners will have to be re-attached from userland (this is no issue when using `on*` HTML attributes, but this has disadvantages as well).\n\n*It doesn't have to be this way. Do not eliminate, but manipulate!*\n\n### Save the (DOM) trees!\n\n`sanitize-dom` crawls a DOM subtree (beginning from a given node, all the way down to its descendant leaves) and filters and manipulates it non-destructively. This is very efficient: The browser doesn't have to re-render everything; it only re-renders what has been *changed* (sound familiar from React?).\n\nThe benefits of direct DOM manipulation:\n\n* Nodes stay alive.\n* References to nodes (e.g. stored in a `Map` or `WeakMap`) stay alive.\n* Already attached event handlers stay alive.\n* The browser doesn't have to re-render entire sections of a page; thus no flickering, no scroll jumping, no big CPU spikes.\n* CPU cycles for repeatedly parsing and dumping of HTML are eliminated.\n\n`sanitize-dom`'s further advantages:\n\n* No dependencies.\n* Small footprint (only about 7 kB minified).\n* Faster than other HTML sanitizers because there is no HTML parsing and serialization.\n\n## Use cases\n\nAside from the browser, `sanitize-dom` can also be used in Node.js by supplying WHATWG DOM implementations like [jsdom](https://github.com/tmpvar/jsdom).\n\nThe [test file](test/run-tests.js) describes additional usage patterns and features.\n\nFor the usage examples below, I'll use `sanitizeHtml` just to be able to illustrate the HTML output.\n\nBy default, all tags are 'flattened', i.e. 
only their inner text is kept:\n\n```javascript\nsanitizeHtml(document, '\u003cdiv\u003e\u003cp\u003eabc \u003cb\u003edef\u003c/b\u003e\u003c/p\u003e\u003c/div\u003e');\n\"abc def\"\n```\n\nSelective joining of same-tag siblings:\n\n```javascript\n// Joins the two I tags.\nsanitizeHtml(document, '\u003ci\u003eHello\u003c/i\u003e \u003ci\u003eworld!\u003c/i\u003e \u003cem\u003eGoodbye\u003c/em\u003e \u003cem\u003eworld!\u003c/em\u003e', {\n  allow_tags_deep: { '.*': '.*' },\n  join_siblings: ['I'],\n});\n\"\u003ci\u003eHello world!\u003c/i\u003e \u003cem\u003eGoodbye\u003c/em\u003e \u003cem\u003eworld!\u003c/em\u003e\"\n```\n\nRemoval of redundant nested nodes (ubiquitous when using a WYSIWYG `contenteditable` editor):\n\n```javascript\nsanitizeHtml(document, '\u003ci\u003e\u003ci\u003eH\u003ci\u003e\u003c/i\u003eello\u003c/i\u003e \u003ci\u003eworld! \u003ci\u003eGood\u003ci\u003ebye\u003c/i\u003e\u003c/i\u003e world!\u003c/i\u003e', {\n  allow_tags_deep: { '.*': '.*' },\n  flatten_tags_deep: { i: 'i' },\n});\n\"\u003ci\u003eHello  world! 
Goodbye world!\u003c/i\u003e\"\n```\n\nRemove redundant empty tags:\n\n```javascript\nsanitizeHtml(document, 'H\u003ci\u003e\u003c/i\u003eello world!', {\n  allow_tags_deep: { '.*': '.*' },\n  remove_empty: true,\n});\n\"Hello world!\"\n```\n\nBy default, all classes and attributes are removed:\n\n```javascript\n// Keep all nodes, but remove all of their attributes and classes:\nsanitizeHtml(document, '\u003cdiv\u003e\u003cp\u003eabc \u003cb class=\"green\" data-type=\"test\"\u003edef\u003c/b\u003e\u003c/p\u003e\u003c/div\u003e', {\n  allow_tags_deep: { '.*': '.*' },\n});\n\"\u003cdiv\u003e\u003cp\u003eabc \u003cb\u003edef\u003c/b\u003e\u003c/p\u003e\u003c/div\u003e\"\n```\n\nKeep all nodes and all their attributes and classes:\n\n```javascript\nsanitizeHtml(document, '\u003cdiv\u003e\u003cp class=\"red green\"\u003eabc \u003cb class=\"green\" data-type=\"test\"\u003edef\u003c/b\u003e\u003c/p\u003e\u003c/div\u003e', {\n  allow_tags_deep: { '.*': '.*' },\n  allow_attributes_by_tag: { '.*': '.*' },\n  allow_classes_by_tag: { '.*': '.*' },\n});\n'\u003cdiv\u003e\u003cp class=\"red green\"\u003eabc \u003cb class=\"green\" data-type=\"test\"\u003edef\u003c/b\u003e\u003c/p\u003e\u003c/div\u003e'\n```\n\nWhite-listing of classes and attributes:\n\n```javascript\n// Keep only data- attributes and 'green' classes\nsanitizeHtml(document, '\u003cdiv\u003e\u003cp class=\"red green\"\u003eabc \u003cb class=\"green\" data-type=\"test\"\u003edef\u003c/b\u003e\u003c/p\u003e\u003c/div\u003e', {\n  allow_tags_deep: { '.*': '.*' },\n  allow_attributes_by_tag: { '.*': 'data-.*' },\n  allow_classes_by_tag: { '.*': 'green' },\n});\n'\u003cdiv\u003e\u003cp class=\"green\"\u003eabc \u003cb class=\"green\" data-type=\"test\"\u003edef\u003c/b\u003e\u003c/p\u003e\u003c/div\u003e'\n```\n\nWhite-listing of node tags to keep:\n\n```javascript\n// Keep only B tags anywhere in the document.\nsanitizeHtml(document, '\u003ci\u003eabc\u003c/i\u003e \u003cb\u003edef\u003c/b\u003e 
\u003cem\u003eghi\u003c/em\u003e', {\n  allow_tags_deep: { '.*': '^b$' },\n});\n\"abc \u003cb\u003edef\u003c/b\u003e ghi\"\n\n// Keep only DIV children of BODY and I children of DIV.\nsanitizeHtml(document, '\u003cdiv\u003e \u003ci\u003eabc\u003c/i\u003e \u003cem\u003edef\u003c/em\u003e\u003c/div\u003e \u003ci\u003eghi\u003c/i\u003e', {\n  allow_tags_direct: {\n    body: 'div',\n    div: '^i',\n  },\n});\n\"\u003cdiv\u003e \u003ci\u003eabc\u003c/i\u003e def\u003c/div\u003e ghi\"\n```\n\nSelective flattening of nodes:\n\n```javascript\n// Flatten only EM children of DIV.\nsanitizeHtml(document, '\u003cdiv\u003e \u003ci\u003eabc\u003c/i\u003e \u003cem\u003edef\u003c/em\u003e\u003c/div\u003e \u003ci\u003eghi\u003c/i\u003e', {\n  allow_tags_deep: { '.*': '.*' },\n  flatten_tags_direct: {\n    div: 'em',\n  },\n});\n\"\u003cdiv\u003e \u003ci\u003eabc\u003c/i\u003e def\u003c/div\u003e \u003ci\u003eghi\u003c/i\u003e\"\n\n// Flatten I tags anywhere in the document.\nsanitizeHtml(document, '\u003cdiv\u003e \u003ci\u003eabc\u003c/i\u003e \u003cem\u003edef\u003c/em\u003e\u003c/div\u003e \u003ci\u003eghi\u003c/i\u003e', {\n  allow_tags_deep: { '.*': '.*' },\n  flatten_tags_deep: {\n    '.*': '^i',\n  },\n});\n\"\u003cdiv\u003e abc \u003cem\u003edef\u003c/em\u003e\u003c/div\u003e ghi\"\n```\n\nSelective removal of tags:\n\n```javascript\n// Remove I children of DIVs.\nsanitizeHtml(document, '\u003cdiv\u003e \u003ci\u003eabc\u003c/i\u003e \u003cem\u003edef\u003c/em\u003e\u003c/div\u003e \u003ci\u003eghi\u003c/i\u003e', {\n  allow_tags_deep: { '.*': '.*' },\n  remove_tags_direct: {\n    'div': 'i',\n  },\n});\n\"\u003cdiv\u003e  \u003cem\u003edef\u003c/em\u003e\u003c/div\u003e \u003ci\u003eghi\u003c/i\u003e\"\n```\n\nThen, sometimes there are more than one way to accomplish the same, as shown in this advanced\nexample:\n\n```javascript\n// Keep all tags except B, anywhere in the document. 
Two different solutions:\n\nsanitizeHtml(document, '\u003cdiv\u003e \u003ci\u003eabc\u003c/i\u003e \u003cb\u003edef\u003c/b\u003e \u003cem\u003eghi\u003c/em\u003e \u003c/div\u003e', {\n  allow_tags_deep: { '.*': '.*' },\n  flatten_tags_deep: { '.*': 'B' },\n});\n\"\u003cdiv\u003e \u003ci\u003eabc\u003c/i\u003e def \u003cem\u003eghi\u003c/em\u003e \u003c/div\u003e\"\n\nsanitizeHtml(document, '\u003cdiv\u003e \u003ci\u003eabc\u003c/i\u003e \u003cb\u003edef\u003c/b\u003e \u003cem\u003eghi\u003c/em\u003e \u003c/div\u003e', {\n  allow_tags_deep: { '.*': '^((?!b).)*$' }\n});\n\"\u003cdiv\u003e \u003ci\u003eabc\u003c/i\u003e def \u003cem\u003eghi\u003c/em\u003e \u003c/div\u003e\"\n```\n\nAnd finally, filter functions allow ultimate flexibility:\n\n```javascript\n// change B node to EM node with contextual inner text; attach an event listener.\nsanitizeHtml(document, '\u003cp\u003eabc \u003ci\u003e\u003cb\u003edef\u003c/b\u003e \u003cb\u003eghi\u003c/b\u003e\u003c/i\u003e\u003c/p\u003e', {\n  allow_tags_direct: {\n    '.*': '.*',\n  },\n  filters_by_tag: {\n    B: [\n      function changesToEm(node, { parentNodes, parentNodenames, siblingIndex }) {\n        const em = document.createElement('em');\n        const text = `${parentNodenames.join(', ')} - ${siblingIndex}`;\n        em.innerHTML = text;\n        em.addEventListener('click', () =\u003e alert(text));\n        return em;\n      },\n    ],\n  },\n});\n// In a browser, the EM tags would be clickable and an alert box would pop up.\n\"\u003cp\u003eabc \u003ci\u003e\u003cem\u003eI, P, BODY - 0\u003c/em\u003e \u003cem\u003eI, P, BODY - 2\u003c/em\u003e\u003c/i\u003e\u003c/p\u003e\"\n```\n\n## Tests\n\nRun in Node.js:\n\n```sh\nnpm test\n```\n\nFor the browser, run:\n\n```sh\ncd sanitize-dom\nnpm i -g jspm@2.0.0-beta.7 http-server\njspm install @jspm/core@1.1.0\nhttp-server\n```\n\nThen, in a browser which supports `\u003cscript type=\"importmap\"\u003e\u003c/script\u003e` (e.g. 
Google Chrome\nversion \u003e= 81), browse to http://127.0.0.1:8080/test\n\n# API Reference\n\n## Functions\n\n\u003cdl\u003e\n\u003cdt\u003e\u003ca href=\"#sanitizeNode\"\u003esanitizeNode(doc, node, [opts], [nodePropertyMap])\u003c/a\u003e\u003c/dt\u003e\n\u003cdd\u003e\u003cp\u003eSimple wrapper for \u003ca href=\"#sanitizeDom\"\u003esanitizeDom\u003c/a\u003e. Processes the node and its childNodes recursively.\u003c/p\u003e\n\u003c/dd\u003e\n\u003cdt\u003e\u003ca href=\"#sanitizeChildNodes\"\u003esanitizeChildNodes(doc, node, [opts], [nodePropertyMap])\u003c/a\u003e\u003c/dt\u003e\n\u003cdd\u003e\u003cp\u003eSimple wrapper for \u003ca href=\"#sanitizeDom\"\u003esanitizeDom\u003c/a\u003e. Processes only the node\u0026#39;s childNodes recursively, but not\nthe node itself.\u003c/p\u003e\n\u003c/dd\u003e\n\u003cdt\u003e\u003ca href=\"#sanitizeHtml\"\u003esanitizeHtml(doc, html, [opts], [isDocument], [nodePropertyMap])\u003c/a\u003e ⇒ \u003ccode\u003eString\u003c/code\u003e\u003c/dt\u003e\n\u003cdd\u003e\u003cp\u003eSimple wrapper for \u003ca href=\"#sanitizeDom\"\u003esanitizeDom\u003c/a\u003e. 
Instead of a DomNode, it takes an HTML string.\u003c/p\u003e\n\u003c/dd\u003e\n\u003cdt\u003e\u003ca href=\"#sanitizeDom\"\u003esanitizeDom(doc, contextNode, [opts], [childrenOnly], [nodePropertyMap])\u003c/a\u003e\u003c/dt\u003e\n\u003cdd\u003e\u003cp\u003eThis function is not exported: Please use the wrapper functions instead:\u003c/p\u003e\n\u003cp\u003e\u003ca href=\"#sanitizeHtml\"\u003esanitizeHtml\u003c/a\u003e, \u003ca href=\"#sanitizeNode\"\u003esanitizeNode\u003c/a\u003e, and \u003ca href=\"#sanitizeChildNodes\"\u003esanitizeChildNodes\u003c/a\u003e.\u003c/p\u003e\n\u003cp\u003eRecursively processes a tree with \u003ccode\u003enode\u003c/code\u003e at the root.\u003c/p\u003e\n\u003cp\u003eIn all descriptions, the term \u0026quot;flatten\u0026quot; means that a node is replaced with the node\u0026#39;s childNodes.\nFor example, if the B node in \u003ccode\u003e\u0026lt;i\u0026gt;abc\u0026lt;b\u0026gt;def\u0026lt;u\u0026gt;ghi\u0026lt;/u\u0026gt;\u0026lt;/b\u0026gt;\u0026lt;/i\u0026gt;\u003c/code\u003e is flattened, the result is\n\u003ccode\u003e\u0026lt;i\u0026gt;abcdef\u0026lt;u\u0026gt;ghi\u0026lt;/u\u0026gt;\u0026lt;/i\u0026gt;\u003c/code\u003e.\u003c/p\u003e\n\u003cp\u003eEach node is processed in the following sequence:\u003c/p\u003e\n\u003col\u003e\n\u003cli\u003eFilters matching the \u003ccode\u003eopts.filters_by_tag\u003c/code\u003e spec are called. 
If the filter returns \u003ccode\u003enull\u003c/code\u003e, the\nnode is removed and processing stops (see \u003ca href=\"#filter\"\u003efilter\u003c/a\u003es).\u003c/li\u003e\n\u003cli\u003eIf the \u003ccode\u003eopts.remove_tags_*\u003c/code\u003e spec matches, the node is removed and processing stops.\u003c/li\u003e\n\u003cli\u003eIf the \u003ccode\u003eopts.flatten_tags_*\u003c/code\u003e spec matches, the node is flattened and processing stops.\u003c/li\u003e\n\u003cli\u003eIf the \u003ccode\u003eopts.allow_tags_*\u003c/code\u003e spec matches:\u003cul\u003e\n\u003cli\u003eAll attributes not matching \u003ccode\u003eopts.allow_attributes_by_tag\u003c/code\u003e are removed.\u003c/li\u003e\n\u003cli\u003eAll class names not matching \u003ccode\u003eopts.allow_classes_by_tag\u003c/code\u003e are removed.\u003c/li\u003e\n\u003cli\u003eThe node is kept and processing stops.\u003c/li\u003e\n\u003c/ul\u003e\n\u003c/li\u003e\n\u003cli\u003eThe node is flattened.\u003c/li\u003e\n\u003c/ol\u003e\n\u003c/dd\u003e\n\u003c/dl\u003e\n\n## Typedefs\n\n\u003cdl\u003e\n\u003cdt\u003e\u003ca href=\"#DomDocument\"\u003eDomDocument\u003c/a\u003e : \u003ccode\u003eObject\u003c/code\u003e\u003c/dt\u003e\n\u003cdd\u003e\u003cp\u003eImplements the WHATWG DOM Document interface.\u003c/p\u003e\n\u003cp\u003eIn the browser, this is \u003ccode\u003ewindow.document\u003c/code\u003e. 
In Node.js, this may for example be\n\u003ca href=\"https://github.com/tmpvar/jsdom\"\u003enew JSDOM().window.document\u003c/a\u003e.\u003c/p\u003e\n\u003c/dd\u003e\n\u003cdt\u003e\u003ca href=\"#DomNode\"\u003eDomNode\u003c/a\u003e : \u003ccode\u003eObject\u003c/code\u003e\u003c/dt\u003e\n\u003cdd\u003e\u003cp\u003eImplements the WHATWG DOM Node interface.\u003c/p\u003e\n\u003cp\u003eCustom properties for each node can be stored in a \u003ccode\u003eWeakMap\u003c/code\u003e passed as option \u003ccode\u003enodePropertyMap\u003c/code\u003e\nto one of the sanitize functions.\u003c/p\u003e\n\u003c/dd\u003e\n\u003cdt\u003e\u003ca href=\"#Tagname\"\u003eTagname\u003c/a\u003e : \u003ccode\u003estring\u003c/code\u003e\u003c/dt\u003e\n\u003cdd\u003e\u003cp\u003eNode tag name.\u003c/p\u003e\n\u003cp\u003eEven though in the WHATWG DOM text nodes (nodeType 3) have a tag name \u003ccode\u003e#text\u003c/code\u003e,\nthese are referred to by the simpler string \u0026#39;TEXT\u0026#39; for convenience.\u003c/p\u003e\n\u003c/dd\u003e\n\u003cdt\u003e\u003ca href=\"#Regex\"\u003eRegex\u003c/a\u003e : \u003ccode\u003estring\u003c/code\u003e\u003c/dt\u003e\n\u003cdd\u003e\u003cp\u003eA string which is compiled to a case-insensitive regular expression \u003ccode\u003enew RegExp(regex, \u0026#39;i\u0026#39;)\u003c/code\u003e.\nThe regular expression is used to match a \u003ca href=\"#Tagname\"\u003eTagname\u003c/a\u003e.\u003c/p\u003e\n\u003c/dd\u003e\n\u003cdt\u003e\u003ca href=\"#ParentChildSpec\"\u003eParentChildSpec\u003c/a\u003e : \u003ccode\u003eObject.\u0026lt;Regex, Array.\u0026lt;Regex\u0026gt;\u0026gt;\u003c/code\u003e\u003c/dt\u003e\n\u003cdd\u003e\u003cp\u003eProperty names are matched against a (direct or ancestral) parent node\u0026#39;s \u003ca href=\"#Tagname\"\u003eTagname\u003c/a\u003e.\nAssociated values are matched against the current nodes \u003ca href=\"#Tagname\"\u003eTagname\u003c/a\u003e.\u003c/p\u003e\n\u003c/dd\u003e\n\u003cdt\u003e\u003ca 
href=\"#TagAttributeNameSpec\"\u003eTagAttributeNameSpec\u003c/a\u003e : \u003ccode\u003eObject.\u0026lt;Regex, Array.\u0026lt;Regex\u0026gt;\u0026gt;\u003c/code\u003e\u003c/dt\u003e\n\u003cdd\u003e\u003cp\u003eProperty names are matched against the current nodes \u003ca href=\"#Tagname\"\u003eTagname\u003c/a\u003e. Associated values are\nused to match its attribute names.\u003c/p\u003e\n\u003c/dd\u003e\n\u003cdt\u003e\u003ca href=\"#TagClassNameSpec\"\u003eTagClassNameSpec\u003c/a\u003e : \u003ccode\u003eObject.\u0026lt;Regex, Array.\u0026lt;Regex\u0026gt;\u0026gt;\u003c/code\u003e\u003c/dt\u003e\n\u003cdd\u003e\u003cp\u003eProperty names are matched against the current nodes \u003ca href=\"#Tagname\"\u003eTagname\u003c/a\u003e. Associated values are used\nto match its class names.\u003c/p\u003e\n\u003c/dd\u003e\n\u003cdt\u003e\u003ca href=\"#FilterSpec\"\u003eFilterSpec\u003c/a\u003e : \u003ccode\u003eObject.\u0026lt;Regex, Array.\u0026lt;filter\u0026gt;\u0026gt;\u003c/code\u003e\u003c/dt\u003e\n\u003cdd\u003e\u003cp\u003eProperty names are matched against node \u003ca href=\"#Tagname\"\u003eTagname\u003c/a\u003es. 
Associated values\nare the \u003ca href=\"#filter\"\u003efilter\u003c/a\u003es which are run on the node.\u003c/p\u003e\n\u003c/dd\u003e\n\u003cdt\u003e\u003ca href=\"#filter\"\u003efilter\u003c/a\u003e ⇒ \u003ccode\u003e\u003ca href=\"#DomNode\"\u003eDomNode\u003c/a\u003e\u003c/code\u003e | \u003ccode\u003e\u003ca href=\"#DomNode\"\u003eArray.\u0026lt;DomNode\u0026gt;\u003c/a\u003e\u003c/code\u003e | \u003ccode\u003enull\u003c/code\u003e\u003c/dt\u003e\n\u003cdd\u003e\u003cp\u003eFilter functions can either...\u003c/p\u003e\n\u003col\u003e\n\u003cli\u003ereturn the same node (the first argument),\u003c/li\u003e\n\u003cli\u003ereturn a single, or an Array of, newly created \u003ca href=\"#DomNode\"\u003eDomNode\u003c/a\u003e(s), in which case \u003ccode\u003enode\u003c/code\u003e is\nreplaced with the new node(s),\u003c/li\u003e\n\u003cli\u003ereturn \u003ccode\u003enull\u003c/code\u003e, in which case \u003ccode\u003enode\u003c/code\u003e is removed.\u003c/li\u003e\n\u003c/ol\u003e\n\u003cp\u003eNote that newly generated \u003ca href=\"#DomNode\"\u003eDomNode\u003c/a\u003e(s) are processed by running \u003ca href=\"#sanitizeDom\"\u003esanitizeDom\u003c/a\u003e\non them, as if they had been part of the original tree. This has the following implication:\u003c/p\u003e\n\u003cp\u003eIf a filter returns a newly generated \u003ca href=\"#DomNode\"\u003eDomNode\u003c/a\u003e with the same \u003ca href=\"#Tagname\"\u003eTagname\u003c/a\u003e as \u003ccode\u003enode\u003c/code\u003e, it\nwould cause the same filter to be called again, which may lead to an infinite loop if the filter\nis always returning the same result (this would be a badly behaved filter). 
To protect against\ninfinite loops, the author of the filter must acknowledge this circumstance by setting a boolean\nproperty called \u0026#39;skip_filters\u0026#39; for the \u003ca href=\"#DomNode\"\u003eDomNode\u003c/a\u003e) (in a \u003ccode\u003eWeakMap\u003c/code\u003e which the caller must\nprovide to one of the sanitize functions as the argument \u003ccode\u003enodePropertyMap\u003c/code\u003e). If \u0026#39;skip_filters\u0026#39; is\nnot set, an error is thrown. With well-behaved filters it is possible to continue subsequent\nprocessing of the returned node without causing an infinite loop.\u003c/p\u003e\n\u003c/dd\u003e\n\u003c/dl\u003e\n\n\u003ca name=\"sanitizeNode\"\u003e\u003c/a\u003e\n\n## sanitizeNode(doc, node, [opts], [nodePropertyMap])\nSimple wrapper for [sanitizeDom](#sanitizeDom). Processes the node and its childNodes recursively.\n\n**Kind**: global function  \n\n| Param | Type | Default | Description |\n| --- | --- | --- | --- |\n| doc | [\u003ccode\u003eDomDocument\u003c/code\u003e](#DomDocument) |  |  |\n| node | [\u003ccode\u003eDomNode\u003c/code\u003e](#DomNode) |  |  |\n| [opts] | \u003ccode\u003eObject\u003c/code\u003e | \u003ccode\u003e{}\u003c/code\u003e |  |\n| [nodePropertyMap] | \u003ccode\u003eWeakMap.\u0026lt;DomNode, Object\u0026gt;\u003c/code\u003e | \u003ccode\u003enew WeakMap()\u003c/code\u003e | Additional node properties |\n\n\u003ca name=\"sanitizeChildNodes\"\u003e\u003c/a\u003e\n\n## sanitizeChildNodes(doc, node, [opts], [nodePropertyMap])\nSimple wrapper for [sanitizeDom](#sanitizeDom). 
Processes only the node's childNodes recursively, but not\nthe node itself.\n\n**Kind**: global function  \n\n| Param | Type | Default | Description |\n| --- | --- | --- | --- |\n| doc | [\u003ccode\u003eDomDocument\u003c/code\u003e](#DomDocument) |  |  |\n| node | [\u003ccode\u003eDomNode\u003c/code\u003e](#DomNode) |  |  |\n| [opts] | \u003ccode\u003eObject\u003c/code\u003e | \u003ccode\u003e{}\u003c/code\u003e |  |\n| [nodePropertyMap] | \u003ccode\u003eWeakMap.\u0026lt;DomNode, Object\u0026gt;\u003c/code\u003e | \u003ccode\u003enew WeakMap()\u003c/code\u003e | Additional node properties |\n\n\u003ca name=\"sanitizeHtml\"\u003e\u003c/a\u003e\n\n## sanitizeHtml(doc, html, [opts], [isDocument], [nodePropertyMap]) ⇒ \u003ccode\u003eString\u003c/code\u003e\nSimple wrapper for [sanitizeDom](#sanitizeDom). Instead of a DomNode, it takes an HTML string.\n\n**Kind**: global function  \n**Returns**: \u003ccode\u003eString\u003c/code\u003e - The processed HTML  \n\n| Param | Type | Default | Description |\n| --- | --- | --- | --- |\n| doc | [\u003ccode\u003eDomDocument\u003c/code\u003e](#DomDocument) |  |  |\n| html | \u003ccode\u003estring\u003c/code\u003e |  |  |\n| [opts] | \u003ccode\u003eObject\u003c/code\u003e | \u003ccode\u003e{}\u003c/code\u003e |  |\n| [isDocument] | \u003ccode\u003eBoolean\u003c/code\u003e | \u003ccode\u003efalse\u003c/code\u003e | Set this to `true` if you are passing an entire HTML document (beginning with the \u003chtml\u003e tag). The context node name will be HTML. If `false`, then the context node name will be BODY. 
|\n| [nodePropertyMap] | \u003ccode\u003eWeakMap.\u0026lt;DomNode, Object\u0026gt;\u003c/code\u003e | \u003ccode\u003enew WeakMap()\u003c/code\u003e | Additional node properties |\n\n\u003ca name=\"sanitizeDom\"\u003e\u003c/a\u003e\n\n## sanitizeDom(doc, contextNode, [opts], [childrenOnly], [nodePropertyMap])\nThis function is not exported: Please use the wrapper functions instead:\n\n[sanitizeHtml](#sanitizeHtml), [sanitizeNode](#sanitizeNode), and [sanitizeChildNodes](#sanitizeChildNodes).\n\nRecursively processes a tree with `node` at the root.\n\nIn all descriptions, the term \"flatten\" means that a node is replaced with the node's childNodes.\nFor example, if the B node in `\u003ci\u003eabc\u003cb\u003edef\u003cu\u003eghi\u003c/u\u003e\u003c/b\u003e\u003c/i\u003e` is flattened, the result is\n`\u003ci\u003eabcdef\u003cu\u003eghi\u003c/u\u003e\u003c/i\u003e`.\n\nEach node is processed in the following sequence:\n\n1. Filters matching the `opts.filters_by_tag` spec are called. If the filter returns `null`, the\n   node is removed and processing stops (see [filter](#filter)s).\n2. If the `opts.remove_tags_*` spec matches, the node is removed and processing stops.\n3. If the `opts.flatten_tags_*` spec matches, the node is flattened and processing stops.\n4. If the `opts.allow_tags_*` spec matches:\n    * All attributes not matching `opts.allow_attributes_by_tag` are removed.\n    * All class names not matching `opts.allow_classes_by_tag` are removed.\n    * The node is kept and processing stops.\n5. The node is flattened.\n\n**Kind**: global function  \n\n| Param | Type | Default | Description |\n| --- | --- | --- | --- |\n| doc | [\u003ccode\u003eDomDocument\u003c/code\u003e](#DomDocument) |  | The document |\n| contextNode | [\u003ccode\u003eDomNode\u003c/code\u003e](#DomNode) |  | The root node |\n| [opts] | \u003ccode\u003eObject\u003c/code\u003e | \u003ccode\u003e{}\u003c/code\u003e | Options for processing. 
|\n| [opts.filters_by_tag] | [\u003ccode\u003eFilterSpec\u003c/code\u003e](#FilterSpec) | \u003ccode\u003e{}\u003c/code\u003e | Matching filters are called with the node. |\n| [opts.remove_tags_direct] | [\u003ccode\u003eParentChildSpec\u003c/code\u003e](#ParentChildSpec) | \u003ccode\u003e{}\u003c/code\u003e | Matching nodes which are a direct child of the matching parent node are removed. |\n| [opts.remove_tags_deep] | [\u003ccode\u003eParentChildSpec\u003c/code\u003e](#ParentChildSpec) | \u003ccode\u003e{\u0026#x27;.*\u0026#x27;: [\u0026#x27;style\u0026#x27;,\u0026#x27;script\u0026#x27;,\u0026#x27;textarea\u0026#x27;,\u0026#x27;noscript\u0026#x27;]}\u003c/code\u003e | Matching nodes which are anywhere below the matching parent node are removed. |\n| [opts.flatten_tags_direct] | [\u003ccode\u003eParentChildSpec\u003c/code\u003e](#ParentChildSpec) | \u003ccode\u003e{}\u003c/code\u003e | Matching nodes which are a direct child of the matching parent node are flattened. |\n| [opts.flatten_tags_deep] | [\u003ccode\u003eParentChildSpec\u003c/code\u003e](#ParentChildSpec) | \u003ccode\u003e{}\u003c/code\u003e | Matching nodes which are anywhere below the matching parent node are flattened. |\n| [opts.allow_tags_direct] | [\u003ccode\u003eParentChildSpec\u003c/code\u003e](#ParentChildSpec) | \u003ccode\u003e{}\u003c/code\u003e | Matching nodes which are a direct child of the matching parent node are kept. |\n| [opts.allow_tags_deep] | [\u003ccode\u003eParentChildSpec\u003c/code\u003e](#ParentChildSpec) | \u003ccode\u003e{}\u003c/code\u003e | Matching nodes which are anywhere below the matching parent node are kept. |\n| [opts.allow_attributes_by_tag] | [\u003ccode\u003eTagAttributeNameSpec\u003c/code\u003e](#TagAttributeNameSpec) | \u003ccode\u003e{}\u003c/code\u003e | Matching attribute names of a matching node are kept. Other attributes are removed. 
|\n| [opts.allow_classes_by_tag] | [\u003ccode\u003eTagClassNameSpec\u003c/code\u003e](#TagClassNameSpec) | \u003ccode\u003e{}\u003c/code\u003e | Matching class names of a matching node are kept. Other class names are removed. If no class names are remaining, the class attribute is removed. |\n| [opts.remove_empty] | \u003ccode\u003eboolean\u003c/code\u003e | \u003ccode\u003efalse\u003c/code\u003e | Remove nodes which are completely empty |\n| [opts.join_siblings] | [\u003ccode\u003eArray.\u0026lt;Tagname\u0026gt;\u003c/code\u003e](#Tagname) | \u003ccode\u003e[]\u003c/code\u003e | Join same-tag sibling nodes of given tag names, unless they are separated by non-whitespace textNodes. |\n| [childrenOnly] | \u003ccode\u003eBool\u003c/code\u003e | \u003ccode\u003efalse\u003c/code\u003e | If false, then the node itself and its descendants are processed recursively. If true, then only the children and its descendants are processed recursively, but not the node itself (use when `node` is `BODY` or `DocumentFragment`). |\n| [nodePropertyMap] | \u003ccode\u003eWeakMap.\u0026lt;DomNode, Object\u0026gt;\u003c/code\u003e | \u003ccode\u003enew WeakMap()\u003c/code\u003e | Additional properties for a [DomNode](#DomNode) can be stored in an object and will be looked up in this map. The properties of the object and their meaning: `skip`: If truthy, disables all processing for this node. `skip_filters`: If truthy, disables all filters for this node. `skip_classes`: If truthy, disables processing classes of this node.  `skip_attributes`: If truthy, disables processing attributes of this node. See tests for usage details. |\n\n\u003ca name=\"DomDocument\"\u003e\u003c/a\u003e\n\n## DomDocument : \u003ccode\u003eObject\u003c/code\u003e\nImplements the WHATWG DOM Document interface.\n\nIn the browser, this is `window.document`. 
In Node.js, this may for example be\n[new JSDOM().window.document](https://github.com/tmpvar/jsdom).\n\n**Kind**: global typedef  \n**See**: [https://dom.spec.whatwg.org/#interface-document](https://dom.spec.whatwg.org/#interface-document)  \n\u003ca name=\"DomNode\"\u003e\u003c/a\u003e\n\n## DomNode : \u003ccode\u003eObject\u003c/code\u003e\nImplements the WHATWG DOM Node interface.\n\nCustom properties for each node can be stored in a `WeakMap` passed as option `nodePropertyMap`\nto one of the sanitize functions.\n\n**Kind**: global typedef  \n**See**: [https://dom.spec.whatwg.org/#interface-node](https://dom.spec.whatwg.org/#interface-node)  \n\u003ca name=\"Tagname\"\u003e\u003c/a\u003e\n\n## Tagname : \u003ccode\u003estring\u003c/code\u003e\nNode tag name.\n\nEven though in the WHATWG DOM text nodes (nodeType 3) have a tag name `#text`,\nthese are referred to by the simpler string 'TEXT' for convenience.\n\n**Kind**: global typedef  \n**Example**  \n```js\n'DIV'\n'H1'\n'TEXT'\n```\n\u003ca name=\"Regex\"\u003e\u003c/a\u003e\n\n## Regex : \u003ccode\u003estring\u003c/code\u003e\nA string which is compiled to a case-insensitive regular expression `new RegExp(regex, 'i')`.\nThe regular expression is used to match a [Tagname](#Tagname).\n\n**Kind**: global typedef  \n**Example**  \n```js\n'.*'           // matches any tag\n'DIV'          // matches DIV\n'(DIV|H[1-3])' // matches DIV, H1, H2 and H3\n'P'            // matches P and SPAN\n'^P$'          // matches P but not SPAN\n'TEXT'         // matches text nodes (nodeType 3)\n```\n\u003ca name=\"ParentChildSpec\"\u003e\u003c/a\u003e\n\n## ParentChildSpec : \u003ccode\u003eObject.\u0026lt;Regex, Array.\u0026lt;Regex\u0026gt;\u0026gt;\u003c/code\u003e\nProperty names are matched against a (direct or ancestral) parent node's [Tagname](#Tagname).\nAssociated values are matched against the current nodes [Tagname](#Tagname).\n\n**Kind**: global typedef  \n**Example**  \n```js\n{\n  '(DIV|SPAN)': ['H[1-3]', 'B'], // 
matches H1, H2, H3 and B within DIV or SPAN\n  'STRONG': ['.*'] // matches all tags within STRONG\n}\n```\n\u003ca name=\"TagAttributeNameSpec\"\u003e\u003c/a\u003e\n\n## TagAttributeNameSpec : \u003ccode\u003eObject.\u0026lt;Regex, Array.\u0026lt;Regex\u0026gt;\u0026gt;\u003c/code\u003e\nProperty names are matched against the current nodes [Tagname](#Tagname). Associated values are\nused to match its attribute names.\n\n**Kind**: global typedef  \n**Example**  \n```js\n{\n  'H[1-3]': ['id', 'class'], // matches 'id' and 'class' attributes of all H1, H2 and H3 nodes\n  'STRONG': ['data-.*'] // matches all 'data-.*' attributes of STRONG nodes.\n}\n```\n\u003ca name=\"TagClassNameSpec\"\u003e\u003c/a\u003e\n\n## TagClassNameSpec : \u003ccode\u003eObject.\u0026lt;Regex, Array.\u0026lt;Regex\u0026gt;\u0026gt;\u003c/code\u003e\nProperty names are matched against the current nodes [Tagname](#Tagname). Associated values are used\nto match its class names.\n\n**Kind**: global typedef  \n**Example**  \n```js\n{\n  'DIV|SPAN': ['blue', 'red'] // matches 'blue' and 'red' class names of all DIV and SPAN nodes\n}\n```\n\u003ca name=\"FilterSpec\"\u003e\u003c/a\u003e\n\n## FilterSpec : \u003ccode\u003eObject.\u0026lt;Regex, Array.\u0026lt;filter\u0026gt;\u0026gt;\u003c/code\u003e\nProperty names are matched against node [Tagname](#Tagname)s. Associated values\nare the [filter](#filter)s which are run on the node.\n\n**Kind**: global typedef  \n\u003ca name=\"filter\"\u003e\u003c/a\u003e\n\n## filter ⇒ [\u003ccode\u003eDomNode\u003c/code\u003e](#DomNode) \\| [\u003ccode\u003eArray.\u0026lt;DomNode\u0026gt;\u003c/code\u003e](#DomNode) \\| \u003ccode\u003enull\u003c/code\u003e\nFilter functions can either...\n\n1. return the same node (the first argument),\n2. return a single, or an Array of, newly created [DomNode](#DomNode)(s), in which case `node` is\nreplaced with the new node(s),\n3. 
return `null`, in which case `node` is removed.\n\nNote that newly generated [DomNode](#DomNode)(s) are processed by running [sanitizeDom](#sanitizeDom)\non them, as if they had been part of the original tree. This has the following implication:\n\nIf a filter returns a newly generated [DomNode](#DomNode) with the same [Tagname](#Tagname) as `node`, it\nwould cause the same filter to be called again, which may lead to an infinite loop if the filter\nalways returns the same result (this would be a badly behaved filter). To protect against\ninfinite loops, the author of the filter must acknowledge this circumstance by setting a boolean\nproperty called 'skip_filters' for the [DomNode](#DomNode) (in a `WeakMap` which the caller must\nprovide to one of the sanitize functions as the argument `nodePropertyMap`). If 'skip_filters' is\nnot set, an error is thrown. With well-behaved filters it is possible to continue subsequent\nprocessing of the returned node without causing an infinite loop.\n\n**Kind**: global typedef  \n\n| Param | Type | Description |\n| --- | --- | --- |\n| node | [\u003ccode\u003eDomNode\u003c/code\u003e](#DomNode) | Currently processed node |\n| opts | \u003ccode\u003eObject\u003c/code\u003e |  |\n| opts.parents | [\u003ccode\u003eArray.\u0026lt;DomNode\u0026gt;\u003c/code\u003e](#DomNode) | The parent nodes of `node`. |\n| opts.parentNodenames | [\u003ccode\u003eArray.\u0026lt;Tagname\u0026gt;\u003c/code\u003e](#Tagname) | The tag names of the parent nodes |\n| opts.siblingIndex | \u003ccode\u003eInteger\u003c/code\u003e | The number of the current node amongst its siblings |\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmichaelfranzl%2Fsanitize-dom","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmichaelfranzl%2Fsanitize-dom","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmichaelfranzl%2Fsanitize-dom/lists"}