{"id":22280913,"url":"https://github.com/owasp/java-html-sanitizer","last_synced_at":"2025-04-29T18:32:33.366Z","repository":{"id":31118259,"uuid":"34677782","full_name":"OWASP/java-html-sanitizer","owner":"OWASP","description":"Takes third-party HTML and produces HTML that is safe to embed in your web application.  Fast and easy to configure.","archived":false,"fork":false,"pushed_at":"2024-09-19T16:18:55.000Z","size":28483,"stargazers_count":888,"open_issues_count":134,"forks_count":220,"subscribers_count":33,"default_branch":"main","last_synced_at":"2025-04-09T22:09:29.933Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":"francogarcia/GD-PDF","license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/OWASP.png","metadata":{"files":{"readme":"README.md","changelog":"change_log.md","contributing":null,"funding":null,"license":"COPYING","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2015-04-27T16:26:27.000Z","updated_at":"2025-04-07T14:57:35.000Z","dependencies_parsed_at":"2024-01-15T18:24:27.899Z","dependency_job_id":"d9f25e41-525b-4bc7-a9af-f2615915ffbf","html_url":"https://github.com/OWASP/java-html-sanitizer","commit_stats":null,"previous_names":[],"tags_count":25,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/OWASP%2Fjava-html-sanitizer","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/OWASP%2Fjava-html-sanitizer/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/OWASP%2Fjava-html-sanitizer/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/OWASP%2Fjava-htm
l-sanitizer/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/OWASP","download_url":"https://codeload.github.com/OWASP/java-html-sanitizer/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":251560200,"owners_count":21609161,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-12-03T16:09:57.614Z","updated_at":"2025-04-29T18:32:33.348Z","avatar_url":"https://github.com/OWASP.png","language":"Java","funding_links":[],"categories":[],"sub_categories":[],"readme":"# OWASP Java HTML Sanitizer\n\n[![Java CI with Maven](https://github.com/OWASP/java-html-sanitizer/actions/workflows/maven.yml/badge.svg)](https://github.com/OWASP/java-html-sanitizer/actions/workflows/maven.yml) [![Coverage Status](https://coveralls.io/repos/github/OWASP/java-html-sanitizer/badge.svg?branch=main)](https://coveralls.io/github/OWASP/java-html-sanitizer?branch=main) [![CII Best Practices](https://bestpractices.coreinfrastructure.org/projects/2602/badge)](https://bestpractices.coreinfrastructure.org/projects/2602) [![Maven Central](https://maven-badges.herokuapp.com/maven-central/com.googlecode.owasp-java-html-sanitizer/owasp-java-html-sanitizer/badge.png?style=plastic)](https://search.maven.org/artifact/com.googlecode.owasp-java-html-sanitizer/owasp-java-html-sanitizer)\n\n\nA fast and easy to configure HTML Sanitizer written in Java which lets\nyou include HTML authored by third-parties in your web application while\nprotecting against XSS.\n\nThe existing dependency is on JSR 305. 
The other jars\nare only needed by the test suite.  The JSR 305 dependency is a\ncompile-only dependency, only needed for annotations.\n\nThis code was written with security best practices in mind, has an\nextensive test suite, and has undergone\n[adversarial security review](docs/attack_review_ground_rules.md).\n\n## Table Of Contents\n\n*  [Getting Started](#getting-started)\n*  [Prepackaged Policies](#prepackaged-policies)\n*  [Crafting a policy](#crafting-a-policy)\n*  [Custom policies](#custom-policies)\n*  [Preprocessors](#preprocessors)\n*  [Telemetry](#telemetry)\n*  [Questions\\?](#questions)\n*  [Contributing](#contributing)\n*  [Credits](#credits)\n\n## Getting Started\n\n[Getting Started](docs/getting_started.md) includes instructions on\nhow to get started with or without Maven.\n\n## Prepackaged Policies\n\nYou can use\n[prepackaged policies](https://static.javadoc.io/com.googlecode.owasp-java-html-sanitizer/owasp-java-html-sanitizer/20240325.1/org/owasp/html/Sanitizers.html):\n\n```Java\nPolicyFactory policy = Sanitizers.FORMATTING.and(Sanitizers.LINKS);\nString safeHTML = policy.sanitize(untrustedHTML);\n```\n\n## Crafting a policy\n\nThe\n[tests](https://github.com/OWASP/java-html-sanitizer/blob/main/src/test/java/org/owasp/html/HtmlPolicyBuilderTest.java)\nshow how to configure your own\n[policy](https://static.javadoc.io/com.googlecode.owasp-java-html-sanitizer/owasp-java-html-sanitizer/20240325.1/org/owasp/html/HtmlPolicyBuilder.html):\n\n```Java\nPolicyFactory policy = new HtmlPolicyBuilder()\n    .allowElements(\"a\")\n    .allowUrlProtocols(\"https\")\n    .allowAttributes(\"href\").onElements(\"a\")\n    .requireRelNofollowOnLinks()\n    .toFactory();\nString safeHTML = policy.sanitize(untrustedHTML);\n```\n\n## Custom Policies\n\nYou can write\n[custom policies](https://static.javadoc.io/com.googlecode.owasp-java-html-sanitizer/owasp-java-html-sanitizer/20240325.1/org/owasp/html/ElementPolicy.html)\nto do things like changing `h1`s to 
`div`s with a certain class:\n\n```Java\nPolicyFactory policy = new HtmlPolicyBuilder()\n    .allowElements(\"p\")\n    .allowElements(\n        (String elementName, List\u003cString\u003e attrs) -\u003e {\n          // Add a class attribute.\n          attrs.add(\"class\");\n          attrs.add(\"header-\" + elementName);\n          // Return elementName to include, null to drop.\n          return \"div\";\n        }, \"h1\", \"h2\", \"h3\", \"h4\", \"h5\", \"h6\")\n    .toFactory();\nString safeHTML = policy.sanitize(untrustedHTML);\n```\n\nPlease note that the elements \"a\", \"font\", \"img\", \"input\" and \"span\"\nneed to be explicitly whitelisted using the `allowWithoutAttributes()`\nmethod if you want them to be allowed through the filter when these\nelements do not include any attributes.\n\n[Attribute policies](https://static.javadoc.io/com.googlecode.owasp-java-html-sanitizer/owasp-java-html-sanitizer/20240325.1/org/owasp/html/AttributePolicy.html) allow running custom code too.  
Adding an attribute policy will not water down any default policy like `style` or URL attribute checks.\n\n```Java\nPolicyFactory policy = new HtmlPolicyBuilder()\n    .allowElements(\"div\", \"span\")\n    .allowAttributes(\"data-foo\")\n        .matching(\n            (String elementName, String attributeName, String value) -\u003e {\n              // Return value for the attribute or null to drop.\n              return value;\n            })\n        .onElements(\"div\", \"span\")\n    .toFactory();\n```\n\n## Preprocessors\n\nPreprocessors allow inserting text and large scale structural changes.\n\n```Java\nPolicyFactory policy = new HtmlPolicyBuilder()\n    // Use a preprocessor to be backwards compatible with the\n    // \u003cplaintext\u003e element which \n    .withPreprocessor(\n        (HtmlStreamEventReceiver r) -\u003e {\n          // Provide user with info about links before they click.\n          // Before:                       \u003ca href=\"https://example.com/...\"\u003e\n          // After:  (https://example.com) \u003ca href=\"https://example.com/...\"\u003e\n          return new HtmlStreamEventReceiverWrapper(r) {\n            @Override public void openTag(String elementName, List\u003cString\u003e attrs) {\n              if (\"a\".equals(elementName)) {\n                for (int i = 0, n = attrs.size(); i \u003c n; i += 2) {\n                  if (\"href\".equals(attrs.get(i))) {\n                    String url = attrs.get(i + 1);\n                    String origin;\n                    try {\n                      URI uri = new URI(url);\n                      String scheme = uri.getScheme();\n                      String authority = uri.getRawAuthority();\n                      if (scheme == null \u0026\u0026 authority == null) {\n                        origin = null;\n                      } else {\n                        origin = (scheme != null ? scheme + \":\" : \"\")\n                               + (authority != null ? 
\"//\" + authority : \"\");\n                      }\n                    } catch (URISyntaxException ex) {\n                      origin = \"about:invalid\";\n                    }\n                    if (origin != null) {\n                      text(\" (\" + origin + \") \");\n                    }\n                  }\n                }\n              }\n              super.openTag(elementName, attrs);\n            }\n          };\n        })\n    .allowElements(\"a\")\n    ...\n    .toFactory();\n```\n\nPreprocessing happens before a policy is applied, so it cannot affect the security\nof the output.\n\n## Telemetry\n\nWhen a policy rejects an element or attribute it notifies an [HtmlChangeListener](https://static.javadoc.io/com.googlecode.owasp-java-html-sanitizer/owasp-java-html-sanitizer/20240325.1/org/owasp/html/HtmlChangeListener.html).\n\nYou can use this to keep track of policy violation trends and find out when someone\nis making an effort to breach your security.\n\n```Java\nPolicyFactory myPolicyFactory = ...;\n// If you need to associate reports with some context, you can do so.\nMyContextClass myContext = ...;\n\nString sanitizedHtml = myPolicyFactory.sanitize(\n    unsanitizedHtml,\n    new HtmlChangeListener\u003cMyContextClass\u003e() {\n      @Override\n      public void discardedTag(MyContextClass context, String elementName) {\n        // ...\n      }\n      @Override\n      public void discardedAttributes(\n          MyContextClass context, String elementName, String... attributeNames) {\n        // ...\n      }\n    },\n    myContext);\n```\n\n**Note**: If a string sanitizes with no change notifications, that does not\nmean the input string is safe to use. 
Only use the output of the sanitizer.\n\nThe sanitizer ensures that the output is in a sub-set of HTML that commonly\nused HTML parsers will agree on the meaning of, but the absence of\nnotifications does not mean that the input is in such a sub-set,\nonly that it does not contain elements or attributes that were removed.\n\nSee [\"Why sanitize when you can validate\"](https://github.com/OWASP/java-html-sanitizer/blob/main/docs/html-validation.md) for more on this topic.\n\n## Questions?\n\nIf you wish to report a vulnerability, please see\n[AttackReviewGroundRules](docs/attack_review_ground_rules.md).\n\nSubscribe to the\n[mailing list](http://groups.google.com/group/owasp-java-html-sanitizer-support)\nto be notified of known [Vulnerabilities](docs/vulnerabilities.md) and important updates.\n\n## Contributing\n\nIf you would like to contribute, please ping [@mvsamuel](https://twitter.com/mvsamuel) or [@manicode](https://twitter.com/manicode).\n\nWe welcome [issue reports](https://github.com/OWASP/java-html-sanitizer/issues) and PRs.\nPRs that change behavior or that add functionality should include both positive and\n[negative tests](https://www.guru99.com/negative-testing.html).\n\nPlease be aware that contributions fall under the [Apache 2.0 License](https://github.com/OWASP/java-html-sanitizer/blob/main/COPYING).\n\n## Credits\n\n[Thanks to everyone who has helped with criticism and code](docs/credits.md)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fowasp%2Fjava-html-sanitizer","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fowasp%2Fjava-html-sanitizer","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fowasp%2Fjava-html-sanitizer/lists"}