{"id":23058562,"url":"https://github.com/pentesttoolscom/html-sanitizers","last_synced_at":"2026-02-02T16:35:26.005Z","repository":{"id":261576763,"uuid":"884715001","full_name":"pentesttoolscom/html-sanitizers","owner":"pentesttoolscom","description":"Research on fuzzing HTML sanitizers in popular programming languages","archived":false,"fork":false,"pushed_at":"2024-12-16T15:58:06.000Z","size":5292,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"master","last_synced_at":"2025-02-08T20:13:08.718Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/pentesttoolscom.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-11-07T09:02:24.000Z","updated_at":"2024-12-17T21:50:35.000Z","dependencies_parsed_at":"2024-11-07T10:29:59.524Z","dependency_job_id":null,"html_url":"https://github.com/pentesttoolscom/html-sanitizers","commit_stats":null,"previous_names":["pentesttoolscom/html-sanitizers"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pentesttoolscom%2Fhtml-sanitizers","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pentesttoolscom%2Fhtml-sanitizers/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pentesttoolscom%2Fhtml-sanitizers/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pentesttoolscom%2Fhtml-sanitizers/manifests","owner_url
":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/pentesttoolscom","download_url":"https://codeload.github.com/pentesttoolscom/html-sanitizers/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246944383,"owners_count":20858772,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-12-16T02:16:15.020Z","updated_at":"2026-02-02T16:35:20.982Z","avatar_url":"https://github.com/pentesttoolscom.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Server side sanitizers\nResearch on fuzzing HTML sanitizers in popular programming languages. You can read the accompanying presention [here](https://docs.google.com/presentation/d/1QrGNP6-8VD9hPVxJo6EoDH1eS4PfHmR5NyFhCKzdPr8/edit#slide=id.g213dfcddbe2_0_14).\n\n## Project Structure\n\nThe project is split in three parts:\n\n1. Under `web/` you will find all the code to deploy docker images of the sanitizers you want to fuzz.\n2. Under `fuzz/` you will find all the code used for fuzzing the sanitizers\n3. Under `github/` you will find a script to search through Github for repositories that use a vulnerable code pattern.\n\n## Build\n\nPlace your fuzz targets in a separate directory each under `web/docker/backends`. Each subfolder there will be used as\na separate docker image and reachable at `http://localhost/subdir-name` at runtime. Currently we use one subfolder per\nlanguage. 
Each sanitizer is reachable at its own route, `http://localhost/subdir-name/sanitizer-name`.\n\n## Fuzz\n\nThe `fuzz` directory contains a basic fuzzer that tries to inject control characters inside a given fuzz template. Each fuzz template\nis a file containing two sections:\n\n- `[template]` contains the text where the control characters will be injected. Each `_FUZZ_` string inside the text will be replaced\nwith a control character.\n- `[canary]` defines the string that we expect to find if the injection is successful.\n\nAn example is given in `fuzz/templates/example.fuzz`:\n```\n[template]\n\u003ca href='_FUZZ_java_FUZZ_script_FUZZ_:alert()'\u003eG\u003c/a\u003e\n\n[canary]\njavascript:alert()\n```\n\n\n## Search\n\n`github/search.py` is a poor man's REPL for searching GitHub for code: you give it a search query, it returns repos that\nmatch that query, sorted by the number of stars. For each repo, you get back: the name, the number of stars, and the exact\nline that matched. The search works just like the web interface, so you can give it modifiers like `path:*.py language:Python`.\nYou can find a reference [here](https://docs.github.com/en/search-github/github-code-search/understanding-github-code-search-syntax#about-code-search-query-structure).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpentesttoolscom%2Fhtml-sanitizers","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpentesttoolscom%2Fhtml-sanitizers","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpentesttoolscom%2Fhtml-sanitizers/lists"}