{"id":23529919,"url":"https://github.com/f34nk/elixir_html_tools","last_synced_at":"2026-03-11T11:37:12.054Z","repository":{"id":213514575,"uuid":"123902167","full_name":"f34nk/elixir_html_tools","owner":"f34nk","description":"Overview of available html tools in Elixir","archived":false,"fork":false,"pushed_at":"2024-02-14T11:11:15.000Z","size":4574,"stargazers_count":6,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-04-22T17:28:39.352Z","etag":null,"topics":["benchmark","css","elixir","html","html-parser","html-parser-library","tools"],"latest_commit_sha":null,"homepage":"https://elixirforum.com/t/overview-of-available-html-tools-in-elixir/12905","language":"HTML","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/f34nk.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-03-05T10:17:39.000Z","updated_at":"2024-02-14T11:11:19.000Z","dependencies_parsed_at":"2023-12-21T13:37:02.585Z","dependency_job_id":"29f19106-b6f3-4c2f-8316-5e3d8e285e5e","html_url":"https://github.com/f34nk/elixir_html_tools","commit_stats":null,"previous_names":["f34nk/elixir_html_tools"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/f34nk/elixir_html_tools","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/f34nk%2Felixir_html_tools","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/f34nk%2Felixir_html_tools/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/f34nk%2Felixir_html_tools/releases","manifes
ts_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/f34nk%2Felixir_html_tools/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/f34nk","download_url":"https://codeload.github.com/f34nk/elixir_html_tools/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/f34nk%2Felixir_html_tools/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":30380000,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-11T06:09:32.197Z","status":"ssl_error","status_checked_at":"2026-03-11T06:09:17.086Z","response_time":84,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["benchmark","css","elixir","html","html-parser","html-parser-library","tools"],"created_at":"2024-12-25T21:14:16.867Z","updated_at":"2026-03-11T11:37:12.039Z","avatar_url":"https://github.com/f34nk.png","language":"HTML","funding_links":[],"categories":[],"sub_categories":[],"readme":"# This project is not maintained anymore\n\n*I do not intend to give a complete analysis here. 
If something is missing or plain wrong, send me a message, join the [forum discussion](https://elixirforum.com/t/overview-of-available-html-tools-in-elixir/12905), create an **Issue**, or submit a **PR**.*\n\n# Html tools in Elixir\n\nThe landscape of available Elixir packages for html tooling is small and easy to survey, and the packages are correspondingly focused. Each library serves a distinct use case.\n\n|   | **[Floki](https://github.com/philss/floki)** | **[Meeseeks](https://github.com/mischov/meeseeks)** | **[Myhtmlex](https://github.com/Overbryd/myhtmlex)** | **[ModestEx](https://github.com/f34nk/modest_ex)** | **[TidyEx](https://github.com/f34nk/tidy_ex)** | **[HtmlSanitizeEx](https://github.com/rrrene/html_sanitize_ex)** |\n|  :------ | :------ | :------ | :------ | :------ | :------ | :------ |\n|  **First Commit** | Nov 2014 | Feb 2017 | Aug 2017 | Feb 2018 | Apr 2018 | Jul 2015 |\n|  **HTML5 compliant** | no with default parser; yes with [html5ever](https://github.com/servo/html5ever) parser (*) | yes with [meeseeks_html5ever](https://github.com/mischov/meeseeks_html5ever) (*) | yes, as a binding to [myhtml](https://github.com/lexborisov/myhtml) | yes, as a binding to [Modest](https://github.com/lexborisov/Modest) library | yes, as a binding to [tidy-html5](https://github.com/htacg/tidy-html5) library | [no](https://github.com/rrrene/html_sanitize_ex/blob/master/lib/html_sanitize_ex/scrubber/html5.ex#L2) |\n|  **Can parse XML** |  | yes |  |  | | |\n|  **Supports XPath selectors** |  | yes |  |  | | |\n|  **Supports common CSS selectors** | yes (22) | yes (27) |  | yes (36) | | |\n|  **Supports custom CSS selectors** | [non-standard selector implemented](https://github.com/philss/floki#supported-selectors) | yes, flexible [API for custom selectors](https://github.com/mischov/meeseeks#custom-selectors) |  | [non-standard selector implemented](https://github.com/f34nk/modest_ex/blob/master/SELECTORS.md) | | |\n|  **Can manipulate nodes** 
| yes, but limited |  |  | yes | | |\n|  **Parser return type** | `{tag_name, attributes, children_nodes}` | `Meeseeks.Document` | `{tag_name, attributes, children_nodes}` | `String` | `String` | `String` |\n|  **Use Case** | parse and select | supports HTML and XML; custom selectors; CSS and XPath | fast HTML decode/encode | pipeable string transformations; provides 16 functions to manipulate HTML | corrects and cleans up HTML content by fixing markup errors | sanitizes user input |\n\n(*) There is also a **separate** benchmark available for [Meeseeks vs. Floki Performance](https://github.com/mischov/meeseeks_floki_bench).\n\n\n## Test\n\n\tgit clone\n\tmix deps.get\n\nThe `test` folder contains examples of the library features side by side.\n\n\tmix test\n\n## Benchmark\n\nTested versions:\n\n```\n{:floki, \"~\u003e 0.20.0\"}\n{:meeseeks, \"0.7.6\"}\n{:myhtmlex, \"~\u003e 0.2.0\"}\n{:modest_ex, \"~\u003e 1.0.3\"}\n{:tidy_ex, \"~\u003e 1.0.0\"}\n{:html_sanitize_ex, \"~\u003e 1.3.0-rc3\"}\n```\n\nRun benchmarks with:\n\n\tMIX_ENV=prod mix bench\n\nand\n\n\tMIX_ENV=prod mix benchee\n\nOn my AMD FX-8300 Eight-Core Processor with 15 GB RAM running Ubuntu 14.04, the benchmarks look something like this:\n\n```\n## FlokiParseBench\nbench iterations   average time \n0.2k       50000   50.18 µs/op\n0.5k       20000   86.37 µs/op\n1k          5000   304.72 µs/op\n2k          5000   654.28 µs/op\n5k          1000   1585.65 µs/op\n10k          500   3843.19 µs/op\n50k          100   16846.18 µs/op\n100k          50   31044.22 µs/op\n200k          20   80808.60 µs/op\n350k          10   209489.90 µs/op\n\n## MeeseeksParseBench\nbench iterations   average time \n0.2k       20000   74.05 µs/op\n0.5k       20000   78.40 µs/op\n1k          5000   722.47 µs/op\n2k          1000   1525.72 µs/op\n5k          1000   2733.66 µs/op\n10k          500   4770.79 µs/op\n50k          100   11930.73 µs/op\n100k         100   18903.71 µs/op\n200k          50   31757.00 µs/op\n350k          50   60043.98 
µs/op\n\n## MyhtmlexParseBench\nbench iterations   average time \n0.5k        5000   401.32 µs/op\n0.2k        5000   412.80 µs/op\n1k          5000   515.46 µs/op\n2k          5000   737.43 µs/op\n5k          1000   1021.32 µs/op\n10k         1000   1644.85 µs/op\n50k         1000   2944.80 µs/op\n100k         500   4749.36 µs/op\n200k         200   7786.63 µs/op\n350k         100   18435.59 µs/op\n\n## ModestExParseBench\nbench iterations   average time \n1k         10000   181.77 µs/op\n0.2k       10000   216.83 µs/op\n0.5k       10000   221.71 µs/op\n2k          5000   319.47 µs/op\n5k          5000   353.81 µs/op\n10k         5000   731.99 µs/op\n50k         1000   1599.91 µs/op\n100k        1000   2951.25 µs/op\n200k         500   5285.43 µs/op\n350k         100   11944.52 µs/op\n\n## TidyExParseBench\nbench iterations   average time \n0.2k       10000   173.74 µs/op\n0.5k       10000   201.40 µs/op\n1k          5000   307.77 µs/op\n2k          5000   442.71 µs/op\n5k          1000   1452.07 µs/op\n10k         1000   2687.98 µs/op\n50k          200   8373.23 µs/op\n100k         100   10168.21 µs/op\n200k         100   19607.18 µs/op\n\n## HtmlSanitizeExParseBench\nbench iterations   average time \n0.2k       10000   173.68 µs/op\n0.5k       10000   227.71 µs/op\n1k          2000   765.60 µs/op\n2k          1000   1791.06 µs/op\n5k           500   3970.00 µs/op\n10k          200   9017.30 µs/op\n50k           50   39859.24 µs/op\n100k          20   75973.80 µs/op\n200k          10   178685.10 µs/op\n```\n\n## Conclusions\n\nThe ecosystem of tools is still quite young. 
There is more to come.\n\nAs [mentioned in the forum](https://elixirforum.com/t/html-tools-in-elixir/12905/16): in this test, Floki does **not** use the HTML5-compliant parser, since it is not supported by the latest Erlang version.\n\nNonetheless, a very rough user guideline could be:\n\nIf you are looking for parsing speed on *smallish* (up to 1kB) html strings, `Floki` and `Meeseeks` are the fastest.\n\n`Floki` offers all common CSS selectors and some [limited features](https://hexdocs.pm/floki/Floki.html#map/2) to manipulate nodes.\n\n`Meeseeks` provides a [flexible API for custom selectors](https://github.com/mischov/meeseeks#custom-selectors). It can also parse **XML** and supports **XPath** selectors.\n\nIf you are looking for consistently good performance across many file sizes, you can use `Myhtmlex`. With it you can encode and decode html super fast.\n\nHowever, if you need to do complex manipulations on the html string, you can use `ModestEx`. With it you get [**36**](https://github.com/f34nk/modest_ex/blob/master/SELECTORS.md) CSS selectors and [**16**](https://github.com/f34nk/modest_ex/blob/master/FEATURES.md) methods to transform html strings.\n\nFor html5 spec accuracy or user input sanitization there are [TidyEx](https://github.com/f34nk/tidy_ex) and [HtmlSanitizeEx](https://github.com/rrrene/html_sanitize_ex).\n\nAll in all, I would say the focused nature of the tools makes it easy for the user to pick the right tool for the job.\n\nBest, f34nk\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ff34nk%2Felixir_html_tools","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ff34nk%2Felixir_html_tools","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ff34nk%2Felixir_html_tools/lists"}