{"id":15559699,"url":"https://github.com/felixschwarz/htmlcompare","last_synced_at":"2025-04-09T16:51:49.262Z","repository":{"id":62569577,"uuid":"270980722","full_name":"FelixSchwarz/htmlcompare","owner":"FelixSchwarz","description":"library to compare HTML while ignoring non-functional differences","archived":false,"fork":false,"pushed_at":"2023-05-24T06:52:07.000Z","size":49,"stargazers_count":2,"open_issues_count":1,"forks_count":1,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-03-23T18:54:22.603Z","etag":null,"topics":["html5","python","python3","testing"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/FelixSchwarz.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-06-09T11:12:18.000Z","updated_at":"2023-05-23T17:01:39.000Z","dependencies_parsed_at":"2024-12-10T16:51:48.409Z","dependency_job_id":"f97d6dff-113e-4591-af6d-cfffb072a25f","html_url":"https://github.com/FelixSchwarz/htmlcompare","commit_stats":{"total_commits":44,"total_committers":1,"mean_commits":44.0,"dds":0.0,"last_synced_commit":"f29add5f319683917b6ba2e777bbf363906f72b2"},"previous_names":[],"tags_count":6,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/FelixSchwarz%2Fhtmlcompare","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/FelixSchwarz%2Fhtmlcompare/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/FelixSchwarz%2Fhtmlcompare/releases","manifests_
url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/FelixSchwarz%2Fhtmlcompare/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/FelixSchwarz","download_url":"https://codeload.github.com/FelixSchwarz/htmlcompare/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248073094,"owners_count":21043365,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["html5","python","python3","testing"],"created_at":"2024-10-02T15:55:41.258Z","updated_at":"2025-04-09T16:51:49.241Z","avatar_url":"https://github.com/FelixSchwarz.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"htmlcompare\n=============\n\nA Python library to ensure two HTML documents are \"equal\". Currently the functionality is very limited but the idea is that the library should ignore differences automatically when these are not relevant for HTML semantics (e.g. 
`\u003cimg style=\"\"\u003e` is the same as `\u003cimg\u003e`, `style=\"color: black; font-weight: bold\"` is equal to `style=\"font-weight:bold;color:black;\"`).\n\nUsage\n--------------\n\n```python\nimport htmlcompare\n\ndiff = htmlcompare.compare('\u003cdiv\u003e', '\u003cp\u003e')\nis_same = bool(diff)\n```\n\nTo ease testing the library provides some helpers\n\n```python\nfrom htmlcompare import assert_different_html, assert_same_html\n\nassert_different_html('\u003cbr\u003e', '\u003cp\u003e')\nassert_same_html('\u003cdiv /\u003e', '\u003cdiv\u003e\u003c/div\u003e')\n```\n\nImplemented Features\n----------------------\n\n- ignores whitespace between HTML tags\n- `\u003cdiv /\u003e` is treated like `\u003cdiv\u003e\u003c/div\u003e`\n- ordering of HTML attributes does not matter: `\u003cdiv class=\"…\" style=\"…\" /\u003e` is treated equal to `\u003cdiv style=\"…\" class=\"…\" /\u003e`\n- HTML comments are ignored (yes, also [conditional comments](https://en.wikipedia.org/wiki/Conditional_comment) unfortunately)\n- ordering of CSS classes inside `class` attribute does not matter: `\u003cdiv class=\"foo bar\" /\u003e` is the same as `\u003cdiv class=\"bar foo\" /\u003e`.\n- a `style` or `class` attribute with empty content (e.g. `style=\"\"`) is considered the same as an absent `style`/`class` attribute.\n- inline style declarations are parsed with an actual CSS parser: ordering, whitespace and trailing semicolons do not matter (Python 3.5+ only)\n- `0px` is considered equal to `0` in inline CSS.\n\n\nLimitations / Plans\n----------------------\n**Only basic CSS support**. Declarations in `style` attributes are parsed with [tinycss2](https://github.com/Kozea/tinycss2) (Python 3.5+) so ordering of declarations and extra whitespace should not matter. `tinycss2` does not support Python 2 and 3.4 so the only help here is to strip trailing `;`s in `style` attributes. 
Contents of `\u003cstyle\u003e` tags are completely ignored for now (even with `tinycss2`).\n\n**No validation of conditional comments**. I am not sure which library I can use here, but at some point I'll likely need this as well.\n\n**JavaScript** - for obvious reasons it will be impossible to implement perfect JS comparison, but it might be possible to run some kind of \"beautifier\" to take care of insignificant stylistic changes. However, I don't need this feature, so it is unlikely to be implemented (unless contributed by someone else).\n\n**Custom hooks** could help adapt the comparison to your specific needs. However, I don't know which API would be best, so this will wait until there are real-world use cases.\n\n**Better API**: The current API is very minimal and implements just what I needed right now. I hope to improve the API once I use this project in more complex scenarios.\n\n\nOther projects\n--------------\n[xmldiff](https://github.com/Shoobx/xmldiff) is a well-established project to compare two XML documents. However, its code does not seem to contain knowledge about specific HTML semantics (e.g. CSS, empty attributes, insignificant attribute order).\n\n\nMisc\n--------------\nThe code is licensed under the MIT license. It supports Python 2.7 and Python 3.4+, though some features are only available on Python 3.5+.\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffelixschwarz%2Fhtmlcompare","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffelixschwarz%2Fhtmlcompare","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffelixschwarz%2Fhtmlcompare/lists"}