{"id":21842648,"url":"https://github.com/brookinsconsulting/xmlwash","last_synced_at":"2025-06-14T04:06:40.424Z","repository":{"id":140126513,"uuid":"104848545","full_name":"brookinsconsulting/xmlwash","owner":"brookinsconsulting","description":"eZ Publish Legacy operator allows to clean in a clever way an xhtml field. It's mostly a wrapper around the safehtml library","archived":false,"fork":false,"pushed_at":"2017-09-26T08:54:22.000Z","size":30,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-06-05T04:12:59.404Z","etag":null,"topics":["extension","ezpublish","ezpublishlegacy","php","wash","xml"],"latest_commit_sha":null,"homepage":"http://projects.ez.no/xmlwash","language":"PHP","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/brookinsconsulting.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2017-09-26T07:05:06.000Z","updated_at":"2017-09-27T18:14:38.000Z","dependencies_parsed_at":null,"dependency_job_id":"15d46dd8-c5a9-464f-8315-ec6d70216189","html_url":"https://github.com/brookinsconsulting/xmlwash","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/brookinsconsulting/xmlwash","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/brookinsconsulting%2Fxmlwash","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/brookinsconsulting%2Fxmlwash/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/brookinsconsulting%2Fxmlwash/r
eleases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/brookinsconsulting%2Fxmlwash/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/brookinsconsulting","download_url":"https://codeload.github.com/brookinsconsulting/xmlwash/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/brookinsconsulting%2Fxmlwash/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":259756871,"owners_count":22906678,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["extension","ezpublish","ezpublishlegacy","php","wash","xml"],"created_at":"2024-11-27T22:12:42.770Z","updated_at":"2025-06-14T04:06:40.400Z","avatar_url":"https://github.com/brookinsconsulting.png","language":"PHP","funding_links":[],"categories":[],"sub_categories":[],"readme":"# XML Wash\n\nXML Wash is an operator that cleans an xhtml field in a clever way. Its main goal is twofold:\n\n- allow using the shorten operator and still get a valid result\n- allow using the literal class=html while staying safe from XSS attacks\n\n# Shorten Problem\n\nTake the line template (tpl) of an article: it displays the title and the introduction. Unfortunately, there isn't a simple way to limit the length of that introduction, and some authors have a rather extensive interpretation of it (bless them, some put the complete article into the introduction). 
You can go the hard way and try to make them understand that introduction means, err, an introduction, or do it the easy way and truncate the introduction to a length you find normal.\n\nUnfortunately, the shorten operator doesn't work with xml content, as it might cut the text between an opening tag (eg \u003cb\u003e or \u003cdiv\u003e) and its closing tag (\u003c/b\u003e and \u003c/div\u003e) or in the middle of a tag (eg '\u003ca hre'). If you use xhtml+css positioning for your layout (you should), you are going to get some really funky results. No good.\n\nFor instance, if your output text is \"Xavier has \u003cb\u003ereally\u003c/b\u003e simple examples.\", then\n\n    {$node.data_map.intro.content.output.output_text|shorten(22)} \u003ca href=\"...\"\u003eRead more\u003c/a\u003e\n\nwill render \"Read more\" in bold, because the cut leaves the \u003cb\u003e tag unclosed.\n\n(Note: to make it funnier, you have to add 4 to the length because the output starts with \u003cp\u003e plus a newline.)\n\nIt can also stop in the middle of a tag (\"Xavier has \u003cb\"), and so on.\n\nMoreover, you might want to \"clean\" the intro: for instance, you don't want to display headings (hn) or embedded files in the line view. In other words, you need to limit which tags can appear in the intro (eg only keep \u003ci\u003e and \u003cb\u003e).\n\n# Security problem\n\nAs soon as you need a minimally complex layout in an article, or want to be able to paste html code from elsewhere, you need to allow the html class in the literal tag. Once it is allowed, however, any editor can inject XSS code (javascript attacks...). 
eZ's solution to this security risk has been to disable it by default (settings/content.ini):\n\n```\n[literal]\nAvailableClasses[]\n# The class 'html' is disabled by default because it gives editors the\n# possibility to insert html and javascript code in XML blocks.\n# Don't enable the 'html' class unless you really trust all users who has\n# privileges to edit objects containing XML blocks.\n#AvailableClasses[]=html\n```\n\nThis is unfortunately a rather expensive option, as you end up overriding templates to allow the html code you want on specific pages instead of just doing that from an xml block.\n\n# Solution\n\nThe Safe HTML library is very good at cleaning the input and getting rid of all these security problems.\n\nhttp://pixel-apes.com/safehtml/\n\nAs a positive side effect, it also cleans the generated xhtml (for instance, missing closing tags), which makes it possible to shorten without problems.\n\n# How to use?\n\nThe extension overrides the default template used by {attribute_view_gui} for the xml fields. 
You can now safely allow the html class for literals.\n\nYou have a new maxlength parameter:\n\n    {attribute_view_gui attribute=$node.object.data_map.intro maxlength=42}\n\nUnder the hood it adds an xmlwash() operator, which you can also use directly:\n\n    {$node.data_map.intro.content.output.output_text|shorten(42)|xmlwash()}\n\n(Obviously, xmlwash has to be the last operator called.)\n\nThis extension also offers a strip_tags operator (same syntax as the php version).\n\nIf you want to keep only the paragraphs, italics and links:\n\n    {$node.data_map.intro.content.output.output_text|strip_tags(array('\u003cp\u003e','\u003ci\u003e','\u003ca\u003e'))|shorten(42)|xmlwash()}\n\n# Known bugs and limitations\n\nNone, but feel free to send me a mail (ez AT sydesy DOT com) if you find one.\n\nAs for the limitations, they are mine: I didn't manage to use the svn here, so I used pubsvn ;(\n\n# Screenshot\n\nYour mother was right: you need to wash your xml before showing it\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbrookinsconsulting%2Fxmlwash","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbrookinsconsulting%2Fxmlwash","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbrookinsconsulting%2Fxmlwash/lists"}