{"id":18331588,"url":"https://github.com/angrycoding/shallow-xml","last_synced_at":"2026-01-22T21:03:34.711Z","repository":{"id":148799891,"uuid":"9704329","full_name":"angrycoding/shallow-xml","owner":"angrycoding","description":"XML parser that allows you to parse or extract fragments from invalid xml / html documents","archived":false,"fork":false,"pushed_at":"2013-04-26T23:34:14.000Z","size":124,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-04-09T18:54:16.154Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/angrycoding.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2013-04-26T21:08:10.000Z","updated_at":"2014-03-04T07:08:22.000Z","dependencies_parsed_at":"2023-03-29T19:47:55.264Z","dependency_job_id":null,"html_url":"https://github.com/angrycoding/shallow-xml","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/angrycoding/shallow-xml","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/angrycoding%2Fshallow-xml","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/angrycoding%2Fshallow-xml/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/angrycoding%2Fshallow-xml/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/angrycoding%2Fshallow-xml/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/angrycoding","download_url":"https://codeload.github.com/angrycoding/shallow-
xml/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/angrycoding%2Fshallow-xml/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28671201,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-22T20:48:19.482Z","status":"ssl_error","status_checked_at":"2026-01-22T20:48:14.968Z","response_time":144,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-05T19:33:37.911Z","updated_at":"2026-01-22T21:03:34.674Z","avatar_url":"https://github.com/angrycoding.png","language":null,"funding_links":[],"categories":[],"sub_categories":[],"readme":"Shallow XML parser for Node.js\n===========\n\nThis parser lets you parse XML documents containing fragments that are\nimpossible to parse using \"classical\" parsing methods. For instance, suppose you have an\narbitrary HTML document provided by a third party. The only thing you know about\nthis document is that it contains \"something\" inside the `\u003cbody\u003e\u003c/body\u003e` tag. You\nhave no idea whether it's valid, but your task is to extract this data and use it\nfor something else. 
If the document is invalid, you won't be able to parse it\nusing a classical XML parser.\n\nInstead of parsing the classical way, shallow-xml uses a technique called\n[shallow parsing](http://en.wikipedia.org/wiki/Shallow_parsing). Using regular expressions,\nit converts the original XML document into a list of tokens, trying to extract as much\nuseful information as possible, based on a known structure defined by the developer.\nTake a look at the following XML document (an OpenSocial gadget definition):\n\n```xml\n\u003c?xml version=\"1.0\" encoding=\"UTF-8\"?\u003e\n\u003cModule\u003e\n  \u003cModulePrefs title=\"Proxied Content Example\" /\u003e\n  \u003cContent view=\"home\"\u003e\n    \u003cul\u003e\n      \u003cli\u003e111\n      \u003cli\u003e222\n      \u003cli\u003e333\n    \u003c/ul\u003e\n  \u003c/Content\u003e\n  \u003cContent view=\"home...not\"\u003e\n    \u003cdiv\u003e\n      \u003cspan\u003e\n        \u003cstrong\u003e\n          hello\n  \u003c/Content\u003e\n\u003c/Module\u003e\n```\n\nAs you can see, this XML document cannot be parsed using a classical XML parser,\nbecause the HTML code placed inside the `\u003cContent\u003e\u003c/Content\u003e` tag is invalid.\nHowever, this document is not a problem for the shallow parser: just define the known structure.\nBy setting up the structure, you're saying that only the listed tags are processed during parsing:\n\n```javascript\n// create parser instance\nvar GadgetParser = new Parser([\n  // parse only listed tags\n  'Module', 'ModulePrefs', 'Content'\n]);\n// parse xml document\nvar gadgetDoc = GadgetParser.parse(xml);\n// output xml document\nconsole.info(gadgetDoc);\n```\n\n## Parser API ##\n\n```javascript\n// find Module element\nvar moduleEl = gadgetDoc('Module');\n// find ModulePrefs element inside Module element\nvar modulePrefsEl = moduleEl('ModulePrefs');\n// extract @title attribute of ModulePrefs element\nconsole.info(modulePrefsEl('@title'));\n// find Content elements inside Module element\nvar 
contentEls = moduleEl('Content');\n// extract @view attribute of first Content element (index 0)\nconsole.info(contentEls(0)('@view'));\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fangrycoding%2Fshallow-xml","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fangrycoding%2Fshallow-xml","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fangrycoding%2Fshallow-xml/lists"}