{"id":18006933,"url":"https://github.com/yne/html2json","last_synced_at":"2026-05-01T13:32:23.695Z","repository":{"id":238928831,"uuid":"798006358","full_name":"yne/html2json","owner":"yne","description":"Tiny HTML (XML) to JSON (JsonML) converter","archived":false,"fork":false,"pushed_at":"2024-12-15T13:57:51.000Z","size":40,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-04-04T11:47:09.707Z","etag":null,"topics":["html","html2json","json","jsonml","xhtml","xml","xml2json"],"latest_commit_sha":null,"homepage":"","language":"C","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/yne.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-05-08T22:54:54.000Z","updated_at":"2025-01-06T13:56:49.000Z","dependencies_parsed_at":"2024-05-08T23:44:07.270Z","dependency_job_id":"3c5ad17a-be47-4127-abd0-366a4d33e22f","html_url":"https://github.com/yne/html2json","commit_stats":null,"previous_names":["yne/html2json"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/yne/html2json","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yne%2Fhtml2json","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yne%2Fhtml2json/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yne%2Fhtml2json/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yne%2Fhtml2json/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/yne","download_url":
"https://codeload.github.com/yne/html2json/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yne%2Fhtml2json/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32499681,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-30T13:12:12.517Z","status":"online","status_checked_at":"2026-05-01T02:00:05.856Z","response_time":64,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["html","html2json","json","jsonml","xhtml","xml","xml2json"],"created_at":"2024-10-30T01:11:17.122Z","updated_at":"2026-05-01T13:32:23.675Z","avatar_url":"https://github.com/yne.png","language":"C","funding_links":[],"categories":[],"sub_categories":[],"readme":"Convert any XML/HTML to JsonML using [yxml](https://dev.yorhel.nl/yxml)\n\n## BUILD\n\n```sh\nmake html2json\n```\n\n## USAGE\n\n```sh\ncat test/basic.html | ./html2json | jq .[1].lang\n\"en\"\n# send json to a frontend (example: GTK)\ncurl https://news.ycombinator.com/rss | ./html2json | ./json2gtk\n```\n\n## FORMAT\n\n\u003ctable\u003e\n\u003ctr\u003e\u003ctd\u003e\n\n```html\n\u003c!DOCTYPE html\u003e\n\u003chtml lang=\"en\"\u003e\n  \u003chead\u003e\n    \u003cmeta charset=\"utf-8\" /\u003e\n    \u003ctitle\u003eBasic Example\u003c/title\u003e\n    \u003clink rel=\"stylesheet\" /\u003e\n  \u003c/head\u003e\n  \u003cbody id=\"home\"\u003e\n    \u003cinput type=\"text\"/\u003e\n    
\u003cp\u003econtent\u003c/p\u003e\n  \u003c/body\u003e\n\u003c/html\u003e\n```\n\n\u003c/td\u003e\u003ctd\u003e\n\n```jsonc\n// doctype is omitted\n[\"html\",{\"lang\":\"en\"},[\n    [\"head\", {}, [\n        [\"meta\", {\"charset\": \"utf-8\"} ],\n        [\"title\", {}, [\"Basic Example\"] ],\n        [\"link\", {\"rel\": \"stylesheet\"} ]\n    ]],\n    [\"body\", {\"id\": \"home\"}, [\n        [\"input\", {\"type\": \"text\"}],\n        [\"p\", {}, [\"content\"]]\n    ]]\n]]\n```\n\n\u003c/td\u003e\u003c/tr\u003e\n\u003c/table\u003e\n\n# HTML5 support (WIP)\n\nThe following changes extend yxml to support XHTML and HTML5:\n- [x] migrate `yxml_ret_t` to a bitfield enum so multiple states can be returned (example: parsing `\u003e` in `\u003cp hidden\u003e` will return `ATTREND|ELEMSTART`)\n- [x] accept lowercase `\u003c!doctype `\n- [x] read `\u003cscript\u003e`, `\u003cstyle\u003e` content as raw data until the matching closing tag is found\n- [ ] accept unquoted attribute values `\u003cform method=GET\u003e`\n- [ ] accept value-less attributes `\u003cp hidden id=p\u003e`\n- [ ] handle [void elements](https://developer.mozilla.org/en-US/docs/Glossary/Void_element) as self-closed (`\u003cimg\u003e` will internally generate `\u003cimg\u003e\u003c/img\u003e`), and also ignore the end tag of void elements (e.g. `\u003c/img\u003e`)","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fyne%2Fhtml2json","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fyne%2Fhtml2json","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fyne%2Fhtml2json/lists"}