https://github.com/yne/html2json
Tiny HTML (XML) to JSON (JsonML) converter
Last synced: 6 days ago
- Host: GitHub
- URL: https://github.com/yne/html2json
- Owner: yne
- Created: 2024-05-08T22:54:54.000Z (6 months ago)
- Default Branch: master
- Last Pushed: 2024-06-02T07:36:48.000Z (5 months ago)
- Last Synced: 2024-06-02T08:43:31.886Z (5 months ago)
- Topics: html, html2json, json, jsonml, xhtml, xml, xml2json
- Language: C
- Homepage:
- Size: 25.4 KB
- Stars: 0
- Watchers: 1
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
Convert any XML/HTML to JsonML using [yxml](https://dev.yorhel.nl/yxml)
## BUILD
```sh
make html2json
```

## USAGE
```sh
# quote the jq filter so the shell does not try to glob-expand the brackets
cat test/basic.html | ./html2json | jq '.[1].lang'
# => "en"
# send json to a frontend (example: GTK)
curl https://news.ycombinator.com/rss | ./html2json | ./json2gtk
```

## FORMAT
```html
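<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8"/>
<title>Basic Example</title>
<link rel="stylesheet"/>
</head>
<body id="home">
<input type="text"/>
<p>content</p>
</body>
</html>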
```
```jsonc
// doctype is omitted
["html",{"lang":"en"},[
["head", {}, [
["meta", {"charset": "utf-8"} ],
["title", {}, ["Basic Example"] ],
["link", {"rel": "stylesheet"} ]
]],
["body", {"id": "home"}, [
["input", {"type": "text"}],
["p", {}, ["content"]]
]]
]]
```

# HTML5 support (WIP)
XHTML and HTML5 support is being added to yxml through the following changes:
- [x] migrate `yxml_ret_t` to a bitfield enum so that multiple states can be returned at once (example: parsing `>` in `` will return `ATTREND|ELEMSTART`)
- [x] accept lowercase ``, `` content as raw data until the matching closing tag is found
- [ ] accept unquoted attribute value `<form method=GET>`
- [ ] accept value-less attribute `<p hidden id=p>`
- [ ] handle [void elements](https://developer.mozilla.org/en-US/docs/Glossary/Void_element) as self-closed (`<img>` will internally generate `<img></img>`), and also ignore the end tag of void elements (ex: `</img>`)
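
Two of the items above can be sketched in C: a bitfield return type that reports several parser events at once, and a lookup for void elements. This is only an illustration and not the code used in this repository. The `SK_*` names, their bit values, and both helper functions are hypothetical; the actual values in the modified `yxml_ret_t` are not shown in this README. The void element list follows the HTML standard.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical event flags: one bit per event, so a single return value
 * can carry several events (e.g. SK_ATTREND | SK_ELEMSTART). */
typedef enum {
    SK_OK        = 0,
    SK_ELEMSTART = 1 << 0,
    SK_CONTENT   = 1 << 1,
    SK_ELEMEND   = 1 << 2,
    SK_ATTRSTART = 1 << 3,
    SK_ATTRVAL   = 1 << 4,
    SK_ATTREND   = 1 << 5
} sk_ret_t;

/* Count how many distinct events are packed into one return value. */
int sk_event_count(unsigned ret)
{
    int n = 0;
    for (; ret != 0; ret &= ret - 1) /* clears the lowest set bit */
        n++;
    return n;
}

/* Void elements as listed in the HTML standard. */
static const char *const sk_void_elements[] = {
    "area", "base", "br", "col", "embed", "hr", "img",
    "input", "link", "meta", "source", "track", "wbr"
};

/* Return 1 if `name` (already lowercase) is a void element, else 0. */
int sk_is_void_element(const char *name)
{
    size_t count = sizeof sk_void_elements / sizeof sk_void_elements[0];
    assert(name != NULL);
    for (size_t i = 0; i < count; i++)
        if (strcmp(name, sk_void_elements[i]) == 0)
            return 1;
    return 0;
}
```

A caller would test each bit separately, for example `if (ret & SK_ATTREND) { ... }` followed by `if (ret & SK_ELEMSTART) { ... }`, instead of comparing the return value with `==`. A parser could call `sk_is_void_element()` when an element opens to decide whether to emit the matching end event immediately and to skip a later explicit end tag.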