{"id":19966668,"url":"https://github.com/truthhun/html2json","last_synced_at":"2025-03-01T17:26:15.120Z","repository":{"id":40292065,"uuid":"210108842","full_name":"TruthHun/html2json","owner":"TruthHun","description":"Go语言开发的HTML和Markdown转JSON工具，将HTML和Markdown内容转换为符合各种小程序`rich-text`组件内容渲染所需格式的`JSON`","archived":false,"fork":false,"pushed_at":"2023-05-05T02:21:46.000Z","size":8907,"stargazers_count":10,"open_issues_count":3,"forks_count":7,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-01-12T08:41:46.115Z","etag":null,"topics":["html2json","mini-program","rich-text"],"latest_commit_sha":null,"homepage":"https://www.bookstack.cn","language":"HTML","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/TruthHun.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-09-22T07:38:08.000Z","updated_at":"2025-01-06T09:58:56.000Z","dependencies_parsed_at":"2024-06-19T02:48:18.842Z","dependency_job_id":"7c4d9272-12c0-41de-b659-4fe1e3a451b3","html_url":"https://github.com/TruthHun/html2json","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TruthHun%2Fhtml2json","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TruthHun%2Fhtml2json/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TruthHun%2Fhtml2json/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TruthHun%2Fhtml2json/manifests","owner_url":"https://repos.ec
osyste.ms/api/v1/hosts/GitHub/owners/TruthHun","download_url":"https://codeload.github.com/TruthHun/html2json/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":241398859,"owners_count":19956811,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["html2json","mini-program","rich-text"],"created_at":"2024-11-13T02:37:31.894Z","updated_at":"2025-03-01T17:26:15.083Z","avatar_url":"https://github.com/TruthHun.png","language":"HTML","readme":"# html2json\n\nA tool written in Go that converts HTML and Markdown to JSON, producing the `JSON` format that the `rich-text` component of the various mini-program platforms requires for rendering.\n\n## Introduction\n\nWhile developing [BookChat](https://gitee.com/truthhun/BookChat), the WeChat mini-program companion to [BookStack](https://gitee.com/truthhun/BookStack), \nand [BookChatApp](https://gitee.com/truthhun/BookChatApp), the companion mobile app built with `uni-app`,\nI tried many open-source HTML parsing and rendering components for mini-programs, but none of them was quite satisfactory.\n\nI then moved HTML parsing to the backend and rendered the result on the mini-program side with the built-in `rich-text` component. Performance, stability, and rendering quality all met expectations, although this approach lacks the image preview feature that third-party HTML rendering tools provide.\n\nIt is already used in `BookStack` v2.1.\n\n## Features\n\n1. Converts both Markdown and HTML to JSON\n1. Can run as a RESTful service that exposes an API\n1. 
Go developers can import it as a package\n\n## Usage\n\nFor more usage help, run:\n\n```\n./html2json --help\n```\n\n### Use as a RESTful service\n\n#### Start the service\n```\n./html2json serve --port 8888 --tags weixin-html-tags.json\n```\n\n- `--port` - [optional] port the service listens on; defaults to 8888\n- `--tags` - [optional] trusted HTML elements, given as a file containing a JSON array of the supported HTML tags. Defaults to the HTML tags trusted by uni-app\n\nHTML tags supported by each mini-program platform\n\n\u003e - WeChat Mini Program: https://developers.weixin.qq.com/miniprogram/dev/component/rich-text.html\n\u003e - Alipay Mini Program: https://docs.alipay.com/mini/component/rich-text\n\u003e - Baidu Smart Program: https://smartprogram.baidu.com/docs/develop/component/base/#rich-text-%E5%AF%8C%E6%96%87%E6%9C%AC/\n\u003e - Toutiao Mini Program: https://developer.toutiao.com/dev/miniapp/uEDMy4SMwIjLxAjM\n\u003e - QQ Mini Program: https://q.qq.com/wiki/develop/miniprogram/component/basic-content/rich-text.html\n\u003e - uni-app: https://uniapp.dcloud.io/component/rich-text?id=rich-text\n\nExample `weixin-html-tags.json` file:\n```\n[\"a\", \"abbr\", \"address\", \"article\", \"aside\", \"b\", \"bdi\", \"bdo\", \"big\", \"blockquote\", \"br\", \"caption\", \"center\", \"cite\", \"code\", \"col\", \"colgroup\", \"dd\", \"del\", \"div\", \"dl\", \"dt\", \"em\", \"fieldset\", \"font\", \"footer\", \"h1\", \"h2\", \"h3\", \"h4\", \"h5\", \"h6\", \"header\", \"hr\", \"i\", \"img\", \"ins\", \"label\", \"legend\", \"li\", \"mark\", \"nav\", \"ol\", \"p\", \"pre\", \"q\", \"rt\", \"ruby\", \"s\", \"section\", \"small\", \"span\", \"strong\", \"sub\", \"sup\", \"table\", \"tbody\", \"td\", \"tfoot\", \"th\", \"thead\", \"tr\", \"tt\", \"u\", \"ul\"]\n```\n\n#### API endpoints\n\n##### Parse HTML from a URL\n\n**Request method**\n\nGET\n\n**Endpoint**\n```\n/html2json\n```\n\n**Request parameters**\n\n- `url` - [required] URL of the content to parse.\n- `timeout` - timeout in seconds; defaults to 10\n- `domain` - domain for images and other static resources, used to build complete links for them. Must include `http` or `https`, e.g. `https://static.bookstack.cn`\n\n\u003e Note: only the content of the HTML Body is parsed\n\n**Example**\n\n\u003e http://localhost:8888/html2json?timeout=5\u0026url=https://gitee.com/truthhun/BookStack\n\n\n##### 
Parse HTML submitted as form data\n\n**Request method**\n\nPOST\n\n**Endpoint**\n```\n/html2json\n```\n\n**Request parameters**\n\n- `html` - HTML content string\n- `domain` - domain for images and other static resources, used to build complete links for them. Must include `http` or `https`, e.g. `https://static.bookstack.cn`\n\n\n##### Parse Markdown submitted as form data\n\n**Request method**\n\nPOST\n\n**Endpoint**\n```\n/md2json\n```\n\n**Request parameters**\n\n- `markdown` - [required] Markdown content string\n- `domain` - domain for images and other static resources, used to build complete links for them. Must include `http` or `https`, e.g. `https://static.bookstack.cn`\n\n\n### Use as a package (for Go)\n\n\n#### Installation\n```\ngo get -v github.com/TruthHun/html2json\n```\n\n#### Usage example\n\n```\npackage main\n\nimport (\n\t\"encoding/json\"\n\t\"fmt\"\n\t\"time\"\n\n\t\"github.com/TruthHun/html2json/html2json\"\n)\n\nfunc main() {\n\t// rt := html2json.NewDefault()\n\t// Build a parser that trusts the tags supported by uni-app.\n\tappTags := html2json.GetTags(html2json.TagUniAPP)\n\trt := html2json.New(appTags)\n\thtmlStr := `\n\u003cdiv\u003e\n\thello world!\n\t\u003cspan\u003ethis is a span\u003c/span\u003e\n\ta b c d e\n\t\u003cimg src=\"https://www.bookstack.cn/static/images/logo.png\"/\u003e\n\t\u003caudio src=\"helloworld.mp3\"\u003e\u003c/audio\u003e\n\t\u003cvideo src=\"../bookstack.mp4\"\u003e\u003c/video\u003e\n\u003ca href=\"https://www.bookstack.cn\"\u003e书栈网 - 分享知识，共享智慧\u003c/a\u003e\n\u003c/div\u003e\n\u003ciframe src=\"https://www.baidu.com\" frameborder=\"0\"\u003e\u003c/iframe\u003e\n\u003cpre\u003e\n\tthis is pre code\n\u003c/pre\u003e\n`\n\tnow := time.Now()\n\t// The second argument is the base URL used to resolve relative resource links.\n\tnodes, err := rt.Parse(htmlStr, \"https://www.bookstack.cn/static/\")\n\tif err != nil {\n\t\tpanic(err)\n\t}\n\tfmt.Println(\"spend time\", time.Since(now))\n\tfmt.Println(toJSON(nodes))\n}\n\n// toJSON marshals v to a JSON string, ignoring any marshal error for brevity.\nfunc toJSON(v interface{}) string {\n\tb, _ := json.Marshal(v)\n\treturn string(b)\n}\n```\n\n**Output of the example code**\n```\n[{\n\t\"name\": \"div\",\n\t\"attrs\": {\n\t\t\"class\": \"tag-div\"\n\t},\n\t\"children\": [{\n\t\t\"type\": \"text\",\n\t\t\"text\": \"\\n\\thello world!\\n\\t\"\n\t}, {\n\t\t\"name\": \"span\",\n\t\t\"attrs\": {\n\t\t\t\"class\": \"tag-span\"\n\t\t},\n\t\t\"children\": [{\n\t\t\t\"type\": \"text\",\n\t\t\t\"text\": \"this is a 
span\"\n\t\t}]\n\t}, {\n\t\t\"type\": \"text\",\n\t\t\"text\": \"\\n\\ta b c d e\\n\\t\"\n\t}, {\n\t\t\"name\": \"img\",\n\t\t\"attrs\": {\n\t\t\t\"class\": \"tag-img\",\n\t\t\t\"src\": \"https://www.bookstack.cn/static/images/logo.png\"\n\t\t}\n\t}, {\n\t\t\"type\": \"text\",\n\t\t\"text\": \"\\n\\t\"\n\t}, {\n\t\t\"name\": \"a\",\n\t\t\"attrs\": {\n\t\t\t\"class\": \"tag-audio\",\n\t\t\t\"href\": \"https://www.bookstack.cn/static/helloworld.mp3\"\n\t\t},\n\t\t\"children\": [{\n\t\t\t\"type\": \"text\",\n\t\t\t\"text\": \" [audio] https://www.bookstack.cn/static/helloworld.mp3 \"\n\t\t}]\n\t}, {\n\t\t\"type\": \"text\",\n\t\t\"text\": \"\\n\\t\"\n\t}, {\n\t\t\"name\": \"a\",\n\t\t\"attrs\": {\n\t\t\t\"class\": \"tag-video\",\n\t\t\t\"href\": \"https://www.bookstack.cn/bookstack.mp4\"\n\t\t},\n\t\t\"children\": [{\n\t\t\t\"type\": \"text\",\n\t\t\t\"text\": \" [video] https://www.bookstack.cn/bookstack.mp4 \"\n\t\t}]\n\t}, {\n\t\t\"type\": \"text\",\n\t\t\"text\": \"\\n\"\n\t}, {\n\t\t\"name\": \"a\",\n\t\t\"attrs\": {\n\t\t\t\"class\": \"tag-a\",\n\t\t\t\"href\": \"https://www.bookstack.cn\"\n\t\t},\n\t\t\"children\": [{\n\t\t\t\"type\": \"text\",\n\t\t\t\"text\": \"书栈网 - 分享知识，共享智慧\"\n\t\t}]\n\t}, {\n\t\t\"type\": \"text\",\n\t\t\"text\": \"\\n\"\n\t}]\n}, {\n\t\"type\": \"text\",\n\t\"text\": \"\\n\"\n}, {\n\t\"name\": \"a\",\n\t\"attrs\": {\n\t\t\"class\": \"tag-iframe\",\n\t\t\"frameborder\": \"0\",\n\t\t\"href\": \"https://www.baidu.com\"\n\t},\n\t\"children\": [{\n\t\t\"type\": \"text\",\n\t\t\"text\": \" [iframe] https://www.baidu.com \"\n\t}]\n}, {\n\t\"type\": \"text\",\n\t\"text\": \"\\n\"\n}, {\n\t\"name\": \"div\",\n\t\"attrs\": {\n\t\t\"class\": \"tag-pre\",\n\t\t\"style\": \"display: block;font-family: monospace;white-space: pre;margin: 1em 0;\"\n\t},\n\t\"children\": [{\n\t\t\"type\": \"text\",\n\t\t\"text\": \"\\tthis is pre code\\n\"\n\t}]\n}, {\n\t\"type\": \"text\",\n\t\"text\": \"\\n\"\n}]\n```\n\n## Notes\n\nEvery tag is given a 
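`class` of the form `\"tag-\" + tag name`, which makes it easy to style each tag.\n\nFor example, an `a` tag gets the class `tag-a`, a `div` tag gets `tag-div`, a `code` tag gets `tag-code`, and so on.\n\n**Special notes**\n\nBecause the `rich-text` component of some mini-program platforms does not support the `pre` tag, `pre` is converted to a `div` with the class `tag-pre`, and the default CSS of the `pre` tag itself is added as a style:\n\n```\ndisplay: block;\nfont-family: monospace;\nwhite-space: pre;\nmargin: 1em 0;\n```\n\nLikewise, `video`, `iframe`, and `audio` tags that are not in the trusted tag list are handled as `a` tags.\n\n\n## Trying it out\n\nThe compiled program is a single executable, so deployment and use are both simple.\n\nFor a production deployment, run it as a daemon.\n\n\u003e Note: if the service is not meant to serve API requests to the outside, it is recommended not to expose the port to the public network","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftruthhun%2Fhtml2json","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftruthhun%2Fhtml2json","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftruthhun%2Fhtml2json/lists"}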