{"id":19990231,"url":"https://github.com/kaola-fed/octoparse","last_synced_at":"2025-04-10T11:40:55.490Z","repository":{"id":32966938,"uuid":"145924806","full_name":"kaola-fed/octoparse","owner":"kaola-fed","description":"octoparse is an HTML parsing and conversion tool. It parses HTML into objects and converts it into other text formats. It supports converting HTML for WeChat, Alipay, and Baidu mini programs.","archived":false,"fork":false,"pushed_at":"2022-12-08T19:14:20.000Z","size":5286,"stargazers_count":45,"open_issues_count":21,"forks_count":13,"subscribers_count":7,"default_branch":"master","last_synced_at":"2025-03-18T10:51:30.945Z","etag":null,"topics":["html","octoparse","parser"],"latest_commit_sha":null,"homepage":"","language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/kaola-fed.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2018-08-24T01:10:02.000Z","updated_at":"2024-09-10T14:33:28.000Z","dependencies_parsed_at":"2023-01-14T22:52:03.749Z","dependency_job_id":null,"html_url":"https://github.com/kaola-fed/octoparse","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kaola-fed%2Foctoparse","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kaola-fed%2Foctoparse/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kaola-fed%2Foctoparse/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kaola-fed%2Foctoparse/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/kaola-fed","download_url":"https://codeload.github.com/kaola-fed/octoparse/tar.gz/refs/heads/master","host":{"name":"GitHub","u
rl":"https://github.com","kind":"github","repositories_count":248210741,"owners_count":21065603,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["html","octoparse","parser"],"created_at":"2024-11-13T04:51:04.557Z","updated_at":"2025-04-10T11:40:55.460Z","avatar_url":"https://github.com/kaola-fed.png","language":"JavaScript","funding_links":[],"categories":["JavaScript"],"sub_categories":[],"readme":"\n## octoparse\n\n**octoparse** is an HTML parsing and conversion tool. It parses HTML into objects and converts it into other text formats. It supports converting HTML for WeChat, Alipay, and Baidu mini programs.\n\n## Quick start\n\n### Installation\n```bash\n    npm install octoparse\n```\n\n### Direct usage\n\n```js\n    import octoparse from 'octoparse'\n    \n    let htmlStr = \n            `\u003cdiv\u003e\n                \u003cp\u003etest\u003c/p\u003e\n                \u003cimg id=\"img1\" class=\".test .testImg2\"  src=\"https://ss3.bdstatic.com/70cFv8Sh_Q1YnxGkpoWK1HF6hhy/it/u=3492149706,1549268323\u0026fm=26\u0026gp=0.jpg\" alt=\"test\" title=\"girl\"\u003e\n             \u003c/div\u003e`;\n    let res = octoparse.htmlParse(htmlStr)\n```\n\n\n### Using as a plugin in Vue\n\n```js\n    import Vue from 'vue'\n    import octoparse from 'octoparse'\n    Vue.use(octoparse)\n    \n    let htmlStr = \n        `\u003cdiv\u003e\n            \u003cp\u003etest\u003c/p\u003e\n            \u003cimg id=\"img1\" class=\".test .testImg2\"  src=\"https://ss3.bdstatic.com/70cFv8Sh_Q1YnxGkpoWK1HF6hhy/it/u=3492149706,1549268323\u0026fm=26\u0026gp=0.jpg\" alt=\"test\" title=\"girl\"\u003e\n        \u003c/div\u003e`;\n    let res = Vue.$htmlParse(htmlStr)\n```\n### Using in a mini program\n1. Import the octoparse template into the mini program template\n\n* 
The key of data here must be nodes, so that it matches the entry template of the octoparse mini program templates\n* This example targets WeChat mini programs. Usage for Alipay mini programs is essentially the same, except that the imported template is platform/alipay/index.axml\n```html\n\u003cimport src=\"node_modules/octoparse/lib/platform/wechat/index.wxml\"/\u003e \n\u003cview class=\"octoParse\"\u003e\n    \u003ctemplate is=\"octoParse\" data=\"{{nodes:htmlData}}\"/\u003e\n\u003c/view\u003e\n```\n2. Attach the data in the page\n```js\n    import octoparse from 'octoparse'\n    \n    Page({\n        ...\n        onLoad: function(){\n            let htmlStr = \n            `\u003cdiv\u003e\n                \u003cp\u003etest\u003c/p\u003e\n                \u003cimg id=\"img1\" class=\".test .testImg2\"  src=\"https://ss3.bdstatic.com/70cFv8Sh_Q1YnxGkpoWK1HF6hhy/it/u=3492149706,1549268323\u0026fm=26\u0026gp=0.jpg\" alt=\"test\" title=\"girl\"\u003e\n             \u003c/div\u003e`;\n            let res = octoparse.htmlParse(htmlStr)\n            \n            this.setData({\n                htmlData: res   \n            })\n        }\n        ...\n    })\n\n```\n### Using in Megalo\n\n[Megalo, a Vue-based cross-platform mini program development framework](https://github.com/kaola-fed/megalo)\n\n1. Configure webpack to mount the mini program template\n```js\n    module.exports = {\n        ...\n        target: createMegaloTarget( {\n            compiler: Object.assign( compiler, {}),\n            platform: 'wechat',\n            htmlParse: {\n                templateName: 'octoParse',\n                src: resolve('node_modules/octoparse/lib/platform/wechat')\n            }\n        }),\n         ...\n    }\n\n```\n2. Register octoparse on Vue\n```js\n    import Vue from 'vue'\n    import octoparse from 'octoparse'\n    Vue.use(octoparse)\n```\n3. Use it in a page\n```html\n    \u003cdiv v-html=\"vhtml\"\u003e\n    \u003c/div\u003e\n    \n    data(){\n        return {\n           vhtml:`\u003cdiv\u003e\u003cp\u003etest\u003c/p\u003e\u003c/div\u003e`\n        }\n    }\n```\n\n### Basic usage\n\nConvert HTML into a tree structure\n\n```js\n    let htmlStr = \n            `\u003cdiv\u003e\n                \u003cp\u003etest\u003c/p\u003e\n                \u003cimg 
style=\"width:200px;height:200px;\" src=\"https://ss3.bdstatic.com/70cFv8Sh_Q1YnxGkpoWK1HF6hhy/it/u=3492149706,1549268323\u0026fm=26\u0026gp=0.jpg\" alt=\"test\" title=\"girl\"\u003e\n             \u003c/div\u003e`;\n    let res = octoparse.htmlParse(htmlStr)\n\n```\n\n### Node preprocessing\n\nUse visitors to process the parsed HTML nodes.\n\nFor example, add the display:block style to every img to remove the gaps between consecutive images:\n\n```js\n    let options = {\n      visitors: {\n        img(node){\n          node.styleStr = 'display:block';\n        }\n      }\n    }\n    let htmlStr = \n            `\u003cimg src=\"//pop.nosdn.127.net/e2170dcf-efd0-4906-9da9-3a9900e52b39\"\u003e\n            \u003cimg src=\"//pop.nosdn.127.net/929408c3-7a72-44d2-9b11-8d5c6ea98dbb\"\u003e`;\n    let res = octoparse.htmlParse(htmlStr, options)\n```\n#### Node properties\n\n| Property | Meaning | Notes |\n| ------ | ------ | ------ |\n| node | Node type |  |\n| tag | Tag name of the node |  |\n| index | Sequence number of the node in the node tree | |\n| attr | Attribute key-value pairs |  |\n| classStr | class string | Used in the template to apply the class |\n| styleStr | style string | Used in the template to apply the style |\n| nodes | Array of child nodes |  |\n\n\nReference: [Visitor pattern](https://zh.wikipedia.org/wiki/%E8%AE%BF%E9%97%AE%E8%80%85%E6%A8%A1%E5%BC%8F)\n\n\n### Using plugins\n\nPlugins are supported, for example:\n\n```js\n    import removeBackground from '../../lib/plugins/removeBackground';\n    let options = {\n        plugins: [removeBackground],\n    }\n    let htmlStr = \n            `\u003cdiv\u003e\n                \u003cp\u003etest\u003c/p\u003e\n                \u003cimg style=\"width:200px;height:200px;\" src=\"https://ss3.bdstatic.com/70cFv8Sh_Q1YnxGkpoWK1HF6hhy/it/u=3492149706,1549268323\u0026fm=26\u0026gp=0.jpg\" alt=\"test\" title=\"girl\"\u003e\n             \u003c/div\u003e`;\n    let res = octoparse.htmlParse(htmlStr, options)\n```\n\n### Local development\n\n* git clone https://github.com/kaola-fed/octoparse.git\n* cd octoparse\n* npm i\n* npm install gulp -g  (installs gulp globally)\n* npm run build\n* gulp (mini programs cannot reference files outside their root directory, so this gulp task copies the templates into the mini program demo directory)\n\n\n\n## Inspiration\n\nThe name comes from the game `Octopath Traveler`. The project was inspired by `wxParse`.\n\n\u003cp 
align=\"center\"\u003e\u003cimg src=\"https://gss3.bdstatic.com/-Po3dSag_xI4khGkpoWK1HF6hhy/baike/c0%3Dbaike80%2C5%2C5%2C80%2C26/sign=4cadfc03b88f8c54f7decd7d5b404690/b219ebc4b74543a961dac02112178a82b801141d.jpg\"\u003e\u003c/p\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkaola-fed%2Foctoparse","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkaola-fed%2Foctoparse","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkaola-fed%2Foctoparse/lists"}