{"id":37072385,"url":"https://github.com/jhyao/tp_html","last_synced_at":"2026-01-14T08:29:52.600Z","repository":{"id":57476760,"uuid":"137167622","full_name":"jhyao/tp_html","owner":"jhyao","description":null,"archived":false,"fork":false,"pushed_at":"2018-12-30T09:03:29.000Z","size":36,"stargazers_count":3,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-10-28T17:48:10.929Z","etag":null,"topics":["html","parser","python","template"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/jhyao.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2018-06-13T05:42:13.000Z","updated_at":"2021-03-28T12:27:58.000Z","dependencies_parsed_at":"2022-09-14T16:23:13.864Z","dependency_job_id":null,"html_url":"https://github.com/jhyao/tp_html","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/jhyao/tp_html","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jhyao%2Ftp_html","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jhyao%2Ftp_html/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jhyao%2Ftp_html/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jhyao%2Ftp_html/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/jhyao","download_url":"https://codeload.github.com/jhyao/tp_html/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jhyao%2Ftp_html/sbom","scorecard":{"id":519363,"
data":{"date":"2025-08-11","repo":{"name":"github.com/jhyao/tp_html","commit":"35d388583a61ceac267f446f9f69920577e66dc5"},"scorecard":{"version":"v5.2.1-40-gf6ed084d","commit":"f6ed084d17c9236477efd66e5b258b9d4cc7b389"},"score":3,"checks":[{"name":"SAST","score":0,"reason":"no SAST tool detected","details":["Warn: no pull requests merged into dev branch"],"documentation":{"short":"Determines if the project uses static code analysis.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#sast"}},{"name":"Binary-Artifacts","score":10,"reason":"no binaries found in the repo","details":null,"documentation":{"short":"Determines if the project has generated executable (binary) artifacts in the source repository.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#binary-artifacts"}},{"name":"Dangerous-Workflow","score":-1,"reason":"no workflows found","details":null,"documentation":{"short":"Determines if the project's GitHub Action workflows avoid dangerous patterns.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#dangerous-workflow"}},{"name":"Code-Review","score":0,"reason":"Found 0/10 approved changesets -- score normalized to 0","details":null,"documentation":{"short":"Determines if the project requires human code review before pull requests (aka merge requests) are merged.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#code-review"}},{"name":"Token-Permissions","score":-1,"reason":"No tokens found","details":null,"documentation":{"short":"Determines if the project's workflows follow the principle of least privilege.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#token-permissions"}},{"name":"Pinned-Dependencies","score":-1,"reason":"no dependencies 
found","details":null,"documentation":{"short":"Determines if the project has declared and pinned the dependencies of its build process.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#pinned-dependencies"}},{"name":"Packaging","score":-1,"reason":"packaging workflow not detected","details":["Warn: no GitHub/GitLab publishing workflow detected."],"documentation":{"short":"Determines if the project is published as a package that others can easily download, install, easily update, and uninstall.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#packaging"}},{"name":"Maintained","score":0,"reason":"0 commit(s) and 0 issue activity found in the last 90 days -- score normalized to 0","details":null,"documentation":{"short":"Determines if the project is \"actively maintained\".","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#maintained"}},{"name":"CII-Best-Practices","score":0,"reason":"no effort to earn an OpenSSF best practices badge detected","details":null,"documentation":{"short":"Determines if the project has an OpenSSF (formerly CII) Best Practices Badge.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#cii-best-practices"}},{"name":"Security-Policy","score":0,"reason":"security policy file not detected","details":["Warn: no security policy file detected","Warn: no security file to analyze","Warn: no security file to analyze","Warn: no security file to analyze"],"documentation":{"short":"Determines if the project has published a security policy.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#security-policy"}},{"name":"Vulnerabilities","score":10,"reason":"0 existing vulnerabilities detected","details":null,"documentation":{"short":"Determines if the project has open, known unfixed 
vulnerabilities.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#vulnerabilities"}},{"name":"Fuzzing","score":0,"reason":"project is not fuzzed","details":["Warn: no fuzzer integrations found"],"documentation":{"short":"Determines if the project uses fuzzing.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#fuzzing"}},{"name":"License","score":10,"reason":"license file detected","details":["Info: project has a license file: LICENSE:0","Info: FSF or OSI recognized license: MIT License: LICENSE:0"],"documentation":{"short":"Determines if the project has defined a license.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#license"}},{"name":"Signed-Releases","score":-1,"reason":"no releases found","details":null,"documentation":{"short":"Determines if the project cryptographically signs release artifacts.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#signed-releases"}},{"name":"Branch-Protection","score":0,"reason":"branch protection not enabled on development/release branches","details":["Warn: branch protection not enabled for branch 'master'"],"documentation":{"short":"Determines if the default and release branches are protected with GitHub's branch protection 
settings.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#branch-protection"}}]},"last_synced_at":"2025-08-20T02:35:07.361Z","repository_id":57476760,"created_at":"2025-08-20T02:35:07.361Z","updated_at":"2025-08-20T02:35:07.361Z"},"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28414090,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-14T08:16:59.381Z","status":"ssl_error","status_checked_at":"2026-01-14T08:13:45.490Z","response_time":107,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["html","parser","python","template"],"created_at":"2026-01-14T08:29:51.974Z","updated_at":"2026-01-14T08:29:52.586Z","avatar_url":"https://github.com/jhyao.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Template Parser HTML\nThis tool can help get useful data from html web page. It parses html page with the template file which marks data that you need with special attributes. The template file positions html blocks that contain data, and describes types, names and structures of data. You can modify an example page file to get it, or write a basic html structure that can position your data. 
I suggest using the first method, because the tool can delete irrelevant parts and organize the html tree automatically.\n## install\n```\npip install tp_html\n```\n## How to use\n```python\nfrom tp_html import Template, ThtmlParser\n\n# build a template from a template file\ntemplate = Template(template_file='samples/basic_template.html')\n\n# save the minimal template\ntemplate.save('samples/basic_template.min.html')\n\n# create a parser from a template file, template text, or a Template object\nparser = ThtmlParser(template_file='samples/basic_template.html')\nparser = ThtmlParser(template_text='...')\nparser = ThtmlParser(template=template)\n\n# parse data from a page file, a URL, or page text\ndata = parser.parse(page_file='samples/basic_sample.html', encoding='utf-8')\ndata = parser.parse(page_url='http://.....')\ndata = parser.parse(page_text='.....')\n```\n## Template file\n### string\nTo get data from the content or attributes of an element.\n```html\n\u003ca href=\"....\"\u003elink\u003c/a\u003e\n```\nTo get the content. This will get data {'name': 'link'}\n```html\n\u003ca p-value=\"true\" p-name=\"name\"\u003e\u003c/a\u003e\n```\nTo get the href. 
This will get data {'name': '...'}\n```html\n\u003ca p-value=\"true\" p-name=\"name\" p-item=\"href\"\u003e\u003c/a\u003e\n```\n### list\nFor HTML\n```html\n\u003cul class=\"image-list\"\u003e\n    \u003cli class=\"image-item\"\u003e\u003ca href=\"/image/1\"\u003e\u003c/a\u003e\u003c/li\u003e\n    \u003cli class=\"image-item\"\u003e\u003ca href=\"/image/2\"\u003e\u003c/a\u003e\u003c/li\u003e\n    \u003cli class=\"image-item\"\u003e\u003ca href=\"/image/3\"\u003e\u003c/a\u003e\u003c/li\u003e\n    \u003cli class=\"image-item\"\u003e\u003ca href=\"/image/4\"\u003e\u003c/a\u003e\u003c/li\u003e\n\u003c/ul\u003e\n```\ntemplate:\n```html\n\u003cul class=\"image-list\" p-value=\"true\" p-name=\"images\" p-type=\"list\"\u003e\n    \u003cli class=\"image-item\"\u003e\n        \u003ca p-value=\"true\" p-item=\"href\"\u003e\u003c/a\u003e\n    \u003c/li\u003e\n\u003c/ul\u003e\n```\ndata:\n```json\n{\n    \"images\": [\n        \"/image/1\",\n        \"/image/2\",\n        \"/image/3\",\n        \"/image/4\"\n    ]\n}\n```\nIn a list template, the element marked with p-type=list requires exactly one child p-value node, which selects the item data. 
If list item is dict or list, structure in item is also allowed.\n### dict\nFor HTML\n```html\n\u003cdiv class=\"dict-data-container\"\u003e\n    \u003ca class=\"user-name\" href=\"/user/13456\" title=\"user xxx\"\u003exxx\u003c/a\u003e\n    \u003cp class=\"user-age\"\u003e20\u003c/p\u003e\n    \u003cdiv class=\"sub-div\"\u003e\n        \u003cp class=\"user-fans-num\"\u003e10\u003c/p\u003e\n        \u003cp class=\"user-follow-num\"\u003e20\u003c/p\u003e\n    \u003c/div\u003e\n\u003c/div\u003e\n```\ntemplate:\n```html\n\u003cdiv class=\"dict-data-container\" p-value=\"true\" p-name=\"user_link\" p-type=\"dict\"\u003e\n    \u003ca class=\"user-name\" p-value=\"true\" p-name=\"name link title\" p-item=\"string href title\"\u003e\u003c/a\u003e\n    \u003cp class=\"user-age\" p-value=\"true\" p-name=\"age\"\u003e\u003c/p\u003e\n    \u003cdiv class=\"sub-div\"\u003e\n        \u003cp class=\"user-fans-num\" p-value=\"true\" p-name=\"fans_num\"\u003e\u003c/p\u003e\n        \u003cp class=\"user-follow-num\" p-value=\"true\" p-name=\"follow_num\"\u003e\u003c/p\u003e\n    \u003c/div\u003e\n\u003c/div\u003e\n```\ndata:\n```json\n{\n    \"user_link\": {\n        \"name\": \"xxx\",\n        \"link\": \"/user/13456\",\n        \"title\": \"user xxx\",\n        \"age\": \"20\",\n        \"fans_num\": \"10\",\n        \"follow_num\": \"20\"\n    }\n}\n```\nIn dict template, p-name is required for key of dictionary. 
Multiple p-item is allowed, split with space, and \"string\" means content of element, others items are attributies name.\n## complex nesting\nhtml\n```html\n\u003c!DOCTYPE html\u003e\n\u003chtml lang=\"en\"\u003e\n\u003chead\u003e\n    \u003cmeta charset=\"UTF-8\"\u003e\n    \u003ctitle\u003eTitle\u003c/title\u003e\n\u003c/head\u003e\n\u003cbody\u003e\n\u003cdiv id=\"container\"\u003e\n    \u003cdiv class=\"column-1\"\u003e\n        \u003c!--dict in list--\u003e\n        \u003cul class=\"column-1-ul\"\u003e\n            \u003cli\u003e\n                \u003cdiv class=\"dict-data-container\"\u003e\n                    \u003ca class=\"user-name\" href=\"/user/1\" title=\"user xxx\"\u003exxx\u003c/a\u003e\n                    \u003cp class=\"user-age\"\u003e20\u003c/p\u003e\n                    \u003cdiv class=\"sub-div\"\u003e\n                        \u003cp class=\"user-fans-num\"\u003e10\u003c/p\u003e\n                        \u003cp class=\"user-follow-num\"\u003e20\u003c/p\u003e\n                    \u003c/div\u003e\n                \u003c/div\u003e\n            \u003c/li\u003e\n            \u003cli\u003e\n                \u003cdiv class=\"dict-data-container\"\u003e\n                    \u003ca class=\"user-name\" href=\"/user/2\" title=\"user yyy\"\u003eyyy\u003c/a\u003e\n                    \u003cp class=\"user-age\"\u003e10\u003c/p\u003e\n                    \u003cdiv class=\"sub-div\"\u003e\n                        \u003cp class=\"user-fans-num\"\u003e10\u003c/p\u003e\n                        \u003cp class=\"user-follow-num\"\u003e20\u003c/p\u003e\n                    \u003c/div\u003e\n                \u003c/div\u003e\n            \u003c/li\u003e\n        \u003c/ul\u003e\n    \u003c/div\u003e\n    \u003cdiv class=\"column-2\"\u003e\n        \u003c!--list in dict--\u003e\n        \u003cdiv class=\"dict-data-container\"\u003e\n            \u003ca class=\"user-name\" href=\"/user/1\" title=\"user xxx\"\u003exxx\u003c/a\u003e\n            \u003cp 
class=\"user-age\"\u003e20\u003c/p\u003e\n            \u003cdiv class=\"sub-div\"\u003e\n                \u003cp class=\"user-fans-num\"\u003e10\u003c/p\u003e\n                \u003cp class=\"user-follow-num\"\u003e20\u003c/p\u003e\n            \u003c/div\u003e\n            \u003cul class=\"image-list\"\u003e\n                \u003cli class=\"image-item\"\u003e\u003ca href=\"/image/1\"\u003e\u003c/a\u003e\u003c/li\u003e\n                \u003cli class=\"image-item\"\u003e\u003ca href=\"/image/2\"\u003e\u003c/a\u003e\u003c/li\u003e\n                \u003cli class=\"image-item\"\u003e\u003ca href=\"/image/3\"\u003e\u003c/a\u003e\u003c/li\u003e\n                \u003cli class=\"image-item\"\u003e\u003ca href=\"/image/4\"\u003e\u003c/a\u003e\u003c/li\u003e\n            \u003c/ul\u003e\n        \u003c/div\u003e\n    \u003c/div\u003e\n    \u003cdiv class=\"column-3\"\u003e\n        \u003c!--dict in dict--\u003e\n        \u003cdiv class=\"dict-data-container\"\u003e\n            \u003cdiv class=\"profile\"\u003e\n                \u003ca class=\"user-name\" href=\"/user/1\" title=\"user xxx\"\u003exxx\u003c/a\u003e\n                \u003cp class=\"user-age\"\u003e20\u003c/p\u003e\n            \u003c/div\u003e\n            \u003cdiv class=\"sub-div\"\u003e\n                \u003cp class=\"user-fans-num\"\u003e10\u003c/p\u003e\n                \u003cp class=\"user-follow-num\"\u003e20\u003c/p\u003e\n            \u003c/div\u003e\n        \u003c/div\u003e\n    \u003c/div\u003e\n    \u003cdiv class=\"column-4\"\u003e\n        \u003c!--list in list--\u003e\n        \u003cul class=\"image-list\"\u003e\n            \u003cul class=\"image-tags\"\u003e\n                \u003cp class=\"tag\"\u003etag1-1\u003c/p\u003e\n                \u003cp class=\"tag\"\u003etag1-2\u003c/p\u003e\n                \u003cp class=\"tag\"\u003etag1-3\u003c/p\u003e\n            \u003c/ul\u003e\n            \u003cul class=\"image-tags\"\u003e\n                \u003cp 
class=\"tag\"\u003etag2-1\u003c/p\u003e\n                \u003cp class=\"tag\"\u003etag2-2\u003c/p\u003e\n                \u003cp class=\"tag\"\u003etag2-3\u003c/p\u003e\n            \u003c/ul\u003e\n            \u003cul class=\"image-tags\"\u003e\n                \u003cp class=\"tag\"\u003etag3-1\u003c/p\u003e\n                \u003cp class=\"tag\"\u003etag3-2\u003c/p\u003e\n                \u003cp class=\"tag\"\u003etag3-3\u003c/p\u003e\n            \u003c/ul\u003e\n        \u003c/ul\u003e\n    \u003c/div\u003e\n\u003c/div\u003e\n\u003c/body\u003e\n\u003c/html\u003e\n```\ntemplate:\n```html\n\u003cdiv id=\"container\"\u003e\n    \u003cdiv class=\"column-1\"\u003e\n        \u003c!--dict in list--\u003e\n        \u003cul class=\"column-1-ul\" p-value=\"true\" p-name=\"user_list\" p-type=\"list\"\u003e\n            \u003cli p-value=\"true\" p-type=\"dict\"\u003e\n                \u003cdiv class=\"dict-data-container\"\u003e\n                    \u003ca class=\"user-name\" p-value=\"true\" p-name=\"name link title\" p-item=\"string href title\"\u003e\u003c/a\u003e\n                    \u003cp class=\"user-age\" p-value=\"true\" p-name=\"age\"\u003e\u003c/p\u003e\n                    \u003cdiv class=\"sub-div\"\u003e\n                        \u003cp class=\"user-fans-num\" p-value=\"true\" p-name=\"fans_num\"\u003e\u003c/p\u003e\n                        \u003cp class=\"user-follow-num\" p-value=\"true\" p-name=\"follow_num\"\u003e\u003c/p\u003e\n                    \u003c/div\u003e\n                \u003c/div\u003e\n            \u003c/li\u003e\n        \u003c/ul\u003e\n    \u003c/div\u003e\n    \u003cdiv class=\"column-2\"\u003e\n        \u003c!--list in dict--\u003e\n        \u003cdiv class=\"dict-data-container\" p-value=\"true\" p-name=\"user_all\" p-type=\"dict\"\u003e\n            \u003ca class=\"user-name\" p-value=\"true\" p-name=\"name link title\" p-item=\"string href title\"\u003e\u003c/a\u003e\n            \u003cp class=\"user-age\" p-value=\"true\" 
p-name=\"age\"\u003e\u003c/p\u003e\n            \u003cdiv class=\"sub-div\"\u003e\n                \u003cp class=\"user-fans-num\" p-value=\"true\" p-name=\"fans_num\"\u003e\u003c/p\u003e\n                \u003cp class=\"user-follow-num\" p-value=\"true\" p-name=\"follow_num\"\u003e\u003c/p\u003e\n            \u003c/div\u003e\n            \u003cul class=\"image-list\" p-value=\"true\" p-name=\"images\" p-type=\"list\"\u003e\n                \u003cli class=\"image-item\"\u003e\n                    \u003ca p-value=\"true\" p-item=\"href\"\u003e\u003c/a\u003e\n                \u003c/li\u003e\n            \u003c/ul\u003e\n        \u003c/div\u003e\n    \u003c/div\u003e\n    \u003cdiv class=\"column-3\"\u003e\n        \u003c!--dict in dict--\u003e\n        \u003cdiv class=\"dict-data-container\" p-value=\"true\" p-name=\"user_info\" p-type=\"dict\"\u003e\n            \u003cdiv class=\"profile\" p-value=\"true\" p-name=\"profile\" p-type=\"dict\"\u003e\n                \u003ca class=\"user-name\" p-value=\"true\" p-name=\"name link title\" p-item=\"string href title\"\u003e\u003c/a\u003e\n                \u003cp class=\"user-age\" p-value=\"true\" p-name=\"age\"\u003e\u003c/p\u003e\n            \u003c/div\u003e\n            \u003cdiv class=\"sub-div\" p-value=\"true\" p-name=\"counts\" p-type=\"dict\"\u003e\n                \u003cp class=\"user-fans-num\" p-value=\"true\" p-name=\"fans_num\"\u003e\u003c/p\u003e\n                \u003cp class=\"user-follow-num\" p-value=\"true\" p-name=\"follow_num\"\u003e\u003c/p\u003e\n            \u003c/div\u003e\n        \u003c/div\u003e\n    \u003c/div\u003e\n    \u003cdiv class=\"column-4\"\u003e\n        \u003c!--list in list--\u003e\n        \u003cul class=\"image-list\" p-value=\"true\" p-name=\"image_tags\" p-type=\"list\"\u003e\n            \u003cul class=\"image-tags\" p-value=\"true\" p-type=\"list\"\u003e\n                \u003cp class=\"tag\" p-value=\"true\"\u003e\u003c/p\u003e\n            \u003c/ul\u003e\n        
\u003c/ul\u003e\n    \u003c/div\u003e\n\u003c/div\u003e\n```\ndata:\n```json\n{\n    \"user_list\": [\n        {\n            \"name\": \"xxx\",\n            \"link\": \"/user/1\",\n            \"title\": \"user xxx\",\n            \"age\": \"20\",\n            \"fans_num\": \"10\",\n            \"follow_num\": \"20\"\n        },\n        {\n            \"name\": \"yyy\",\n            \"link\": \"/user/2\",\n            \"title\": \"user yyy\",\n            \"age\": \"10\",\n            \"fans_num\": \"10\",\n            \"follow_num\": \"20\"\n        }\n    ],\n    \"user_all\": {\n        \"name\": \"xxx\",\n        \"link\": \"/user/1\",\n        \"title\": \"user xxx\",\n        \"age\": \"20\",\n        \"fans_num\": \"10\",\n        \"follow_num\": \"20\",\n        \"images\": [\n            \"/image/1\",\n            \"/image/2\",\n            \"/image/3\",\n            \"/image/4\"\n        ]\n    },\n    \"user_info\": {\n        \"profile\": {\n            \"name\": \"xxx\",\n            \"link\": \"/user/1\",\n            \"title\": \"user xxx\",\n            \"age\": \"20\"\n        },\n        \"counts\": {\n            \"fans_num\": \"10\",\n            \"follow_num\": \"20\"\n        }\n    },\n    \"image_tags\": [\n        [\n            \"tag1-1\",\n            \"tag1-2\",\n            \"tag1-3\"\n        ],\n        [\n            \"tag2-1\",\n            \"tag2-2\",\n            \"tag2-3\"\n        ],\n        [\n            \"tag3-1\",\n            \"tag3-2\",\n            \"tag3-3\"\n        ]\n    ]\n}\n```\n# p-data tag in template\np-data is a empty html tag using in template to organize data structure. In the example below, four elements are in the same level. But with p-data tag, they can have more structure.  
\nHTML:\n```html\n\u003cdiv class=\"dict-data-container\"\u003e\n    \u003ca class=\"user-name\" href=\"/user/1\" title=\"user xxx\"\u003exxx\u003c/a\u003e\n    \u003cp class=\"user-age\"\u003e20\u003c/p\u003e\n    \u003cp class=\"user-fans-num\"\u003e10\u003c/p\u003e\n    \u003cp class=\"user-follow-num\"\u003e20\u003c/p\u003e\n\u003c/div\u003e\n```\ntemplate:\n```html\n\u003cdiv class=\"dict-data-container\" p-value=\"true\" p-name=\"user_info\" p-type=\"dict\"\u003e\n    \u003cp-data p-value=\"true\" p-name=\"profile\" p-type=\"dict\"\u003e\n        \u003ca class=\"user-name\" p-value=\"true\" p-name=\"name link title\" p-item=\"string href title\"\u003e\u003c/a\u003e\n        \u003cp class=\"user-age\" p-value=\"true\" p-name=\"age\"\u003e\u003c/p\u003e\n    \u003c/p-data\u003e\n    \u003cp-data p-value=\"true\" p-name=\"counts\" p-type=\"dict\"\u003e\n        \u003cp class=\"user-fans-num\" p-value=\"true\" p-name=\"fans_num\"\u003e\u003c/p\u003e\n        \u003cp class=\"user-follow-num\" p-value=\"true\" p-name=\"follow_num\"\u003e\u003c/p\u003e\n    \u003c/p-data\u003e\n\u003c/div\u003e\n```\ndata:\n```json\n{\n    \"user_info\": {\n        \"profile\": {\n            \"name\": \"xxx\",\n            \"link\": \"/user/1\",\n            \"title\": \"user xxx\",\n            \"age\": \"20\"\n        },\n        \"counts\": {\n            \"fans_num\": \"10\",\n            \"follow_num\": \"20\"\n        }\n    }\n}\n```\n# Save template\nThe tool also provides a method to save a minimal template to a file. 
Building a parser from the minimal template file is faster.\n```python\ntemplate = TemplateParser(template_file='samples/pixiv_user_template.html')\ntemplate.save('samples/pixiv_user_template.min.html')\nparser = WebPageParser(template_file='samples/pixiv_user_template.min.html')\ndata = parser.parse(page_file='samples/pixiv_user.html')\n```\nminimal template:\n```html\n\u003cp-data selector=\"html \u003e body \u003e div#wrapper \u003e div.layout-a\"\u003e\n    \u003cp-data selector=\"div.layout-column-2 \u003e div._unit \u003e div.works_area.profile \u003e div.works_info \u003e div.worksOption.profile-page \u003e div.worksListOthers \u003e div.works-illust \u003e ul._image-items.no-user\" p-value=\"true\" p-type=\"list\" p-name=\"image-list\"\u003e\n        \u003cp-data selector=\"li.image-item\" p-value=\"true\" p-type=\"dict\"\u003e\n            \u003cp-data selector=\"a.work._work\" p-value=\"true\" p-name=\"url\" p-item=\"href\"\u003e\n                \u003cp-data selector=\"div._layout-thumbnail \u003e img._thumbnail.ui-scroll-view\" p-value=\"true\" p-name=\"illust_id tags\" p-item=\"data-id data-tags\"\u003e\u003c/p-data\u003e\n            \u003c/p-data\u003e\n        \u003c/p-data\u003e\n    \u003c/p-data\u003e\n    \u003cp-data selector=\"div.layout-column-1 \u003e div.ui-layout-west \u003e div._user-profile-card \u003e div.profile \u003e a.user-name\" p-value=\"true\" p-type=\"string\" p-name=\"name\"\u003e\u003c/p-data\u003e\n\u003c/p-data\u003e\n```\n# Test\nTiming tests for the templates in the samples folder. 
The test tool is [functime](https://github.com/jhyao/functime)  \ntest code:\n```python\nimport functime\nfrom tp_html import ThtmlParser\n\nparser = ThtmlParser(template_file='samples/basic_template.html')\n# time template building first, then page parsing\nfunctime.func_time(ThtmlParser, template_file='samples/basic_template.html')\nfunctime.func_time(parser.parse, page_file='samples/basic_sample.html')\n```\nresult:\n```\nbasic_template.html\nTemplateParser AVG(1000): 1.251ms\nparse AVG(100): 3.2599ms\n\ncomplex_template.html\nTemplateParser AVG(1000): 1.6005ms\nparse AVG(100): 6.265ms\n\npixiv_user_template.html\nTemplateParser AVG(10): 24.7ms\nparse AVG(10): 34.4997ms\n\npixiv_user_template.min.html\nTemplateParser AVG(1000): 760.001183us\nparse AVG(10): 31.2003ms\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjhyao%2Ftp_html","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjhyao%2Ftp_html","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjhyao%2Ftp_html/lists"}