{"id":27360520,"url":"https://github.com/fhightower/html-to-json","last_synced_at":"2025-04-13T01:08:55.588Z","repository":{"id":57437659,"uuid":"332068189","full_name":"fhightower/html-to-json","owner":"fhightower","description":"Convert HTML to JSON. Can also (intelligently) convert HTML tables to JSON (using table headers (if available) as keys in the resulting JSON).","archived":false,"fork":false,"pushed_at":"2023-06-06T12:33:46.000Z","size":587,"stargazers_count":50,"open_issues_count":14,"forks_count":8,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-04-13T01:08:48.432Z","etag":null,"topics":["hacktoberfest","html","html-converter","html-tables","html-tables-to-json","html-to-json","html2json","json"],"latest_commit_sha":null,"homepage":"","language":"HTML","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/fhightower.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-01-22T21:37:53.000Z","updated_at":"2024-11-24T23:49:44.000Z","dependencies_parsed_at":"2024-06-18T21:52:49.058Z","dependency_job_id":null,"html_url":"https://github.com/fhightower/html-to-json","commit_stats":{"total_commits":64,"total_committers":1,"mean_commits":64.0,"dds":0.0,"last_synced_commit":"305c26d1499c523e8a29176d16acd57618b39bd3"},"previous_names":[],"tags_count":12,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fhightower%2Fhtml-to-json","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fhightower%2Fhtml-to-json/tags","release
s_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fhightower%2Fhtml-to-json/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fhightower%2Fhtml-to-json/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/fhightower","download_url":"https://codeload.github.com/fhightower/html-to-json/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248650760,"owners_count":21139681,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["hacktoberfest","html","html-converter","html-tables","html-tables-to-json","html-to-json","html2json","json"],"created_at":"2025-04-13T01:08:54.505Z","updated_at":"2025-04-13T01:08:55.572Z","avatar_url":"https://github.com/fhightower.png","language":"HTML","funding_links":["https://github.com/sponsors/fhightower"],"categories":[],"sub_categories":[],"readme":"# HTML to JSON\n\n[![PyPI](https://img.shields.io/pypi/v/html-to-json.svg)](https://pypi.python.org/pypi/html-to-json)\n[![codecov](https://codecov.io/gh/fhightower/html-to-json/branch/main/graph/badge.svg?token=V0WOIXRGMM)](https://codecov.io/gh/fhightower/html-to-json)\n\nConvert HTML and/or HTML tables to JSON.\n\n## Current Status\n\n📢 I have a lot of demands on my time at the moment and won't be able to work on this library without [sponsorship](https://github.com/sponsors/fhightower). If this library is useful to you or if you're using this library for a business - please consider [sponsoring](https://github.com/sponsors/fhightower) me. 
Even a small sponsorship allows me to prioritize work on this library and ongoing maintenance. Thanks!\n\n## Installation\n\n```\npip install html-to-json\n```\n\n## Usage\n\n### HTML to JSON\n\n```python\nimport html_to_json\n\nhtml_string = \"\"\"\u003chead\u003e\n    \u003ctitle\u003eTest site\u003c/title\u003e\n    \u003cmeta charset=\"UTF-8\"\u003e\u003c/head\u003e\"\"\"\noutput_json = html_to_json.convert(html_string)\nprint(output_json)\n```\n\nWhen calling the `html_to_json.convert` function, you can choose to not capture the text values from the HTML by passing in the keyword argument `capture_element_values=False`. You can also choose to not capture the attributes of the elements by passing `capture_element_attributes=False` into the function.\n\n#### Example\n\nExample input:\n\n```html\n\u003chead\u003e\n    \u003ctitle\u003eFloyd Hightower's Projects\u003c/title\u003e\n    \u003cmeta charset=\"UTF-8\"\u003e\n    \u003cmeta name=\"description\" content=\"Floyd Hightower\u0026#39;s Projects\"\u003e\n    \u003cmeta name=\"keywords\" content=\"projects,fhightower,Floyd,Hightower\"\u003e\n\u003c/head\u003e\n```\n\nExample output:\n\n```json\n{\n    \"head\": [\n    {\n        \"title\": [\n        {\n            \"_value\": \"Floyd Hightower's Projects\"\n        }],\n        \"meta\": [\n        {\n            \"_attributes\":\n            {\n                \"charset\": \"UTF-8\"\n            }\n        },\n        {\n            \"_attributes\":\n            {\n                \"name\": \"description\",\n                \"content\": \"Floyd Hightower's Projects\"\n            }\n        },\n        {\n            \"_attributes\":\n            {\n                \"name\": \"keywords\",\n                \"content\": \"projects,fhightower,Floyd,Hightower\"\n            }\n        }]\n    }]\n}\n```\n\n### HTML Tables to JSON\n\nIn addition to converting HTML to JSON, this library can also intelligently convert HTML tables to JSON.\n\nCurrently, this 
library can handle three types of tables:\n\nA. Those with [table headers](https://developer.mozilla.org/en-US/docs/Web/HTML/Element/th) in the first row\nB. Those with table headers in the first column\nC. Those without table headers\n\nTables of type A and B are diagrammed below:\n\n![This package can handle tables with the headers in the first row or headers in the first column](./html_table_varieties.jpg)\n\n#### Example\n\nThis code:\n\n```python\nimport html_to_json\n\nhtml_string = \"\"\"\u003ctable\u003e\n    \u003ctr\u003e\n        \u003cth\u003e#\u003c/th\u003e\n        \u003cth\u003eMalware\u003c/th\u003e\n        \u003cth\u003eMD5\u003c/th\u003e\n        \u003cth\u003eDate Added\u003c/th\u003e\n    \u003c/tr\u003e\n\n    \u003ctr\u003e\n        \u003ctd\u003e25548\u003c/td\u003e\n        \u003ctd\u003e\u003ca href=\"/stats/DarkComet/\"\u003eDarkComet\u003c/a\u003e\u003c/td\u003e\n        \u003ctd\u003e\u003ca href=\"/config/034a37b2a2307f876adc9538986d7b86\"\u003e034a37b2a2307f876adc9538986d7b86\u003c/a\u003e\u003c/td\u003e\n        \u003ctd\u003eJuly 9, 2018, 6:25 a.m.\u003c/td\u003e\n    \u003c/tr\u003e\n    \n    \u003ctr\u003e\n        \u003ctd\u003e25547\u003c/td\u003e\n        \u003ctd\u003e\u003ca href=\"/stats/DarkComet/\"\u003eDarkComet\u003c/a\u003e\u003c/td\u003e\n        \u003ctd\u003e\u003ca href=\"/config/706eeefbac3de4d58b27d964173999c3\"\u003e706eeefbac3de4d58b27d964173999c3\u003c/a\u003e\u003c/td\u003e\n        \u003ctd\u003eJuly 7, 2018, 6:25 a.m.\u003c/td\u003e\n    \u003c/tr\u003e\u003c/table\u003e\"\"\"\ntables = html_to_json.convert_tables(html_string)\nprint(tables)\n```\n\nwill produce this output:\n\n```json\n[\n    [\n        {\n            \"#\": \"25548\",\n            \"Malware\": \"DarkComet\",\n            \"MD5\": \"034a37b2a2307f876adc9538986d7b86\",\n            \"Date Added\": \"July 9, 2018, 6:25 a.m.\"\n        }, {\n            \"#\": \"25547\",\n            \"Malware\": \"DarkComet\",\n            \"MD5\": 
\"706eeefbac3de4d58b27d964173999c3\",\n            \"Date Added\": \"July 7, 2018, 6:25 a.m.\"\n        }\n    ]\n]\n```\n\n## Credits\n\nThis package was created with [Cookiecutter](https://github.com/audreyr/cookiecutter) and fhightower's [Python project template](https://github.com/fhightower-templates/python-project-template).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffhightower%2Fhtml-to-json","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffhightower%2Fhtml-to-json","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffhightower%2Fhtml-to-json/lists"}