{"id":13483361,"url":"https://github.com/kostya/myhtml","last_synced_at":"2025-08-21T09:32:57.561Z","repository":{"id":5913053,"uuid":"54235772","full_name":"kostya/myhtml","owner":"kostya","description":"Fast HTML5 Parser with css selectors for Crystal language","archived":false,"fork":false,"pushed_at":"2025-04-15T22:56:04.000Z","size":448,"stargazers_count":155,"open_issues_count":0,"forks_count":12,"subscribers_count":6,"default_branch":"master","last_synced_at":"2025-05-24T20:05:03.871Z","etag":null,"topics":["crystal","fast","html","myhtml","parser"],"latest_commit_sha":null,"homepage":"","language":"Crystal","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/kostya.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2016-03-18T22:46:08.000Z","updated_at":"2025-04-15T22:55:52.000Z","dependencies_parsed_at":"2025-04-15T23:40:23.342Z","dependency_job_id":null,"html_url":"https://github.com/kostya/myhtml","commit_stats":null,"previous_names":[],"tags_count":47,"template":false,"template_full_name":null,"purl":"pkg:github/kostya/myhtml","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kostya%2Fmyhtml","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kostya%2Fmyhtml/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kostya%2Fmyhtml/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kostya%2Fmyhtml/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/kostya","download_url":"http
s://codeload.github.com/kostya/myhtml/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kostya%2Fmyhtml/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":271455670,"owners_count":24762771,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-21T02:00:08.990Z","response_time":74,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["crystal","fast","html","myhtml","parser"],"created_at":"2024-07-31T17:01:10.458Z","updated_at":"2025-08-21T09:32:57.285Z","avatar_url":"https://github.com/kostya.png","language":"Crystal","funding_links":[],"categories":["Crystal","HTML/XML Parsing"],"sub_categories":[],"readme":"# MyHTML\n\n[![Build Status](https://github.com/kostya/myhtml/actions/workflows/ci.yml/badge.svg)](https://github.com/kostya/myhtml/actions/workflows/ci.yml?query=branch%3Amaster+event%3Apush)\n\nFast HTML5 Parser (Crystal binding for awesome lexborisov's [myhtml](https://github.com/lexborisov/myhtml) and [Modest](https://github.com/lexborisov/Modest)). 
This shard is used in production to parse millions of pages per day and is very stable and fast.\n\n## WARNING: the original libraries (myhtml and Modest) have not been maintained since July 2020; I recommend switching to the successor parser: [Lexbor](https://github.com/kostya/lexbor).\n\n## Installation\n\nAdd this to your application's `shard.yml`:\n\n```yaml\ndependencies:\n  myhtml:\n    github: kostya/myhtml\n```\n\nThen run `shards install`.\n\n## Usage example\n\n```crystal\nrequire \"myhtml\"\n\nhtml = \u003c\u003c-HTML\n  \u003chtml\u003e\n    \u003cbody\u003e\n      \u003cdiv id=\"t1\" class=\"red\"\u003e\n        \u003ca href=\"/#\"\u003eO_o\u003c/a\u003e\n      \u003c/div\u003e\n      \u003cdiv id=\"t2\"\u003e\u003c/div\u003e\n    \u003c/body\u003e\n  \u003c/html\u003e\nHTML\n\nmyhtml = Myhtml::Parser.new(html)\n\nmyhtml.nodes(:div).each do |node|\n  id = node.attribute_by(\"id\")\n\n  if first_link = node.scope.nodes(:a).first?\n    href = first_link.attribute_by(\"href\")\n    link_text = first_link.inner_text\n\n    puts \"div with id #{id} has link [#{link_text}](#{href})\"\n  else\n    puts \"div with id #{id} has no links\"\n  end\nend\n\n# Output:\n#   div with id t1 has link [O_o](/#)\n#   div with id t2 has no links\n```\n\n## CSS selectors example\n\n```crystal\nrequire \"myhtml\"\n\nhtml = \u003c\u003c-HTML\n  \u003chtml\u003e\n    \u003cbody\u003e\n      \u003ctable id=\"t1\"\u003e\n        \u003ctr\u003e\u003ctd\u003eHello\u003c/td\u003e\u003c/tr\u003e\n      \u003c/table\u003e\n      \u003ctable id=\"t2\"\u003e\n        \u003ctr\u003e\u003ctd\u003e123\u003c/td\u003e\u003ctd\u003eother\u003c/td\u003e\u003c/tr\u003e\n        \u003ctr\u003e\u003ctd\u003efoo\u003c/td\u003e\u003ctd\u003ecolumns\u003c/td\u003e\u003c/tr\u003e\n        \u003ctr\u003e\u003ctd\u003ebar\u003c/td\u003e\u003ctd\u003eare\u003c/td\u003e\u003c/tr\u003e\n        \u003ctr\u003e\u003ctd\u003exyz\u003c/td\u003e\u003ctd\u003eignored\u003c/td\u003e\u003c/tr\u003e\n      \u003c/table\u003e\n    \u003c/body\u003e\n  \u003c/html\u003e\nHTML\n\nmyhtml = Myhtml::Parser.new(html)\n\np myhtml.css(\"#t2 tr td:first-child\").map(\u0026.inner_text).to_a\n# =\u003e [\"123\", \"foo\", \"bar\", \"xyz\"]\n\np myhtml.css(\"#t2 tr td:first-child\").map(\u0026.to_html).to_a\n# =\u003e [\"\u003ctd\u003e123\u003c/td\u003e\", \"\u003ctd\u003efoo\u003c/td\u003e\", \"\u003ctd\u003ebar\u003c/td\u003e\", \"\u003ctd\u003exyz\u003c/td\u003e\"]\n```\n\n## More Examples\n\n[examples](https://github.com/kostya/myhtml/tree/master/examples)\n\n## Development setup\n\n```shell\ngit clone https://github.com/kostya/myhtml.git\ncd myhtml\nmake\ncrystal spec\n```\n\n## Benchmark\n\nParse a Google page (600 KB) 1000 times and run a CSS selection 1000 times. [myhtml-program](https://github.com/kostya/myhtml/tree/master/bench/test-myhtml.cr), [crystagiri-program](https://github.com/kostya/myhtml/tree/master/bench/test-libxml.cr), [nokogiri-program](https://github.com/kostya/myhtml/tree/master/bench/test-libxml.rb)\n\n| Lang     | Shard      | Lib             | Parse time, s | CSS time, s | Memory, MiB |\n| -------- | ---------- | --------------- | ------------- | ----------- | ----------- |\n| Crystal  | lexbor     | lexbor          | 2.54          | 0.099       | 7.8         |\n| Crystal  | myhtml     | myhtml(+modest) | 3.17          | 0.16        | 8.4         |\n| Ruby 2.7 | Nokogiri   | libxml2         | 9.19          | 10.76       | 139.8       |\n| Crystal  | Crystagiri | libxml2         | 11.27         | -           | 25.0        |\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkostya%2Fmyhtml","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkostya%2Fmyhtml","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkostya%2Fmyhtml/lists"}