{"id":22413679,"url":"https://github.com/b4rtaz/html2llm","last_synced_at":"2025-08-22T20:14:56.429Z","repository":{"id":265705459,"uuid":"896515264","full_name":"b4rtaz/html2llm","owner":"b4rtaz","description":"An experimental project to convert HTML websites into a format compatible with large language models (LLMs), enabling seamless website navigation and content reading.","archived":false,"fork":false,"pushed_at":"2024-12-01T12:52:34.000Z","size":10616,"stargazers_count":18,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-01-27T04:09:34.162Z","etag":null,"topics":["automation","browser-automation","llm","vision","yolov8"],"latest_commit_sha":null,"homepage":"https://b4rtaz.github.io/html2llm/app-website.html","language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/b4rtaz.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-11-30T15:13:28.000Z","updated_at":"2025-01-07T14:07:42.000Z","dependencies_parsed_at":"2024-11-30T16:28:39.907Z","dependency_job_id":"0af2cd31-a16f-4120-bf0c-976acf7f7187","html_url":"https://github.com/b4rtaz/html2llm","commit_stats":null,"previous_names":["b4rtaz/html2llm"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/b4rtaz%2Fhtml2llm","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/b4rtaz%2Fhtml2llm/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/b4rtaz%2Fhtml2llm/releases","manifests_url
":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/b4rtaz%2Fhtml2llm/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/b4rtaz","download_url":"https://codeload.github.com/b4rtaz/html2llm/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":236640702,"owners_count":19181746,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["automation","browser-automation","llm","vision","yolov8"],"created_at":"2024-12-05T14:14:12.660Z","updated_at":"2025-02-01T08:48:59.310Z","avatar_url":"https://github.com/b4rtaz.png","language":"TypeScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"![html2llm](.github/cover.png)\n\n# html2llm\n\nThis project is an experiment aimed at converting an HTML website into a format understandable by large language models (LLMs). The output can be used for various purposes, such as website navigation or content reading. The project incorporates elements of Microsoft's [OmniParser](https://github.com/microsoft/OmniParser) release and operates in the browser using WebAssembly. 
Surprisingly, it performs quite efficiently, with inference taking less than 300 ms on my Mac M1.\n\nDemos:\n\n* [⭕ OmniParser WebAssembly](https://b4rtaz.github.io/html2llm/omni-parser-webassembly.html) - \u003ci\u003ea demo of YOLOv8 icon detection using WebAssembly\u003c/i\u003e\n* [📺 App Website](https://b4rtaz.github.io/html2llm/app-website.html) - \u003ci\u003ea demo of detecting UI elements by combining YOLOv8 with DOM tree traversal\u003c/i\u003e\n\n## 🚧 Idea\n\nThe OmniParser released by Microsoft operates in three steps:\n\n`OCR -\u003e Icon Detection -\u003e Icon/Box Captioning`\n\nThis approach enables control over almost any interface. However, it comes with a significant computational cost, particularly in the final step, which is the most resource-intensive part of the pipeline. The icon detection step requires 6.1 MB of weights, while the icon captioning step demands 1 GB of weights.\n\nInterestingly, in a browser environment, the first and last steps can be skipped because we can traverse the DOM tree to extract this information directly. 
Surprisingly, the second step, which uses YOLOv8, performs efficiently in the browser thanks to [WebAssembly](https://github.com/Hyuto/yolov8-onnxruntime-web).\n\nFrom the universal approach, we derived the following process:\n\n`Screenshot Capturing -\u003e Icon Detection (OmniParser WebAssembly) -\u003e Icon/Box Captioning via Traversing DOM Tree`\n\nNow we have two problems:\n\n* how to capture a screenshot of the website ([captureVisibleTab](https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/API/tabs/captureVisibleTab) via a browser extension, [getScreenshotAs](https://www.selenium.dev/selenium/docs/api/java/org/openqa/selenium/TakesScreenshot.html) via Selenium, etc.)\n* how to resolve the detected bounding boxes to useful information (this is definitely not trivial; this project handles it with the [element extractor](html2llm/src/element-extractor/element-extractor.ts)).\n\nThis project is at a very early stage.\n\n## 🚀 How to Run on Any Page?\n\nYou can do this with the [Playwright App Demo](./demos/playwright-app/).\n\n1. Clone the repository.\n2. Install all dependencies with `pnpm install`.\n3. Run `cd demos/playwright-app`.\n4. Run `pnpm start \u003cURL\u003e`. For example, `pnpm start https://www.google.com`.\n\n## 💡 License\n\nThis project is released under the MIT license.\n\nThe part of OmniParser used in this project is released under the [Creative Commons Attribution 4.0 International license](https://github.com/microsoft/OmniParser/blob/master/LICENSE).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fb4rtaz%2Fhtml2llm","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fb4rtaz%2Fhtml2llm","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fb4rtaz%2Fhtml2llm/lists"}