An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with html-extraction

A curated list of projects in awesome lists tagged with html-extraction.

https://github.com/bookieio/breadability

A reworked version of the https://www.readability.com/ parsing library (https://mercury.postlight.com/ is now the living alternative)

html-extraction html-extractor html-parsing python text-extraction text-mining

Last synced: 21 Oct 2025

https://github.com/html-extract/hext

Domain-specific language for extracting structured data from HTML documents

cpp data-extraction dsl html html-extraction node php python ruby scraping

Last synced: 15 Apr 2025

https://github.com/whomrx666/xtract-html

Xtract-html is a tool for extracting the HTML display code of a website, which you can then reuse on your own website.

html html-extraction html-extractor kali-linux linux termux termux-tool xtract-html

Last synced: 11 Mar 2026

https://github.com/whomrx666/xtract-htmlv2

Xtract-htmlV2 is a tool for getting the HTML code of any website you choose; it is the successor to the previous version.

extract html-extraction html-extractor kali-linux linux termux termux-tool xtract-htmlv2

Last synced: 16 Mar 2025

https://github.com/reasonkit/reasonkit-web

High-performance MCP server for browser automation, web capture, and content extraction. Rust-powered CDP client for AI agents.

agent-tools ai-agent async automation browser-automation cdp chrome-devtools-protocol chromium developer-tools headless-browser html-extraction llm-tools mcp model-context-protocol pdf rust screenshot tokio web-automation web-scraping

Last synced: 08 Jan 2026