An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with html-extractor

A curated list of projects in awesome lists tagged with html-extractor.

https://github.com/bookieio/breadability

A reworked version of the https://www.readability.com/ parsing library (https://mercury.postlight.com/ is now the living alternative).

html-extraction html-extractor html-parsing python text-extraction text-mining

Last synced: 21 Oct 2025

https://github.com/cdimascio/essence

Automatically extract the main text content (and more) from an HTML document

extractor hacktoberfest html-extractor scraper web-content-extractor webpage-extractor website-extractor

Last synced: 14 Apr 2025

https://github.com/jandc/css-from-html-extractor

PHP library that determines which CSS is used by HTML snippets.

css html-extractor php-library

Last synced: 08 Aug 2025

https://github.com/whomrx666/xtract-html

Xtract-html is a tool for extracting the HTML display code of a website, which you can then reuse on your own website.

html html-extraction html-extractor kali-linux linux termux termux-tool xtract-html

Last synced: 11 Mar 2026

https://github.com/whomrx666/xtract-htmlv2

Xtract-htmlV2 is a tool for fetching the HTML code of any website you choose; it is the successor to the previous version.

extract html-extraction html-extractor kali-linux linux termux termux-tool xtract-htmlv2

Last synced: 16 Mar 2025

https://github.com/importcjj/go-readability

Go package that cleans an HTML page for better readability.

extractor go golang html html-extractor html2text readability text text-extraction

Last synced: 14 Jan 2026