Projects in Awesome Lists tagged with html-extractor
A curated list of projects in awesome lists tagged with html-extractor.
https://github.com/miso-belica/sumy
Module for automatic summarization of text documents and HTML pages.
html-extraction html-extractor html-page lsa nlp pagerank-algorithm python reduction summarization summarizer summary sumy text-extraction textteaser
Last synced: 14 Feb 2026
https://github.com/bookieio/breadability
Reworked https://www.readability.com/ parsing library (https://mercury.postlight.com/ is now the living alternative)
html-extraction html-extractor html-parsing python text-extraction text-mining
Last synced: 21 Oct 2025
https://github.com/cdimascio/essence
Automatically extract the main text content (and more) from an HTML document
extractor hacktoberfest html-extractor scraper web-content-extractor webpage-extractor website-extractor
Last synced: 14 Apr 2025
https://github.com/jandc/css-from-html-extractor
PHP library that determines which CSS is used in HTML snippets.
css html-extractor php-library
Last synced: 08 Aug 2025
https://github.com/whomrx666/xtract-html
Xtract-html is a tool for extracting the HTML display code from a website, which you can also use for your own website.
html html-extraction html-extractor kali-linux linux termux termux-tool xtract-html
Last synced: 11 Mar 2026
https://github.com/whomrx666/xtract-htmlv2
Xtract-htmlV2 is a tool for getting the HTML code from a website of your choice; it is the successor to the previous version.
extract html-extraction html-extractor kali-linux linux termux termux-tool xtract-htmlv2
Last synced: 16 Mar 2025
https://github.com/importcjj/go-readability
Go package that cleans an HTML page for better readability.
extractor go golang html html-extractor html2text readability text text-extraction
Last synced: 14 Jan 2026