Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Projects in Awesome Lists tagged with html2text
A curated list of projects in awesome lists tagged with html2text.
https://github.com/adbar/trafilatura
Python & command-line tool to gather text on the Web: web crawling/scraping, extraction of text, metadata, comments
article-extractor corpus corpus-builder corpus-tools crawler html-to-markdown html2text news news-aggregator news-crawler nlp readability rss-feed scraping tei text-cleaning text-extraction text-mining text-preprocessing web-scraping
Last synced: 26 Oct 2024
https://github.com/jaytaylor/html2text
Golang HTML to plaintext conversion library
go golang html-emails html2text plaintext
Last synced: 12 Nov 2024
https://github.com/weblyzard/inscriptis
A Python-based HTML to text conversion library, command-line client, and web service.
client converter html html2text library python web-service
Last synced: 16 Dec 2024
https://github.com/inaridiy/webforai
An ESM-native HTML to Markdown library with useful utilities, aiming to be simple, lightweight, and high quality.
article-extractor extractor html-to-markdown html2markdown html2md html2text readability scraping text-mining
Last synced: 19 Dec 2024
https://github.com/rxnlp/nlp-cloud-apis
RxNLP APIs for clustering sentences, extracting topics, counting words and n-grams, extracting text from HTML or a URL, computing similarity between texts, and more.
html2text mashape natural-language-processing nlp nlp-apis opinosis-summarization rxnlp-apis sentence-clustering text-mining topic-extraction
Last synced: 18 Dec 2024
https://github.com/thatxliner/unmarkd
An extremely configurable Markdown reverser for Python 3.
beautifulsoup flexible html html2text markdown markdown-reverser parser python python3 reverse-engineering reverse-markdown reverser
Last synced: 27 Oct 2024
https://github.com/ph-7/html2text
A very simple (but efficient) "HTML to plain text" converter ✍️
converter convertor email-text-parsing html2text htmltotext php php7 plain-text symfony-mailer text text-converter text-convertor
Last synced: 12 Oct 2024
https://github.com/pH-7/Html2Text
A very simple (but efficient) "HTML to plain text" converter ✍️
converter convertor email-text-parsing html2text htmltotext php php7 plain-text symfony-mailer text text-converter text-convertor
Last synced: 25 Nov 2024
https://github.com/masroore/php-html2text
A PHP package to convert HTML into a plain text format
Last synced: 28 Nov 2024
https://github.com/andythefactory/article-extraction-dataset
Article title, authors, date and body extraction dataset.
article-extractor corpus corpus-builder corpus-tools dataset datasets html-to-markdown html2text news news-aggregator news-crawler readability scraping scraping-websites text-cleaning text-extraction text-mining text-preprocessing web-scraping
Last synced: 07 Nov 2024
https://github.com/puhoy/readability_cli
A CLI tool to fetch a webpage's main content and print it as Markdown
fetch-webpages html-to-markdown html2text markdown python3 readability readability-cli readability-lxml
Last synced: 17 Nov 2024
https://github.com/kr1shnasomani/webscrub
Python code that extracts HTML content and converts it to clean text using Selenium, Beautiful Soup, and html2text
beautifulsoup html2text selenium webscraper
Last synced: 21 Dec 2024
https://github.com/cycloidio/docker-image-html2text
Dockerized html2text command-line tool
Last synced: 12 Nov 2024