An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with html-scraper

A curated list of projects in awesome lists tagged with html-scraper.

https://github.com/betahuhn/metadata-scraper

🏷️ A JavaScript library for scraping/parsing metadata from a web page.

html-scraper javascript-library meta-tags metadata metadata-extraction metatags open-graph page parser typescript

Last synced: 16 May 2025

https://github.com/CompileInc/hodor

🕷 Configuration-based HTML scraper

cssselect hodor html-scraper lxml pagination python scraping

Last synced: 19 Jul 2025

https://github.com/imelgrat/feed-finder

A PHP class for extracting the URLs of RSS (1.0 and 2.0) and Atom feeds associated with a page, as well as OPML outline documents.

atom atom-feed composer-package html-parser html-scraper opml opml-outline php regex regular-expression rss rss-feed rss-feed-scraper

Last synced: 16 May 2025

https://github.com/marcomontalbano/html-miner

A powerful miner that will scrape HTML pages for you. "HTML Scraper"

coverage html-scraper istanbul mocha nodejs npm-package nyc scraper

Last synced: 16 Apr 2025

https://github.com/anshu-krishna/html-scraper

A PHP class to simplify data extraction from HTML.

html-scraper html-scraping php php-queryselector scraper web-scraper web-scraping

Last synced: 13 Jul 2025

https://github.com/phatpham9/scraper

An HTML scraper microservice based on x-ray and micro.

es6 html-scraper joi micro microservice nodejs scraper x-ray

Last synced: 14 May 2026

https://github.com/sandeepkundalwal/automated-plagiarism-detector

An automated plagiarism detector that handles unzipping, generates plagiarism reports, and scrapes the reports for plagiarism that meets a threshold.

gradle html-scraper java jsoup maven plagiarism-detector python3 teaching-assistant unzipping-files

Last synced: 09 Apr 2026

https://github.com/kgruiz/stealth-crawler

Asynchronous headless-Chrome web crawler that discovers internal links and optionally saves HTML, Markdown, screenshots, or PDFs. Built for scripting, inspection, and automation.

asyncio cli crawler headless-chrome html-scraper pydoll python web-crawler

Last synced: 25 Oct 2025

https://github.com/basemax/kashan-university-phone-directory

This repository contains a scraper and dataset for extracting and publishing the phone directory of employees and other personnel from the University of Kashan. It includes tools to scrape, parse, and export data from a given HTML file into JSON format.

crawler crawlers database html-scraper json kashan kashan-university scraper scraper-api scraper-html scrapers university university-of-kashan

Last synced: 18 May 2026

https://github.com/phatpham9/scraper.fun

Building, using, and sharing HTML scrapers is way more fun!

crawler html-scraper scraper

Last synced: 24 Mar 2025

https://github.com/stopsopa/docker-puppeteer-html-scraper

(Deprecated: use https://github.com/stopsopa/html-scraper-browserless instead.) Microservice tool for scraping HTML from "any" page.

docker html-scraper node nodejs puppeteer scraper

Last synced: 30 Apr 2026

https://github.com/lsegg/scraper-api-challenge

Data extraction package which supports CLI and API requests.

css-selectors html-scraper scraper

Last synced: 09 Jun 2026

https://github.com/felipe-gdr/feriado-scraper

HTML scraper to obtain Brazilian national holidays. Written with Node.

html-scraper nodejs

Last synced: 22 Apr 2026