Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

https://github.com/scrapehero-code/amazon-scraper

A simple web scraper to extract product data and pricing from Amazon

amazon-scraper page-scraper scrape-products web-crawling web-scraping web-scraping-tutorials

Last synced: 14 Jun 2024

https://github.com/ayakashi-io/ayakashi

Ayakashi.io - The next-generation web scraping framework

automation data-mining headless-chrome web-crawling web-scraping

Last synced: 08 Jun 2024

https://github.com/my8100/scrapyd-cluster-on-heroku

Set up a free and scalable Scrapyd cluster for distributed web crawling with just a few clicks.

cluster heroku logparser python scrapy scrapyd scrapydweb web-crawling web-scraping

Last synced: 19 May 2024

https://github.com/apify/crawlee

Crawlee: a web scraping and browser automation library for Node.js for building reliable crawlers in JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP. Supports both headful and headless modes, with proxy rotation.

apify automation crawler crawling headless headless-chrome javascript nodejs npm playwright puppeteer scraper scraping typescript web-crawler web-crawling web-scraping

Last synced: 10 May 2024

https://github.com/TurnerSoftware/InfinityCrawler

A simple but powerful web crawler library for .NET

crawler robots-txt spider web-crawler web-crawling

Last synced: 05 May 2024

https://github.com/brianmadden/krawler

A web crawling framework written in Kotlin

crawler4j framework kotlin link-checker web-crawler web-crawling webcrawler

Last synced: 06 Apr 2024