https://github.com/blocklet/snap-kit

Snap Kit is a powerful, Puppeteer-based service designed for seamless web automation. It enables you to effortlessly capture high-fidelity web page screenshots and efficiently scrape web content for precise data extraction.

browser-automation puppeteer self-hosting web-crawling web-screenshot

# Snap Kit

This repository contains three modules:

## Snap Kit

`blocklets/snap-kit`

A [Blocklet](https://www.arcblock.io/en).

This Puppeteer-based service automates common browser tasks: it captures high-fidelity web page screenshots and scrapes web content for data extraction.

[Documentation](https://github.com/blocklet/snap-kit/blob/master/blocklets/snap-kit/blocklet.md)

## @arcblock/crawler

`packages/crawler`

A crawler module designed for Blocklets. Given a list of URLs or a sitemap, it crawls pages in batches and collects each page's HTML, screenshot, title, description, and more.
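
As a rough illustration of sitemap-driven batch crawling, the sketch below extracts page URLs from sitemap XML and groups them into batches. The names `extractSitemapUrls` and `toBatches` are hypothetical and are not part of the `@arcblock/crawler` API; see the package documentation for the real interface.

```typescript
// Hypothetical helpers for illustration only; not the @arcblock/crawler API.

// Decode the XML entities that may appear inside <loc> values.
// "&amp;" is decoded last so "&amp;lt;" becomes "&lt;", not "<".
function decodeXmlEntities(value: string): string {
  return value
    .replace(/&lt;/g, "<")
    .replace(/&gt;/g, ">")
    .replace(/&quot;/g, '"')
    .replace(/&apos;/g, "'")
    .replace(/&amp;/g, "&");
}

// Collect the unique URLs listed in <loc> elements, in document order.
function extractSitemapUrls(xml: string): string[] {
  const urls = new Set<string>();
  const locPattern = /<loc>\s*([^<\s]+)\s*<\/loc>/gi;
  let match: RegExpExecArray | null;
  while ((match = locPattern.exec(xml)) !== null) {
    const raw = match[1];
    if (raw) {
      urls.add(decodeXmlEntities(raw));
    }
  }
  return Array.from(urls);
}

// Split a list into consecutive batches of at most `size` items.
function toBatches<T>(items: T[], size: number): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError("size must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}
```

Each batch could then be handed to whatever crawl function the application uses, which bounds how many pages are rendered at once.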

[Documentation](https://github.com/blocklet/snap-kit/blob/master/packages/crawler/README.md)

## @arcblock/crawler-middleware

`packages/crawler-middleware`

A Snap Kit middleware for Blocklets.

This Express middleware provides pre-rendered HTML generated by Snap Kit for Blocklets. It lets them serve complete HTML content to web spiders, which is crucial for SEO because it ensures that search engines can properly index dynamically generated content.
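
The general pattern can be sketched as follows, assuming spiders are identified by their `User-Agent` header. This is not the `@arcblock/crawler-middleware` implementation: `prerenderMiddleware`, `isSpider`, the `lookup` callback, and the list of crawler names are all hypothetical, and the request and response types are minimal stand-ins so the sketch runs without importing Express.

```typescript
// Minimal structural types (assumptions); Express's request and response
// objects provide these same members.
interface Req {
  method: string;
  url: string;
  headers: Record<string, string | string[] | undefined>;
}
interface Res {
  setHeader(name: string, value: string): void;
  end(body: string): void;
}
type Next = () => void;

// Illustrative, incomplete list of crawler User-Agent markers.
const SPIDER_PATTERN = /googlebot|bingbot|duckduckbot|baiduspider|yandex/i;

function isSpider(userAgent: string | undefined): boolean {
  return userAgent !== undefined && SPIDER_PATTERN.test(userAgent);
}

// `lookup` is a hypothetical callback that resolves to pre-rendered HTML
// for a URL, or null when no snapshot is available.
function prerenderMiddleware(lookup: (url: string) => Promise<string | null>) {
  return async (req: Req, res: Res, next: Next): Promise<void> => {
    const header = req.headers["user-agent"];
    const userAgent = Array.isArray(header) ? header[0] : header;

    // Regular visitors and non-GET requests go through the normal app.
    if (req.method !== "GET" || !isSpider(userAgent)) {
      next();
      return;
    }

    try {
      const html = await lookup(req.url);
      if (html === null) {
        next(); // No snapshot: fall back to the normal response.
        return;
      }
      res.setHeader("Content-Type", "text/html; charset=utf-8");
      res.end(html);
    } catch {
      next(); // A failed lookup should never break the page.
    }
  };
}
```

In an Express app, a function with this shape would be registered with `app.use(...)` ahead of the routes that serve the client-rendered pages.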

[Documentation](https://github.com/blocklet/snap-kit/blob/master/packages/middleware/README.md)