Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/capJavert/jawa

Visual scraper interface, exports to puppeteer script which you can run anywhere.
https://github.com/capJavert/jawa

hacktoberfest nextjs puppeteer react scraper starwars typescript visual-scraping yargs

Last synced: about 2 months ago
JSON representation

Visual scraper interface, exports to puppeteer script which you can run anywhere.

Awesome Lists containing this project

README

        

# Jawa - Visual Scraper

[![npm](https://img.shields.io/npm/v/jawa)](https://www.npmjs.com/package/jawa)
[![Chrome Web Store](https://img.shields.io/chrome-web-store/v/icjgianfpiifbdpddkadmpcegiffiglk)](https://chrome.google.com/webstore/detail/clippy/icjgianfpiifbdpddkadmpcegiffiglk)

[πŸ‡­πŸ‡· Started in Croatia](https://startedincroatia.com)

![DALLΒ·E 2022-10-17 03 53 08 (2)](https://user-images.githubusercontent.com/9803078/196301040-1f1f34b4-e983-4cd8-859b-951b7fa51068.png)

Visual scraper interface, exports to puppeteer script which you can run anywhere. You can try it out here https://jawa.sh

Jawa allows you to visually click elements of any website and then export selectors as a config that you can run in any node environment to scrape the content when needed.

This repo consists of the:
- web app
- cli
- browser extension

## Web app

Web app that provides embedded browser for visually selecting elements and creating the scraper config that you can download and run through the CLI or Cloud.

### Cloud scraping (Beta)

It is now supported to run your scraper config in the cloud directly from web app. Cloud scrapers use the same Jawa CLI. Currently cloud scrapers have limited availability.

If you need more usage you can check out [Jawa Pro](https://jawa.sh/pro?ref=github).

## CLI

Simple CLI to run configs created and exported from web app. You can run it like this:

```
npx jawa path/to/scraper/config/file.json
```

or `npx jawa --help` to see all the options.

`jawa` package now also exports `scrape` function so it can be used outside of CLI in your apps or services:
```js
import { scrape } from 'jawa'
```
```js
const { scrape } = require('jawa')
```

## Browser extension

Browser extension that runs the embedded browser which powers the visual scraper interface.

It is available on:
- [Chrome Web Store](https://chrome.google.com/webstore/detail/jawa-visual-scraper/icjgianfpiifbdpddkadmpcegiffiglk)
- Chrome extensions also work on all Chromium based browsers like:
- Opera
- Microsoft Edge
- Brave