🌐 Basic Playwright Web-scraper
https://github.com/ndom91/pw-web-scraper

Host: GitHub
URL: https://github.com/ndom91/pw-web-scraper
Owner: ndom91
Created: 2022-04-23T12:39:31.000Z (almost 3 years ago)
Default Branch: main
Last Pushed: 2022-04-23T14:58:49.000Z (almost 3 years ago)
Last Synced: 2024-10-05T17:22:01.475Z (4 months ago)
Topics: nodejs, playwright, web-scraper
Language: JavaScript
Homepage:
Size: 233 KB
Stars: 1
Watchers: 2
Forks: 1
Open Issues: 0
Metadata Files:
- Readme: README.md

README

# 🌐 Web Scraper

Basic [`playwright`](https://playwright.dev)/[`apify`](https://sdk.apify.com/) based web-scraper!

## 🕹️ Setup

1. Clone repository and install dependencies

```
$ git clone git@github.com:ndom91/web-scraper-berlin.git
$ cd web-scraper-berlin
$ npm install
```

2. Paste your list of URLs to be scraped into `sites.txt`

3. Double-check the `SEARCH_TERM` variable near the top of `index.js`. This is the term that triggers a site to be written to `output.txt` during the scraping process.

4. Run `npm run scrape` :tada:
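
To make the flow above concrete, here is a minimal sketch of what such a scrape loop could look like. It is not the repository's actual `index.js`: the helper names (`parseSites`, `containsTerm`, `findMatches`, `main`), the example `SEARCH_TERM` value, the one-URL-per-line format of `sites.txt`, and the case-insensitive substring match are all assumptions made for illustration. It uses plain Playwright only and leaves Apify out.

```javascript
// Hypothetical sketch of the scrape flow; not the project's real index.js.
const fs = require('node:fs/promises');

// Assumed example value; the real term lives near the top of index.js.
const SEARCH_TERM = 'berlin';

// Parse the contents of sites.txt, assuming one URL per line.
// Surrounding whitespace is trimmed and blank lines are dropped.
function parseSites(text) {
  return text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
}

// Case-insensitive substring match (an assumption; the real check may differ).
function containsTerm(pageText, term) {
  return pageText.toLowerCase().includes(term.toLowerCase());
}

// Fetch each URL's text through getText(url) and collect the URLs whose
// text contains the term. A URL that fails to load is logged and skipped.
async function findMatches(urls, term, getText) {
  const matches = [];
  for (const url of urls) {
    try {
      const text = await getText(url);
      if (containsTerm(text, term)) matches.push(url);
    } catch (err) {
      console.error(`Skipping ${url}: ${err.message}`);
    }
  }
  return matches;
}

// Playwright-backed runner. The browser library is required lazily so the
// helpers above can be used (and tested) without a browser installed.
async function main() {
  const { chromium } = require('playwright');
  const browser = await chromium.launch();
  try {
    const page = await browser.newPage();
    const getText = async (url) => {
      await page.goto(url, { waitUntil: 'domcontentloaded' });
      return page.innerText('body');
    };
    const urls = parseSites(await fs.readFile('sites.txt', 'utf8'));
    const matches = await findMatches(urls, SEARCH_TERM, getText);
    await fs.writeFile('output.txt', matches.join('\n') + '\n');
  } finally {
    await browser.close();
  }
}

// To run the sketch end to end: main().catch(console.error);
```

Passing the page loader in as `getText` keeps the matching logic independent of the browser, so it can be exercised with a stub function before any real site is visited.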

## 📝 License

MIT