Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/ndom91/pw-web-scraper
Basic Playwright Web-scraper
nodejs playwright web-scraper
- Host: GitHub
- URL: https://github.com/ndom91/pw-web-scraper
- Owner: ndom91
- Created: 2022-04-23T12:39:31.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2022-04-23T14:58:49.000Z (over 2 years ago)
- Last Synced: 2024-10-05T17:22:01.475Z (3 months ago)
- Topics: nodejs, playwright, web-scraper
- Language: JavaScript
- Homepage:
- Size: 233 KB
- Stars: 1
- Watchers: 2
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Web Scraper
Basic [`playwright`](https://playwright.dev)/[`apify`](https://sdk.apify.com/) based web-scraper!
## Setup
1. Clone repository and install dependencies
```
$ git clone git@github.com:ndom91/web-scraper-berlin.git
$ cd web-scraper-berlin
$ npm install
```
2. Paste your list of URLs to be scraped into `sites.txt`
3. Double-check the `SEARCH_TERM` variable near the top of `index.js`. This is the term that causes a site to be written to `output.txt` when it is found during scraping.
4. Run `npm run scrape` :tada:
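
The flow described in the steps above (read URLs from `sites.txt`, look for `SEARCH_TERM` on each page, record matching sites) can be sketched roughly as follows. This is a minimal illustration, not the repository's actual `index.js`: the function names `parseSiteList`, `pageMatches`, and `findMatchingSites` are hypothetical, the example value of `SEARCH_TERM` is made up, and case-insensitive matching is an assumption. Page loading is passed in as a `fetchText` function so the matching logic stays independent of the browser.

```javascript
// Hypothetical sketch of the scrape flow; not the project's real code.
const SEARCH_TERM = 'berlin'; // assumed example value

// Turn the contents of sites.txt into a list of URLs:
// one URL per line, ignoring blank lines and surrounding whitespace.
function parseSiteList(text) {
  return text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
}

// Case-insensitive check for the search term in a page's text
// (case-insensitivity is an assumption of this sketch).
function pageMatches(pageText, term) {
  return pageText.toLowerCase().includes(term.toLowerCase());
}

// Load every URL with the supplied async fetchText(url) function and
// collect the URLs whose text contains the term.
async function findMatchingSites(urls, term, fetchText) {
  const matches = [];
  for (const url of urls) {
    try {
      const text = await fetchText(url);
      if (pageMatches(text, term)) matches.push(url);
    } catch (err) {
      // Skip sites that fail to load rather than aborting the whole run.
      console.error(`Skipping ${url}: ${err.message}`);
    }
  }
  return matches;
}
```

In a real run, `fetchText` could be backed by Playwright (for example `chromium.launch()`, `page.goto(url)`, then `page.innerText('body')`), with the URL list read via `fs.readFileSync('sites.txt', 'utf8')` and the result written with `fs.writeFileSync('output.txt', matches.join('\n'))`.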
## License
MIT