https://github.com/samhuk/stealthy-scraper
Extra stealthy web scraper in Typescript
puppeteer scraper stealth typescript
- Host: GitHub
- URL: https://github.com/samhuk/stealthy-scraper
- Owner: samhuk
- License: mit
- Created: 2022-10-17T14:28:49.000Z (about 2 years ago)
- Default Branch: master
- Last Pushed: 2022-10-19T16:21:01.000Z (about 2 years ago)
- Last Synced: 2024-10-13T12:27:09.723Z (2 months ago)
- Topics: puppeteer, scraper, stealth, typescript
- Language: TypeScript
- Homepage:
- Size: 22.5 KB
- Stars: 25
- Watchers: 4
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Contributing: contributing/development.md
- License: LICENSE
# stealthy-scraper
Extra stealthy web scraper in Typescript
## Overview
stealthy-scraper is a wrapper around [puppeteer-extra](https://github.com/berstend/puppeteer-extra) that adds additional stealth functionality and other helpful features.
## Why use it
* If puppeteer's `Page.goto` and `Browser.newPage` are being detected: stealthy-scraper has a `newBrowser` function, a more reliable alternative way to navigate to a new URL.
* If puppeteer's default typing is being detected: stealthy-scraper has a `safeType` function that better mimics human typing behavior.
* If you want to centralize the puppeteer, puppeteer-extra, and puppeteer-extra plugin dependencies into one package.
## Usage Overview
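
To make the typing point in the list above concrete, here is a minimal sketch of what "human-like" typing can look like: one character at a time, with a randomized pause after each key. This is an illustration only and not stealthy-scraper's actual `safeType` implementation. The names `Typeable`, `randomDelayMs`, and `humanLikeType`, as well as the 60 to 220 ms bounds, are assumptions made for this example.

```typescript
// Hypothetical sketch: NOT the implementation behind stealthy-scraper's safeType.

// Minimal shape of something we can type into. Puppeteer's ElementHandle
// exposes a type(text, options) method of this general shape.
interface Typeable {
  type(text: string, options?: { delay?: number }): Promise<void>
}

// Pick a delay in milliseconds within [minMs, maxMs].
// The random source is injectable so the function can be tested deterministically.
function randomDelayMs(
  minMs: number,
  maxMs: number,
  random: () => number = Math.random,
): number {
  return Math.round(minMs + random() * (maxMs - minMs))
}

// Type the text one character at a time, pausing for a varying amount after each key.
// The 60-220 ms defaults are illustrative assumptions, not tuned values.
async function humanLikeType(
  target: Typeable,
  text: string,
  minMs = 60,
  maxMs = 220,
): Promise<void> {
  for (const ch of Array.from(text)) {
    await target.type(ch)
    const pause = randomDelayMs(minMs, maxMs)
    await new Promise<void>((resolve) => setTimeout(resolve, pause))
  }
}
```

With puppeteer, `target` could be the element handle returned by `page.waitForSelector(...)`. Varying the pause per key avoids the perfectly uniform rhythm that a single fixed `delay` value produces.
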
`npm i --save stealthy-scraper`
```typescript
import { createScraper } from 'stealthy-scraper'

const scraper = await createScraper({
  puppeteerOptions: {
    headless: true,
    // ...
  },
  snapshotsDirPath: './scraper-snapshots',
})

// puppeteer's goto needs a full URL, including the scheme
await scraper.page.goto('https://difficultoscrape.com')
const searchTextInput = await scraper.page.waitForSelector('...')
// Type with safeType instead of puppeteer's default typing
await scraper.safeType(searchTextInput, 'my search term')
// ...
// Navigate to a new URL with newBrowser instead of page.goto
await scraper.newBrowser(newUrlFromSearchResults)
await scraper.close()
```
## Development
See [./contributing/development.md](./contributing/development.md)
## Disclaimer
I do not condone the use of this package for malicious purposes. Please be courteous and a good citizen when using it. I take no responsibility for any damage you cause to yourself (e.g. getting your IP blacklisted) or to others (e.g. DoS) through any use of this package.
---
If you found this package delightful, feel free to buy me a coffee ✨