https://github.com/lewisakura/spiderboi
A web crawling library written in TypeScript.
spider typescript typescript3 web-crawler web-crawling web-spider webcrawler
Last synced: about 1 year ago
- Host: GitHub
- URL: https://github.com/lewisakura/spiderboi
- Owner: lewisakura
- Created: 2019-02-21T15:20:18.000Z (over 7 years ago)
- Default Branch: master
- Last Pushed: 2023-01-03T16:35:09.000Z (over 3 years ago)
- Last Synced: 2025-04-11T10:00:32.644Z (about 1 year ago)
- Topics: spider, typescript, typescript3, web-crawler, web-crawling, web-spider, webcrawler
- Language: TypeScript
- Size: 376 KB
- Stars: 7
- Watchers: 1
- Forks: 1
- Open Issues: 12
Metadata Files:
- Readme: README.md
# Spiderboi
[spiderboi on npm](https://nodei.co/npm/spiderboi/)
A web crawling library written in TypeScript.
# Example
```typescript
import Crawler from 'spiderboi';

async function run() {
    const crawler = new Crawler('https://google.com');

    // this gets the site's robots.txt so that the crawler can respect it
    await crawler.readyUp();

    const out = await crawler.crawl('/search/about');
    console.log(out);
}

// log any failure (e.g. a network error) instead of leaving the rejection unhandled
run().catch(console.error);

/**
 * above code should output:
 * [ 'https://google.com/search/about/',
 *   'https://google.com/search/about/',
 *   'https://google.com/#app-store',
 *   'https://google.com/#app-store',
 *   'https://google.com/#image-texts' ]
 *
 * unless of course google changes the /search/about page and ruins this example.
 */
```
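The example above fetches `robots.txt` before crawling so that the crawler can respect it. As a rough illustration of what "respecting" it can mean, here is a minimal sketch that needs no network access. It is not spiderboi's implementation: `parseDisallowed`, `isAllowed`, and the sample rules are hypothetical names made up for this sketch, and it only handles plain `Disallow` prefixes under `User-agent: *` (no `Allow` lines, wildcards, or multiple `User-agent` lines sharing one group).

```typescript
// Hypothetical helper, not part of spiderboi: collects the Disallow
// prefixes listed under "User-agent: *".
function parseDisallowed(robotsTxt: string): string[] {
    const disallowed: string[] = [];
    let appliesToAll = false;

    for (const rawLine of robotsTxt.split(/\r?\n/)) {
        // Drop trailing comments and surrounding whitespace.
        const hashAt = rawLine.indexOf('#');
        const line = (hashAt === -1 ? rawLine : rawLine.slice(0, hashAt)).trim();

        const sep = line.indexOf(':');
        if (sep === -1) continue; // blank line or not a "field: value" pair

        const field = line.slice(0, sep).trim().toLowerCase();
        const value = line.slice(sep + 1).trim();

        if (field === 'user-agent') {
            // Simplification: each User-agent line starts a new group.
            appliesToAll = value === '*';
        } else if (field === 'disallow' && appliesToAll && value !== '') {
            disallowed.push(value);
        }
    }
    return disallowed;
}

// Hypothetical helper: a path is allowed unless it starts with a disallowed prefix.
function isAllowed(path: string, disallowed: string[]): boolean {
    return !disallowed.some((prefix) => path.startsWith(prefix));
}

// Made-up robots.txt content, for illustration only.
const sampleRobots = [
    'User-agent: *',
    'Disallow: /private/',
    'Disallow: /tmp/ # scratch space',
    '',
    'User-agent: examplebot',
    'Disallow: /',
].join('\n');

const rules = parseDisallowed(sampleRobots);
console.log(isAllowed('/search/about', rules)); // true
console.log(isAllowed('/private/data', rules)); // false
```

A real crawler would more likely rely on a dedicated robots.txt parser, since the full format has more rules than this sketch covers.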