https://github.com/lewisakura/spiderboi


# Spiderboi
[![NPM](https://nodei.co/npm/spiderboi.png?downloads=true&downloadRank=true&stars=true)](https://nodei.co/npm/spiderboi/)

A web crawling library written in TypeScript.

# Example
```typescript
import Crawler from 'spiderboi';

async function run() {
  const crawler = new Crawler('https://google.com');

  // Fetch the site's robots.txt first so that the crawler can respect it.
  await crawler.readyUp();

  // Crawl a single path and collect the links found on that page.
  const out = await crawler.crawl('/search/about');
  console.log(out);
}

// Log any failure instead of leaving the promise rejection unhandled.
run().catch(console.error);
/**
 * The code above should output:
 * [ 'https://google.com/search/about/',
 *   'https://google.com/search/about/',
 *   'https://google.com/#app-store',
 *   'https://google.com/#app-store',
 *   'https://google.com/#image-texts' ]
 *
 * Unless, of course, Google changes the /search/about page and ruins this example.
 */
```
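
The sample output above contains repeated entries and links that differ only by their `#fragment`. If you want one entry per page, something like the following could be applied to the result. This is a minimal sketch, assuming `crawl` resolves to an array of absolute URL strings as the sample output suggests. `uniqueLinks` is a hypothetical helper and not part of spiderboi.

```typescript
// Hypothetical helper (not part of spiderboi): collapses duplicate links and
// links that only differ by their #fragment into one entry per page.
function uniqueLinks(links: string[]): string[] {
  const seen = new Set<string>();
  for (const link of links) {
    const url = new URL(link); // WHATWG URL, available globally in Node.js
    url.hash = '';             // drop the fragment so in-page anchors collapse
    seen.add(url.href);
  }
  // A Set keeps insertion order, so the first occurrence decides the position.
  return Array.from(seen);
}

// The sample output from the example above, used as stand-in data.
const sample: string[] = [
  'https://google.com/search/about/',
  'https://google.com/search/about/',
  'https://google.com/#app-store',
  'https://google.com/#app-store',
  'https://google.com/#image-texts',
];

const result = uniqueLinks(sample);
console.log(result);
```

In the example you would call `uniqueLinks(out)` on the array returned by `crawler.crawl(...)`. Dropping fragments is a design choice: keep them if in-page anchors matter to you.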