https://github.com/sc0vu/jspachong

Js crawler library.

Host: GitHub
URL: https://github.com/sc0vu/jspachong
Owner: sc0Vu
License: mit
Created: 2017-10-05T08:15:21.000Z (about 7 years ago)
Default Branch: master
Last Pushed: 2022-12-06T16:05:40.000Z (about 2 years ago)
Last Synced: 2024-10-31T21:35:14.604Z (2 months ago)
Topics: crawler, pachong
Language: JavaScript
Size: 204 KB
Stars: 2
Watchers: 3
Forks: 1
Open Issues: 17
Metadata Files:
- Readme: README.md
- License: LICENSE

README

# Javascript Pachong

[![NPM](https://nodei.co/npm/jspachong.png)](https://nodei.co/npm/jspachong/)

[![Build Status](https://travis-ci.org/sc0Vu/jspachong.svg?branch=master)](https://travis-ci.org/sc0Vu/jspachong)

[![Dependency Status](https://www.versioneye.com/user/projects/59e026562de28c219b11a161/badge.svg?style=flat-square)](https://www.versioneye.com/user/projects/59e026562de28c219b11a161)

[![codecov](https://codecov.io/gh/sc0Vu/jspachong/branch/master/graph/badge.svg)](https://codecov.io/gh/sc0Vu/jspachong)

Pachong is a Chinese term for a class of vertebrates (reptiles); you can look it up in this [Chinese dictionary](http://dict.revised.moe.edu.tw/cbdic/). It is also used to mean something like a crawler.

This is a crawler library written in JavaScript, intended for use on the server side or in the browser.

# Usage

```
npm install jspachong
```

### Benchmark

The benchmark crawls 10 pages both in parallel and sequentially.

```
npm run benchmark
```
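
To make the parallel versus sequential comparison concrete, here is a minimal sketch that times both strategies against simulated pages. It is not the library's benchmark script: `fakePage`, `crawlParallel`, `crawlSequential`, and `timeIt` are hypothetical names, and the `setTimeout` delay is an assumption that stands in for a real HTTP request.

```javascript
// Simulated page fetch (assumption: stands in for a real HTTP request).
function fakePage (id, delayMs) {
  return new Promise(function (resolve) {
    setTimeout(function () { resolve('page ' + id) }, delayMs)
  })
}

// Start all fetches at once and wait for every one to finish.
function crawlParallel (count, delayMs) {
  var jobs = []
  for (var i = 0; i < count; i++) jobs.push(fakePage(i, delayMs))
  return Promise.all(jobs)
}

// Fetch pages one after another; each fetch starts when the previous ends.
function crawlSequential (count, delayMs) {
  var pages = []
  function next (id) {
    if (id >= count) return Promise.resolve(pages)
    return fakePage(id, delayMs).then(function (page) {
      pages.push(page)
      return next(id + 1)
    })
  }
  return next(0)
}

// Run a crawl function and resolve with the page count and elapsed time.
function timeIt (crawl, count, delayMs) {
  var start = Date.now()
  return crawl(count, delayMs).then(function (pages) {
    return { pages: pages.length, ms: Date.now() - start }
  })
}

timeIt(crawlParallel, 10, 20)
  .then(function (parallel) {
    console.log('parallel:', parallel)
    return timeIt(crawlSequential, 10, 20)
  })
  .then(function (sequential) {
    console.log('sequential:', sequential)
  })
```

With a fixed delay per page, the sequential run should take roughly ten times as long as the parallel one; real network timings will vary.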

### Server Side

```
var Pachong = require('jspachong')

// requestObject and options are described below
var crawler = new Pachong(requestObject, options)

crawler.queue(requestObject)
       .queue(requestObject)
       .queue(requestObject)
       .run()
       .then((res) => {})
       .catch((err) => {})
```

* requestObject

  Options passed to the request library, for example:

  ```
  // Simple example
  var requestObject = {
    method: 'GET',
    uri: 'https://www.google.com',
    callback: function (err, res) {
      if (err) return
      // Do something here...
    }
  }
  ```

  See the [request documentation](https://github.com/request/request#requestoptions-callback) for more information.

* options

  ```
  parallel bool
  Run crawlers in parallel.

  max integer
  Maximum number of crawlers to run at a time.
  ```
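
One way to read the `parallel` and `max` options is as a batch size: when `parallel` is set, up to `max` crawlers run together, otherwise they run one at a time. The sketch below illustrates that reading only. `runTasks` and `fakeFetch` are hypothetical helpers, not part of jspachong, and the library's actual scheduling may differ.

```javascript
// Hypothetical helper, not part of jspachong: run task functions
// (each returns a Promise) according to { parallel, max }.
function runTasks (tasks, options) {
  var opts = options || {}
  // Assumption: sequential mode is treated as a batch size of 1;
  // parallel mode without `max` runs every task in a single batch.
  var size = opts.parallel ? Math.max(1, opts.max || tasks.length) : 1
  var results = []

  function nextBatch (start) {
    if (start >= tasks.length) return Promise.resolve(results)
    // Start every task in this batch, then wait for all of them.
    var batch = tasks.slice(start, start + size).map(function (task) {
      return task()
    })
    return Promise.all(batch).then(function (batchResults) {
      results = results.concat(batchResults)
      return nextBatch(start + size)
    })
  }

  return nextBatch(0)
}

// Simulated page fetch: resolves with the page name after a short delay.
function fakeFetch (name) {
  return function () {
    return new Promise(function (resolve) {
      setTimeout(function () { resolve(name) }, 10)
    })
  }
}

runTasks([fakeFetch('a'), fakeFetch('b'), fakeFetch('c')], { parallel: true, max: 2 })
  .then(function (res) { console.log(res) })
```

Here the first two tasks run together and the third runs in a second batch; results keep the order in which the tasks were queued.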

### Browser

To do.

# License

MIT