https://github.com/indatawetrust/reporter

Crawler queue creation tool for paging

[![Travis Build Status](https://img.shields.io/travis/indatawetrust/reporter.svg)](https://travis-ci.org/indatawetrust/reporter)

![img](https://nodei.co/npm/reporter-cli.png?downloads=true)

```
npm i -g reporter-cli
```

##### --site

Paginated URL, given without the page number.

example: https://news.ycombinator.com/news?p=

##### --list

Selector for the list item elements.

##### --link

Selector for the link element.

##### --title

Selector for the title element.

##### --limit

Maximum number of pages to crawl.

##### --file

Output filename.

##### --start

Page number to start crawling from.

##### --end

Page number to stop crawling at.
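
As a rough illustration of how `--site`, `--start` and `--end` might fit together, the sketch below builds the list of page URLs to queue. It assumes the page number is simply appended to the `--site` value and that both bounds are inclusive; `buildPageUrls` is a hypothetical helper, not a function exported by reporter-cli.

```js
// Hypothetical helper (not part of reporter-cli).
// Assumption: the page number is appended directly to the --site value,
// and --start / --end are both inclusive.
function buildPageUrls(site, start, end) {
  const urls = [];
  for (let page = start; page <= end; page += 1) {
    urls.push(`${site}${page}`);
  }
  return urls;
}

// Example: the first three Hacker News listing pages.
console.log(buildPageUrls('https://news.ycombinator.com/news?p=', 1, 3));
```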

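##### --special

Extra fields to extract, as comma-separated pairs:

```
<name>: <selector>*<property>, <name>: <selector>*<property>..
```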

```bash
--special 'username: >.hnuser*text, score: >.score*text'
```

###### ^

parent element

###### <

previous sibling element

###### >

next sibling element
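
To make the `--special` syntax more concrete, here is a minimal sketch of how such a string could be split into its parts. It is inferred from the example above only: `parseSpecial`, the field names `relation`, `selector` and `property`, and the rule that the last `*` separates the selector from the value to read are all assumptions, and the real reporter-cli parser may behave differently.

```js
// Hypothetical parser for the --special string (not reporter-cli's own code).
// Assumed shape of each entry: "name: [^|<|>]selector*property".
const RELATIONS = { '^': 'parent', '<': 'previousSibling', '>': 'nextSibling' };

function parseSpecial(spec) {
  // Note: a plain split on "," means selectors containing commas are not supported.
  return spec.split(',').map(entry => {
    const colon = entry.indexOf(':');
    if (colon === -1) {
      throw new Error(`Missing ":" in "${entry.trim()}"`);
    }
    const name = entry.slice(0, colon).trim();
    const rest = entry.slice(colon + 1).trim();

    // Optional traversal prefix, then the selector, then "*property".
    const match = /^([\^<>]?)(.*)\*([^*]+)$/.exec(rest);
    if (!match) {
      throw new Error(`Expected "selector*property" in "${rest}"`);
    }
    return {
      name,
      relation: RELATIONS[match[1]] || null,
      selector: match[2].trim(),
      property: match[3].trim(),
    };
  });
}

console.log(parseSpecial('username: >.hnuser*text, score: >.score*text'));
```

Under these assumptions, each entry in the example resolves to a field name, a `nextSibling` relation, a class selector and the `text` property.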

##### -- heartbeat.js

Function to run after each request. It receives the crawled item as its argument.

example:

```js
// Runs after each request; logs the item's URL and title.
module.exports = item => {
  console.log(item.url, item.title)
}
```
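
A slightly fuller variant is sketched below. It numbers each item as it arrives and prints a single formatted line. It relies only on the `item.url` and `item.title` fields shown above; the counter and the returned string are illustrative additions, and it is an assumption that reporter ignores the function's return value.

```js
// Sketch of a heartbeat module that numbers each crawled item.
// Assumption: reporter calls this once per item and ignores the return value.
let seen = 0;

const heartbeat = item => {
  seen += 1;
  const line = `[${seen}] ${item.title} <${item.url}>`;
  console.log(line);
  return line; // returned only to make the function easy to test
};

module.exports = heartbeat;
```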

##### demo

```bash
# The URL is quoted so the shell does not treat "?" as a glob character.
reporter --site 'https://news.ycombinator.com/news?p=' \
  --list .athing \
  --link .storylink \
  --title .storylink \
  --limit 10 \
  --special 'username: >.hnuser*text, score: >.score*text'
```