https://github.com/nescalante/bochazo-lurker
- Host: GitHub
- URL: https://github.com/nescalante/bochazo-lurker
- Owner: nescalante
- Created: 2014-03-19T23:10:33.000Z (almost 11 years ago)
- Default Branch: master
- Last Pushed: 2017-03-31T20:09:20.000Z (over 7 years ago)
- Last Synced: 2024-04-14T22:15:56.861Z (8 months ago)
- Language: JavaScript
- Size: 329 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
### Strategy Definition
A strategy must implement the following properties. Callbacks follow the `callback(error, result)` form used in the example below.
- `name: String`: Name of the JSON file to generate.
- `pagesUrl: String`: First URL to request.
- `getPages: function (querySelector, callback)`: Passes to `callback` the hrefs of the pages to request.
- `getPlacesByPage: function (querySelector, callback)`: Passes to `callback` the list of places found on each page.
- `getPlace: function (querySelector, callback)`: Passes to `callback` the full info for a single place.
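As a rough illustration of the contract above, the sketch below checks that an object exposes the five properties with the expected types. `validateStrategy` and `REQUIRED` are hypothetical names introduced here; they are not part of bochazo-lurker.

```javascript
// Hypothetical helper (not part of bochazo-lurker): checks the strategy contract.
// Maps each required property to the `typeof` value it is expected to have.
var REQUIRED = {
  name: 'string',
  pagesUrl: 'string',
  getPages: 'function',
  getPlacesByPage: 'function',
  getPlace: 'function'
};

// Returns a list of human-readable problems; an empty list means the
// object has every required property with the expected type.
function validateStrategy(strategy) {
  var problems = [];
  Object.keys(REQUIRED).forEach(function (key) {
    var expected = REQUIRED[key];
    var actual = typeof (strategy || {})[key];
    if (actual !== expected) {
      problems.push(key + ' must be a ' + expected + ', got ' + actual);
    }
  });
  return problems;
}
```

A check like this could run before a strategy is handed to the core, so a missing method shows up as a clear message rather than a failure partway through a scrape.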
### Strategy Example
Define a strategy to scrape a site:
```
module.exports = {
  name: 'SomeName',
  pagesUrl: 'somesite.com',
  getPages: function (querySelector, callback) {
    // Read whatever you need from the first page.
    var someData = querySelector('a').text();
    callback(null, ['some', 'hrefs', 'to', 'be', 'requested']);
  },
  getPlacesByPage: function (querySelector, callback) {
    // Return the hrefs of the places listed on this page.
    callback(null, ['more', 'hrefs']);
  },
  getPlace: function (querySelector, callback) {
    var place = {
      description: 'we scrape some place here!',
      address: 'false street 123'
    };
    callback(null, place);
  }
};
```
Add it to the [scrappers list](https://github.com/nescalante/bochazo-lurker/blob/master/scrappers/index.js):
```
var yourStrategy = require('./yourStrategy');

module.exports = {
  YourStrategy: yourStrategy
};
```
Scrape [what you need](https://github.com/nescalante/bochazo-lurker/blob/master/app.js)!
```
var core = require('./core');
var scrappers = require('./scrappers');

core.import(scrappers.YourStrategy);
```
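The README does not show how `core.import` drives a strategy. The sketch below is one plausible reading of the contract, not the project's actual implementation: request `pagesUrl`, call `getPages`, call `getPlacesByPage` for each page href, then call `getPlace` for each place href. `runStrategy`, `eachSeries`, `visit`, and the `load(url, callback)` function that yields a `querySelector` for a URL are all hypothetical names, and writing the result to a JSON file named after `name` is left out.

```javascript
// Hypothetical runner, NOT the project's core.import: it only illustrates the
// order in which the three strategy callbacks could be chained.
// `load(url, cb)` is an assumed function that yields a querySelector for a URL.
function runStrategy(strategy, load, done) {
  var places = [];

  // Process `items` one at a time, calling `step(item, next)` for each.
  function eachSeries(items, step, finished) {
    var i = 0;
    (function next(err) {
      if (err || i >= items.length) { return finished(err || null); }
      step(items[i++], next);
    })();
  }

  // Load a URL and hand its querySelector to one of the strategy methods.
  function visit(url, method, cb) {
    load(url, function (err, querySelector) {
      if (err) { return cb(err); }
      strategy[method](querySelector, cb);
    });
  }

  visit(strategy.pagesUrl, 'getPages', function (err, pageUrls) {
    if (err) { return done(err); }
    eachSeries(pageUrls, function (pageUrl, nextPage) {
      visit(pageUrl, 'getPlacesByPage', function (err, placeUrls) {
        if (err) { return nextPage(err); }
        eachSeries(placeUrls, function (placeUrl, nextPlace) {
          visit(placeUrl, 'getPlace', function (err, place) {
            if (err) { return nextPlace(err); }
            places.push(place);
            nextPlace();
          });
        }, nextPage);
      });
    }, function (err) { done(err, places); });
  });
}
```

Passing `load` in as a parameter keeps the sketch independent of any HTTP or HTML-parsing library, and lets a strategy be exercised against a stub `load` without touching the network.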