https://github.com/msankhala/80legs-app-ebay

80legs crawler app for ebay search result pages to fetch data about products.

EightyApps
==========

### Basic 80app format

```javascript
var EightyApp = function() {
    // Called for each crawled page; returns the scraped data as a JSON string.
    this.processDocument = function(html, url, headers, status, jQuery) {
        var app = this;
        $ = jQuery;
        var $html = app.parseHtml(html, $);
        var object = {};

        // populate the object

        return JSON.stringify(object);
    };

    // Called for each crawled page; returns the links found on the page.
    this.parseLinks = function(html, url, headers, status, jQuery) {
        var app = this;
        $ = jQuery;
        var $html = app.parseHtml(html, $);
        var links = [];

        // get all the links

        return links;
    };
};

try {
    // Testing: export a factory that receives the base class
    module.exports = function(EightyAppBase) {
        EightyApp.prototype = new EightyAppBase();
        return new EightyApp();
    };
} catch(e) {
    // Production: module is not defined, so EightyAppBase is used directly
    EightyApp.prototype = new EightyAppBase();
}
```
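
As a rough illustration of how the "Testing" branch of the template can be exercised locally, here is a minimal sketch. `StubEightyAppBase`, `createApp`, and the `htmlLength` field are hypothetical names made up for this example; the real `EightyAppBase` returns a Cheerio object from `parseHtml`, whereas the stub below only wraps the raw string.

```javascript
// Hypothetical stand-in for the real EightyAppBase (assumption for this sketch).
// The real parseHtml returns a Cheerio object; this stub just wraps the string.
function StubEightyAppBase() {}
StubEightyAppBase.prototype.parseHtml = function (html, $) {
    return { raw: html };
};

// A trimmed-down app in the same shape as the template.
var EightyApp = function () {
    this.processDocument = function (html, url, headers, status, jQuery) {
        var app = this;
        var $html = app.parseHtml(html, jQuery);
        var object = {};

        // populate the object (illustrative fields only)
        object.url = url;
        object.status = status;
        object.htmlLength = $html.raw.length;

        return JSON.stringify(object);
    };
};

// Same wiring as the factory assigned to module.exports in the template:
// attach the base class, then return an app instance.
function createApp(EightyAppBase) {
    EightyApp.prototype = new EightyAppBase();
    return new EightyApp();
}

var app = createApp(StubEightyAppBase);
var result = app.processDocument('<p>hi</p>', 'http://example.com/', {}, 200, null);
console.log(result);
// prints {"url":"http://example.com/","status":200,"htmlLength":9}
```

Because the base class is passed in rather than referenced globally, the same app code can run against a stub locally and against the real base class on the crawler.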

### Testing
To test your 80apps, you should use our [test site](http://80apptester.80legs.com/).

### Note about "img" tags
Note that if you use the parseHtml method in EightyApp.js, "img" tags are renamed to "img80" tags. This prevents the crawlers from loading the images when the EightyApp parses the HTML response (strangely, "img" tags seem to be the only HTML elements affected by this). If you need to select an image by its tag name (i.e. not by its class, id, or some other attribute), use "img80" instead of "img"; everything else should work the same.
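
To make the effect of the renaming concrete, the sketch below mimics it with a plain string replacement. `renameImgTags` is a hypothetical helper written for this illustration; it is not the code used in EightyApp.js, which may perform the rename differently.

```javascript
// Hypothetical illustration of the "img" -> "img80" rename (not the real
// EightyApp.js code). Matches <img or </img only when followed by
// whitespace, ">" or "/", so tags such as <imgfoo> are left alone.
function renameImgTags(html) {
    return html.replace(/<(\/?)img(?=[\s>\/])/gi, '<$1img80');
}

var before = '<div><img src="a.png" class="thumb"></div>';
var after = renameImgTags(before);
console.log(after);
// prints <div><img80 src="a.png" class="thumb"></div>

// After parsing, a tag-name selector therefore has to target "img80":
//   $html.find('img80')     // selects the image by tag name
//   $html.find('.thumb')    // class selectors are unaffected
```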

### Currently Available Cheerio (i.e. jQuery) Methods
The new version of Voltron, Mauler, uses an extended version of Cheerio, a lighter-weight implementation of jQuery's core API. You still write functions in the same manner as before (e.g. $html.find('selector').parent()); however, Cheerio gets its speed by implementing only certain core jQuery functions. You can see the list of functions Cheerio already implements here: http://cheeriojs.github.io/cheerio/

Our extended version of Cheerio also includes a number of other functions that you can use. These include:

* .not
* .makeArray
* .each
* .filter
* .prop

The pseudo-selectors :eq and :first (e.g. $html.find('div:eq(1)')) are NOT currently implemented and WILL cause errors; we are working on adding them. Check here or our knowledge base for updated Cheerio functionality.
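
Until those pseudo-selectors are available, one possible workaround is to select with a plain selector and then narrow the result with traversal methods. The sketch below assumes that the `.eq()` and `.first()` methods from the Cheerio API are exposed by the extended build; `nthMatch` and `firstMatch` are hypothetical helper names, not part of EightyApp.js.

```javascript
// Hypothetical helpers (assumption: the extended Cheerio build exposes the
// standard Cheerio traversal methods .eq() and .first()).

// Equivalent in intent to $html.find(selector + ':eq(' + index + ')')
function nthMatch($html, selector, index) {
    return $html.find(selector).eq(index);
}

// Equivalent in intent to $html.find(selector + ':first')
function firstMatch($html, selector) {
    return $html.find(selector).first();
}

// Usage inside processDocument or parseLinks, where $html comes from
// app.parseHtml(html, $):
//   var secondDiv = nthMatch($html, 'div', 1);
//   var firstLink = firstMatch($html, 'a');
```

Keeping the index out of the selector string means the selector itself stays within what Cheerio already supports.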