{"id":20170190,"url":"https://github.com/fabrix-app/spool-scraper","last_synced_at":"2026-05-09T06:35:07.530Z","repository":{"id":95882857,"uuid":"149795018","full_name":"fabrix-app/spool-scraper","owner":"fabrix-app","description":"Spool: Webscraper ","archived":false,"fork":false,"pushed_at":"2018-10-08T15:21:08.000Z","size":42,"stargazers_count":2,"open_issues_count":0,"forks_count":1,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-02-14T04:17:52.514Z","etag":null,"topics":["cheerio","crawler","fabrix","nodejs","scraping","spools","typescript","webscraper"],"latest_commit_sha":null,"homepage":null,"language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/fabrix-app.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":".github/CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-09-21T17:20:03.000Z","updated_at":"2018-10-08T15:20:26.000Z","dependencies_parsed_at":"2023-03-13T16:44:07.335Z","dependency_job_id":null,"html_url":"https://github.com/fabrix-app/spool-scraper","commit_stats":{"total_commits":8,"total_committers":1,"mean_commits":8.0,"dds":0.0,"last_synced_commit":"1c664ca1b72e122409097b3187859c2e16d6ffce"},"previous_names":[],"tags_count":2,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fabrix-app%2Fspool-scraper","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fabrix-app%2Fspool-scraper/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fabrix-app%2Fspool-scraper/releases","manifests_url":"https://r
epos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fabrix-app%2Fspool-scraper/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/fabrix-app","download_url":"https://codeload.github.com/fabrix-app/spool-scraper/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":241605820,"owners_count":19989612,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cheerio","crawler","fabrix","nodejs","scraping","spools","typescript","webscraper"],"created_at":"2024-11-14T01:17:38.659Z","updated_at":"2026-05-09T06:35:02.490Z","avatar_url":"https://github.com/fabrix-app.png","language":"TypeScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# spool-scraper\n\n[![Gitter][gitter-image]][gitter-url]\n[![NPM version][npm-image]][npm-url]\n[![Build Status][ci-image]][ci-url]\n[![Test Coverage][coverage-image]][coverage-url]\n[![Dependency Status][daviddm-image]][daviddm-url]\n[![Follow @FabrixApp on Twitter][twitter-image]][twitter-url]\n\n:package: Scraper Spool\n\nA Spool to make Scraping the web super easy by implementing [Crawler](https://www.npmjs.com/package/crawler).\n\n## Install\n```sh\n$ npm install --save @fabrix/spool-scraper\n```\n\n## Configure\n\n```js\n// config/main.ts\nimport { ScraperSpool } from '@fabrix/spool-scraper'\nexport const main = {\n  spools: [\n    // ... 
other spools\n    ScraperSpool\n  ]\n}\n```\n\n## Configuration\n\n```ts\n// config/scraper.ts\nexport const scraper = {\n  max_connections: 10,\n  rate_limit: 1000,\n  encoding: null,\n  jQuery: true,\n  force_UTF8: true,\n  retries: 3,\n  retry_timeout: 10000,\n  incoming_encoding: null,\n  skip_duplicates: false,\n  // Boolean. If true, user_agent should be an array and is rotated (default: false)\n  rotate_UA: false,\n  // String|Array. If rotate_UA is false but user_agent is an array, crawler will use the first one.\n  user_agent: [],\n  // String. If truthy, sets the HTTP referer header\n  referer: null,\n  // Object. Raw key-value pairs of HTTP headers\n  headers: null,\n  pre_request: (opts, done) =\u003e {\n    // 'opts' here is not the 'options' you pass to 'c.queue';\n    // instead, these are the options that will be passed to the 'request' module\n    console.log(opts)\n    // when done is called, the request will start\n    done()\n  }\n}\n```\n\nFor more information about the available options, please see the scraper documentation.\n\n## Usage\nFor the best results, create a Scrape class and override the default `process` method. 
\n```ts\n  import { Scrape } from '@fabrix/spool-scraper'\n  \n  export class AmazonScrape extends Scrape {\n    process(res): Promise\u003cany\u003e {\n      const $ = res.$\n      const amazon = $('.nav-logo-base').text()\n      return Promise.resolve(amazon)\n    }\n  }\n```\n\nThen you can either queue your scrape or scrape directly \n```js\n// Return a result immediately \u003csee config for options\u003e\nconst direct = this.app.scrapes.AmazonScrape.direct('https://amazon.com', options, preRequest)\n\n// Add this to the queue \u003csee config for options\u003e\nthis.app.scrapes.AmazonScrape.queue('https://amazon.com', options, preRequest)\n```\n\n[npm-image]: https://img.shields.io/npm/v/@fabrix/spool-scraper.svg?style=flat-square\n[npm-url]: https://npmjs.org/package/@fabrix/spool-scraper\n[ci-image]: https://img.shields.io/circleci/project/github/fabrix-app/spool-scraper/master.svg\n[ci-url]: https://circleci.com/gh/fabrix-app/spool-scraper/tree/master\n[daviddm-image]: http://img.shields.io/david/fabrix-app/spool-scraper.svg?style=flat-square\n[daviddm-url]: https://david-dm.org/fabrix-app/spool-scraper\n[gitter-image]: http://img.shields.io/badge/+%20GITTER-JOIN%20CHAT%20%E2%86%92-1DCE73.svg?style=flat-square\n[gitter-url]: https://gitter.im/fabrix-app/fabrix\n[twitter-image]: https://img.shields.io/twitter/follow/FabrixApp.svg?style=social\n[twitter-url]: https://twitter.com/FabrixApp\n[coverage-image]: https://img.shields.io/codeclimate/coverage/github/fabrix-app/spool-scraper.svg?style=flat-square\n[coverage-url]: https://codeclimate.com/github/fabrix-app/spool-scraper/coverage\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffabrix-app%2Fspool-scraper","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffabrix-app%2Fspool-scraper","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffabrix-app%2Fspool-scraper/lists"}