{"id":29463493,"url":"https://github.com/lukluk/scraper-engine","last_synced_at":"2025-07-14T05:35:35.738Z","repository":{"id":17313113,"uuid":"20083864","full_name":"lukluk/scraper-engine","owner":"lukluk","description":"Async Scraper Framework Based Nodejs","archived":false,"fork":false,"pushed_at":"2016-05-25T11:53:15.000Z","size":2985,"stargazers_count":11,"open_issues_count":2,"forks_count":7,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-04-05T20:51:14.986Z","etag":null,"topics":["scraper-engine","scraping"],"latest_commit_sha":null,"homepage":"http://scraper-engine.com","language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/lukluk.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2014-05-23T02:12:21.000Z","updated_at":"2023-07-29T00:13:20.000Z","dependencies_parsed_at":"2022-09-02T20:51:25.691Z","dependency_job_id":null,"html_url":"https://github.com/lukluk/scraper-engine","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/lukluk/scraper-engine","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lukluk%2Fscraper-engine","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lukluk%2Fscraper-engine/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lukluk%2Fscraper-engine/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lukluk%2Fscraper-engine/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/lukluk","download_url":"https://codeload.github.com/lukluk/scraper-engine/tar.gz/refs/heads/mas
ter","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lukluk%2Fscraper-engine/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":265246016,"owners_count":23734109,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["scraper-engine","scraping"],"created_at":"2025-07-14T05:35:35.178Z","updated_at":"2025-07-14T05:35:35.726Z","avatar_url":"https://github.com/lukluk.png","language":"JavaScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"#Scraper engine\n\u003eA complete solution for building a scraper API service, easy to use\n\nversion 8.0.0 \n\noutput.json and output.csv\n\n#tutorial\nhttps://youtu.be/j_PJkSVx7n4\n\n\nhttps://www.youtube.com/watch?v=HHP2NyEJq4w\n\n#How to install\n\n```\nnpm install scraper-engine\n```\nCreate app.js:\n```\nvar port=4000;\nrequire('scraper-engine').start(__dirname,port);\n```\n\n```\n$ node app.js\nScraper Engine Started (port 4000)...\n```\nThen open your browser:\n\nhttp://localhost:4000/output.json?site=**controller**\n\n## example controller\n\nhttp://localhost:4000/output.json?site=olx\nor \nhttp://localhost:4000/output.csv?site=olx\n\nExample: Walmart Category controller\n\n```\nvar S = require('string');\nexports.scraper = {\n    name: 'OLX',\n    url: function (index) {\n        return \"http://www.walmart.com/browse/toys/action-figures/4171_4172_133130?page=\"+index+\"\u0026cat_id=4171_4172_133130\"\n    },\n\tnext:function($,currentindex){\n\t\tif(currentindex\u003e=5){\n\t\t\treturn false\n\t\t}else{\n\t\t\treturn true\n\t\t}\n\t},\n    rows: function ($) 
{\n        return $('.tile-grid-unit-wrapper');\n    },\n    fields: {\n        title: function ($) {\n            return S($.find('.tile-heading').text()).trim().s;\n        },\n        price: function ($) {\n            return S($.find('.tile-price').text().replace('$','')).trim().s;\n        },\n        image: function ($) {\n            return $.find('.product-image').attr('src');\n        },\n        urlproduct: function ($) {\n            return $.find('.js-product-title').attr('href');\n        }\n \n \n    }\n \n}\n```\n\nExample: Olx Controller\n```\nvar S = require('string');\nexports.scraper = {\n    name: 'OLX',\n    url: function () {\n        return \"http://olx.co.id/all-results/q-batu-bacan/\"\n    },\n    rows: function ($) {\n        return $('.offer');\n    },\n    fields: {\n        title: function ($) {\n            return S($.find('.link.linkWithHash').text()).trim().s;\n        },\n        price: function ($) {\n            return S($.find('.price').text()).trim().s;\n        },\n        image: function ($) {\n            return $.find('.linkWithHash img').attr('src');\n        }\n\n\n    }\n\n}\n```\n#using request parameter\nExample: Olx Controller part 2 \n```\nvar S = require('string');\nvar keyword=\"\";\nexports.scraper = {\n    name: 'OLX-pass-url',\n    url: function () {\n        return \"http://olx.co.id/all-results/q-\"+keyword+\"/\"\n    },\n    setup:function(req){\n        keyword=req.query.keyword\n    },\n    rows: function ($) {\n        return $('.offer');\n    },\n    fields: {\n        title: function ($) {\n            return S($.find('.link.linkWithHash').text()).trim().s;\n        },\n        price: function ($) {\n            return S($.find('.price').text()).trim().s;\n        },\n        image: function ($) {\n            return $.find('.linkWithHash img').attr('src');\n        }\n\n\n    }\n\n}\n```\n\n#scraping all pages detail\nExample: Olx Controller part 3\n```\nvar S = require('string');\nvar 
keyword=\"\";\nexports.scraper = {\n    name: 'OLX-all-pages',\n    url: function () {\n        return \"http://olx.co.id/all-results/q-\"+keyword+\"/\"\n    },\n    setup:function(req){\n        keyword=req.query.keyword\n    },\n    list: function ($) {\n        var urls=[];\n        $('.offer').each(function(){\n            urls.push($(this).find('.link.linkWithHash').attr('href'))\n        })\n        return urls;\n    },\n    fields: {\n        title: function ($) {\n            return $.find('.offerheadinner h1').text()\n        },\n        price: function ($) {\n            return $.find('.pricelabel strong').text()\n        },\n        seller: function ($) {\n            return $.find('.userdetails .brkword').text();\n        }\n\n\n    }\n\n}\n```\n\n\n#Author\n##luklukaha@gmail.com\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flukluk%2Fscraper-engine","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flukluk%2Fscraper-engine","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flukluk%2Fscraper-engine/lists"}