{"id":26951538,"url":"https://github.com/xcrap-cloud/puppeteer-client","last_synced_at":"2025-06-10T17:09:50.529Z","repository":{"id":285783137,"uuid":"959322152","full_name":"Xcrap-Cloud/puppeteer-client","owner":"Xcrap-Cloud","description":"Xcrap Puppeteer Client is a package of the Xcrap framework that implements an HTTP client using the Puppeteer library.","archived":false,"fork":false,"pushed_at":"2025-04-08T19:28:44.000Z","size":78,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"master","last_synced_at":"2025-04-09T15:45:37.411Z","etag":null,"topics":["client","http","javascript","nodejs","pupteteer","scraping","typescript","web","xcrap"],"latest_commit_sha":null,"homepage":"https://www.npmjs.com/package/@xcrap/puppeteer-client","language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Xcrap-Cloud.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-04-02T15:51:00.000Z","updated_at":"2025-04-08T19:28:48.000Z","dependencies_parsed_at":"2025-04-09T15:16:50.227Z","dependency_job_id":"6380126b-87a7-4057-a46e-269cc1bb3168","html_url":"https://github.com/Xcrap-Cloud/puppeteer-client","commit_stats":null,"previous_names":["xcrap-cloud/puppeteer-client"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Xcrap-Cloud%2Fpuppeteer-client","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Xcrap-Cloud%2Fpuppeteer-client/tags","releases_url":"https://repos.ecosyste.ms/api/
v1/hosts/GitHub/repositories/Xcrap-Cloud%2Fpuppeteer-client/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Xcrap-Cloud%2Fpuppeteer-client/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Xcrap-Cloud","download_url":"https://codeload.github.com/Xcrap-Cloud/puppeteer-client/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Xcrap-Cloud%2Fpuppeteer-client/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":259114567,"owners_count":22807252,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["client","http","javascript","nodejs","pupteteer","scraping","typescript","web","xcrap"],"created_at":"2025-04-03T00:16:23.954Z","updated_at":"2025-06-10T17:09:50.505Z","avatar_url":"https://github.com/Xcrap-Cloud.png","language":"TypeScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# 🕷️ Xcrap Puppeteer Client\n\n**Xcrap Puppeteer Client** is a package of the Xcrap framework that implements an HTTP client using the [Puppeteer](https://www.npmjs.com/package/puppeteer) library.\n\n## 📦 Installation\n\nInstallation is straightforward: use your favorite dependency manager. 
Here is an example using NPM:\n\n```cmd\nnpm i @xcrap/puppeteer-client @xcrap/core @xcrap/parser\n```\n\n\u003e You also need to install `@xcrap/parser` and `@xcrap/core` because they are declared as `peerDependencies`: the package requires them, but it uses the versions installed in your project.\n\n## 🚀 Usage\n\nLike any HTTP client, `PuppteerClient` has two methods: `fetch()` to request a single URL and `fetchMany()` to request multiple URLs at once, with control over concurrency and delays between requests.\n\n### Example usage\n\n```ts\nimport { PuppteerClient } from \"@xcrap/puppeteer-client\"\nimport { extract } from \"@xcrap/parser\"\n\n;(async() =\u003e {\n    const client = new PuppteerClient()\n    const url = \"https://example.com\"\n    const response = await client.fetch({ url: url })\n    const parser = response.asHtmlParser()\n    const pageTitle = await parser.parseFirst({ query: \"title\", extractor: extract(\"innerText\") })\n\n    console.log(\"Page Title:\", pageTitle)\n})();\n```\n\n### Using Actions\n\nIf you want to perform operations on the page before or after requests, you can use the `actions` property, which is an array of functions. Actions are flexible enough that you can do exactly what you would normally do with Puppeteer: log in, click buttons, evaluate functions, etc.\n\n```ts\nconst response = await client.fetch({\n    url: \"https://example.com\",\n    actions: [\n        async (page) =\u003e {\n            await page.type(\"#username\", \"user\")\n            await page.type(\"#password\", \"mypassword123\")\n            await page.click(\"#submit\")\n        }\n    ]\n})\n```\n\nBy default, an action is executed after the request. 
To control when it is executed, pass an object instead of a plain function:\n\n```ts\nconst response = await client.fetch({\n    url: \"https://example.com\",\n    actions: [\n        {\n            type: \"afterRequest\", // Executed after the request\n            exec: async (page) =\u003e {\n                await page.type(\"#username\", \"user\")\n                await page.type(\"#password\", \"mypassword123\")\n                await page.click(\"#submit\")\n            }\n        },\n        {\n            type: \"beforeRequest\", // Executed before the request\n            func: async (page) =\u003e {\n                const width = 1920 + Math.floor(Math.random() * 100)\n                const height = 3000 + Math.floor(Math.random() * 100)\n\n                await page.setViewport({\n                    width: width,\n                    height: height,\n                    deviceScaleFactor: 1,\n                    hasTouch: false,\n                    isLandscape: false,\n                    isMobile: false,\n                })\n            }\n        }\n    ]\n})\n```\n\n### Adding a proxy\n\nIn an HTTP client that extends `BaseClient`, we can set a proxy in the constructor in one of two ways:\n\n1. **Providing a `proxy` string:**\n\n```ts\nconst client = new PuppteerClient({ proxy: \"http://47.251.122.81:8888\" })\n```\n\n2. **Providing a function that will generate a `proxy`:**\n\n```ts\nfunction randomProxy() {\n    const proxies = [\n        \"http://47.251.122.81:8888\",\n        \"http://159.203.61.169:3128\"\n    ]\n\n    const randomIndex = Math.floor(Math.random() * proxies.length)\n\n    return proxies[randomIndex]\n}\n\nconst client = new PuppteerClient({ proxy: randomProxy })\n```\n\n### Using a custom User Agent\n\nIn a client that extends `BaseClient`, we can also customize the `User-Agent` of the requests. We can do this in two ways:\n\n1. 
**By providing a `userAgent` string:**\n\n```ts\nconst client = new PuppteerClient({ userAgent: \"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/134.0.0.0 Safari/537.36\" })\n```\n\n2. **By providing a function that will generate a `userAgent`:**\n\n```ts\nfunction randomUserAgent() {\n    const userAgents = [\n        \"Mozilla/5.0 (iPhone; CPU iPhone OS 9_8_4; like Mac OS X) AppleWebKit/603.37 (KHTML, like Gecko) Chrome/54.0.1244.188 Mobile Safari/601.5\",\n        \"Mozilla/5.0 (Windows NT 10.3;; en-US) AppleWebKit/537.35 (KHTML, like Gecko) Chrome/47.0.1707.185 Safari/601\"\n    ]\n\n    const randomIndex = Math.floor(Math.random() * userAgents.length)\n\n    return userAgents[randomIndex]\n}\n\nconst client = new PuppteerClient({ userAgent: randomUserAgent })\n```\n\n### Using a custom Proxy URL\n\nIn a client that extends `BaseClient`, we can also use proxy URLs. I can't explain in detail how they work, but I discovered this kind of proxy while trying to work around the CORS problem for requests made on the client side, which is how I came across the *CORS Proxy*. Here is a [template](https://gist.github.com/marcuth/9fbd321b011da44d1287faae31a8dd3a) for Cloudflare Workers in case you want to roll your own.\n\nWe can set it the same way we did with `userAgent`:\n\n1. **Providing a `proxyUrl` string:**\n\n```ts\nconst client = new PuppteerClient({ proxyUrl: \"https://my-proxy-app.my-username.workers.dev\" })\n```\n\n2. **Providing a function that will generate a `proxyUrl`:**\n\n```ts\nfunction randomProxyUrl() {\n    const proxyUrls = [\n        \"https://my-proxy-app.my-username-1.workers.dev\",\n        \"https://my-proxy-app.my-username-2.workers.dev\"\n    ]\n\n    const randomIndex = Math.floor(Math.random() * proxyUrls.length)\n\n    return proxyUrls[randomIndex]\n}\n\nconst client = new PuppteerClient({ proxyUrl: randomProxyUrl })\n```\n\n## 🤝 Contributing\n\nWant to contribute? 
Follow these steps:\n- Fork the repository.\n- Create a new branch (`git checkout -b feature-new`).\n- Commit your changes (`git commit -m 'Add new feature'`).\n- Push to the branch (`git push origin feature-new`).\n- Open a Pull Request.\n\n## 📝 License\n\nThis project is licensed under the MIT License.","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fxcrap-cloud%2Fpuppeteer-client","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fxcrap-cloud%2Fpuppeteer-client","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fxcrap-cloud%2Fpuppeteer-client/lists"}