{"id":26777832,"url":"https://github.com/xcrap-cloud/core","last_synced_at":"2026-02-28T23:01:30.478Z","repository":{"id":284788174,"uuid":"956070982","full_name":"Xcrap-Cloud/core","owner":"Xcrap-Cloud","description":"Xcrap core is the main package of the Xcrap framework. It contains the base HTTP client, the interface that an HTTP client must implement, and an HttpResponse object that, together with `@xcrap/parser`, is used to parse and extract data from the response content.","archived":false,"fork":false,"pushed_at":"2025-03-27T17:19:31.000Z","size":0,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"master","last_synced_at":"2025-03-27T17:40:54.344Z","etag":null,"topics":["html","json","parser","scrapy","xcrap"],"latest_commit_sha":null,"homepage":"","language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Xcrap-Cloud.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-03-27T16:45:17.000Z","updated_at":"2025-03-27T17:27:36.000Z","dependencies_parsed_at":"2025-03-27T17:41:03.527Z","dependency_job_id":null,"html_url":"https://github.com/Xcrap-Cloud/core","commit_stats":null,"previous_names":["xcrap-cloud/core"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Xcrap-Cloud%2Fcore","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Xcrap-Cloud%2Fcore/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Xcrap-Cloud%2Fcore/rele
ases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Xcrap-Cloud%2Fcore/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Xcrap-Cloud","download_url":"https://codeload.github.com/Xcrap-Cloud/core/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246140588,"owners_count":20729802,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["html","json","parser","scrapy","xcrap"],"created_at":"2025-03-29T05:10:43.751Z","updated_at":"2026-02-28T23:01:30.470Z","avatar_url":"https://github.com/Xcrap-Cloud.png","language":"TypeScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# 🕷️ Xcrap Core\n\n**Xcrap Core** is the package that includes some essential items of the **Xcrap** Web Scraping framework, such as:\n\n- `ClientInterface` interface to help you understand how to implement an HTTP client;\n- `BaseClient` class to extend and create your own HTTP clients;\n- `HttpClient` class, which is an HTTP client implementation using Node Core;\n- `FetchClient` class, which is an HTTP client implementation using the Fetch API;\n- `Randomizer` class to randomize values like UserAgents, Proxies, and Proxy URLs;\n- `Rotator` to rotate values such as UserAgents, Proxies, and Proxy URLs;\n- `StaticPaginator` to handle calculable pagination URLs;\n\n---\n## 📦 Installation\n\nFor installation, there are no secrets, just use your preferred dependency manager. 
Here's an example using NPM:\n\n```\nnpm i @xcrap/core @xcrap/parser\n```\n\n\u003e You need to install `@xcrap/parser` as well because I left it as a `peerDependency`, which means the `@xcrap/core` package needs `@xcrap/parser` as a dependency, but it will use whatever version the user has installed in their project.\n\n---\n\n## 🚀 Usage\n\n### Using the FetchClient\n\nThe `FetchClient` uses the native `fetch` API. It allows you to make HTTP requests using the modern `fetch` interface, suitable for environments where `undici` or native fetch is available.\n\n#### Example usage\n\n```ts\nimport { FetchClient } from \"@xcrap/core\"\n\n;(async () =\u003e {\n    const client = new FetchClient()\n    const response = await client.fetch({ url: \"https://example.com\" })\n    console.log(\"Status:\", response.status)\n    console.log(\"Body:\", response.text)\n})();\n```\n\n### Using the HttpClient\n\nThe `HttpClient` is an implementation that uses Node Core underneath, meaning the `node:http` and `node:https` modules.\n\nLike any HTTP client, it has two methods: `fetch()` to make a request to a specific URL and `fetchMany()` to make requests to multiple URLs at once, with control over concurrency and delays between requests.\n\n#### Example usage\n\n```ts\nimport { HttpClient } from \"@xcrap/core\"\nimport { extract } from \"@xcrap/parser\"\n\n;(async () =\u003e {\n    const client = new HttpClient()\n    const url = \"https://example.com\"\n    const response = await client.fetch({ url: url })\n    const parser = response.asHtmlParser()\n    const pageTitle = await parser.parseFirst({ query: \"title\", extractor: extract(\"innerText\") })\n\n    console.log(\"Page Title:\", pageTitle)\n})();\n```\n\n#### Adding a Proxy\n\nIn an HTTP client that extends `BaseClient`, we can add a proxy in the constructor. This can be done in two ways:\n\n1. **Providing a `proxy` string:**\n\n```ts\nconst client = new HttpClient({ proxy: \"http://47.251.122.81:8888\" })\n```\n\n2. 
**Providing a function that will generate a `proxy`:**\n\n```ts\nfunction randomProxy() {\n    const proxies = [\n        \"http://47.251.122.81:8888\",\n        \"http://159.203.61.169:3128\"\n    ]\n    \n    const randomIndex = Math.floor(Math.random() * proxies.length)\n    \n    return proxies[randomIndex]\n}\n\nconst client = new HttpClient({ proxy: randomProxy })\n```\n\n#### Using a Custom User Agent\n\nIn a client that extends `BaseClient`, we can also customize the `User-Agent` of requests. This can be done in two ways:\n\n1. **Providing a `userAgent` string:**\n\n```ts\nconst client = new HttpClient({ userAgent: \"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/134.0.0.0 Safari/537.36\" })\n```\n\n2. **Providing a function that will generate a `userAgent`:**\n\n```ts\nfunction randomUserAgent() {\n    const userAgents = [\n        \"Mozilla/5.0 (iPhone; CPU iPhone OS 9_8_4; like Mac OS X) AppleWebKit/603.37 (KHTML, like Gecko)  Chrome/54.0.1244.188 Mobile Safari/601.5\",\n        \"Mozilla/5.0 (Windows NT 10.3;; en-US) AppleWebKit/537.35 (KHTML, like Gecko) Chrome/47.0.1707.185 Safari/601\"\n    ]\n    \n    const randomIndex = Math.floor(Math.random() * userAgents.length)\n    \n    return userAgents[randomIndex]\n}\n\nconst client = new HttpClient({ userAgent: randomUserAgent })\n```\n\n#### Using Custom Proxy URLs\n\nIn a client that extends `BaseClient`, we can also use proxy URLs. I discovered this type of proxy, the *CORS Proxy*, while trying to work around a CORS issue with requests made on the client side. Here’s a [template](https://gist.github.com/marcuth/9fbd321b011da44d1287faae31a8dd3a) for one using Cloudflare Workers, in case you want to deploy your own.\n\nThis works just like `userAgent` and can be done in two ways:\n\n1. 
**Providing a `proxyUrl` string:**\n\n```ts\nconst client = new HttpClient({ proxyUrl: \"https://my-proxy-app.my-username.workers.dev\" })\n```\n\n2. **Providing a function that will generate a `proxyUrl`:**\n\n```ts\nfunction randomProxyUrl() {\n    const proxyUrls = [\n        \"https://my-proxy-app.my-username-1.workers.dev\",\n        \"https://my-proxy-app.my-username-2.workers.dev\"\n    ]\n    \n    const randomIndex = Math.floor(Math.random() * proxyUrls.length)\n    \n    return proxyUrls[randomIndex]\n}\n\nconst client = new HttpClient({ proxyUrl: randomProxyUrl })\n```\n\n## 🤝 Contributing\n\nWant to contribute? Follow these steps:\n\n- Fork the repository.\n- Create a new branch (`git checkout -b feature-new`).\n- Commit your changes (`git commit -m 'Add new feature'`).\n- Push to the branch (`git push origin feature-new`).\n- Open a Pull Request.\n\n## 📝 License\n\nThis project is licensed under the MIT License.","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fxcrap-cloud%2Fcore","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fxcrap-cloud%2Fcore","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fxcrap-cloud%2Fcore/lists"}