{"id":22489735,"url":"https://github.com/capturr/scraper","last_synced_at":"2025-08-02T22:31:05.771Z","repository":{"id":57109339,"uuid":"399795670","full_name":"capturr/scraper","owner":"capturr","description":"All In One API to easily scrape data from any website, without worrying about captchas and bot detection mecanisms.","archived":true,"fork":false,"pushed_at":"2023-08-09T10:34:44.000Z","size":987,"stargazers_count":21,"open_issues_count":0,"forks_count":3,"subscribers_count":1,"default_branch":"main","last_synced_at":"2024-11-22T11:41:40.525Z","etag":null,"topics":["captcha","cheerio","crawler","crawling","data","declarative","extract","growth-hacking","html","javascript","json","jsonld","nodejs","recaptcha","scraper","scraping","spider","typescript","web","web-scraping"],"latest_commit_sha":null,"homepage":"http://scrapingapi.io","language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/capturr.png","metadata":{"files":{"readme":"readme.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2021-08-25T11:35:45.000Z","updated_at":"2024-09-12T11:27:30.000Z","dependencies_parsed_at":"2023-08-24T17:09:44.884Z","dependency_job_id":null,"html_url":"https://github.com/capturr/scraper","commit_stats":null,"previous_names":["dopamyn/scraper","datasaucer/scraper","scrapingapi/scraper"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/capturr%2Fscraper","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/capturr%2Fscraper/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/capturr%2Fscraper/releases","manifests_url":
"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/capturr%2Fscraper/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/capturr","download_url":"https://codeload.github.com/capturr/scraper/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":228370907,"owners_count":17909387,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["captcha","cheerio","crawler","crawling","data","declarative","extract","growth-hacking","html","javascript","json","jsonld","nodejs","recaptcha","scraper","scraping","spider","typescript","web","web-scraping"],"created_at":"2024-12-06T17:20:33.520Z","updated_at":"2024-12-06T17:22:59.465Z","avatar_url":"https://github.com/capturr.png","language":"TypeScript","readme":"--------------\n# This project has been moved to [datasaucer/api](https://github.com/datasaucer/api).\n------\n\u003cp align=\"center\"\u003e\n    \u003ca href=\"https://scrapingapi.io/?utm_source=github\u0026utm_medium=readme\u0026utm_campaign=logo\" target=\"_blank\"\u003e\n        \u003cimg src=\"media/logo_text.png\" alt=\"ScrapingAPI Logo\" /\u003e\n    \u003c/a\u003e\n\u003c/p\u003e\n\n\u003ch1 align=\"center\"\u003e\u003cb\u003e\u003cu\u003eOne\u003c/u\u003e powerful API to scrape \u003cu\u003eall\u003c/u\u003e the web\u003c/b\u003e\u003c/h1\u003e\n\n\u003cp align=\"center\"\u003e\n    Easily scrape data from any website, without worrying about captchas and bot detection mechanisms.\n\u003c/p\u003e\n\n\n\u003cdiv align=\"center\"\u003e\n 
\n![version](https://img.shields.io/github/package-json/v/scrapingapi/scraper)\n[![npm](https://img.shields.io/npm/dm/scrapingapi)](https://www.npmjs.com/package/scrapingapi)\n[![discord](https://img.shields.io/discord/956821594372714546?label=Discord)](https://discord.gg/m7KWXcBaBu)\n\u003cspan\u003e\u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;\u003c/span\u003e\n\u003ca target=\"_blank\" href=\"https://twitter.com/intent/tweet?url=https://bit.ly/3iAvAmP\u0026text=Easily%20#scrape%20data%20from%20any%20website,%20without%20worrying%20about%20#captchas%20and%20#bot%20detection%20mecanisms.\"\u003e\n    \u003cimg height=\"26px\" src=\"https://simplesharebuttons.com/images/somacro/twitter.png\"\n        alt=\"Tweet\"\u003e\u003c/a\u003e\n\u003ca target=\"_blank\" href=\"https://www.linkedin.com/shareArticle?mini=true\u0026url=https://bit.ly/3qC2IyV\"\u003e\n    \u003cimg height=\"26px\" src=\"https://simplesharebuttons.com/images/somacro/linkedin.png\"\n        alt=\"Share on LinkedIn\"\u003e\u003c/a\u003e\n\u003ca target=\"_blank\" href=\"https://news.ycombinator.com/submitlink?u=https://bit.ly/3qBkLFe\u0026t=Easily%20scrape%20data%20from%20any%20website%2C%20without%20taking%20care%20of%20captchas%20and%20bot%20detection%20mecanisms.\"\u003e\n    \u003cimg height=\"26px\" src=\"media/ycombinator.png\"\n        alt=\"Share on Hacker News\"\u003e\u003c/a\u003e\n\n\u003c/div\u003e\n\n\u003cp align=\"center\"\u003e\n    \u003ca href=\"https://scrapingapi.io/?utm_source=github\u0026utm_medium=readme\u0026utm_campaign=links\"\u003e\u003cb\u003eWebsite\u003c/b\u003e\u003c/a\u003e •\n    \u003ca href=\"https://discord.gg/m7KWXcBaBu\"\u003e\u003cb\u003eDiscord\u003c/b\u003e\u003c/a\u003e • \n    \u003ca href=\"https://github.com/scrapingapi/scraper/stargazers\"\u003e\u003cb\u003e⭐ Give a Star\u003c/b\u003e\u003c/a\u003e\n\u003c/p\u003e  \n\n\u003cp align=\"center\"\u003e\n    \u003ca target=\"_blank\" 
href=\"#simple-usage-example\"\u003e\n        \u003cimg src=\"media/sample_code.png\" alt=\"How ScrapingAPI works\" width=\"1000px\" /\u003e\n        \u003cimg src=\"media/sample_result.png\" alt=\"How ScrapingAPI works\" width=\"1000px\" /\u003e\n    \u003c/a\u003e\n\u003c/p\u003e\n\n## Features\n\n* No captcha, no bot detection. Websites will see you as a human.\n* Integrated [**data extraction**](#extractors):\n    * Easily extract data with CSS / jQuery-like selectors\n    * Use filters to get ultra-clean data: url, price, ...\n    * Iterate through items (ex: search results, products list, articles, ...)\n* **Bulk requests**: Up to 3 per call\n* Post JSON / form-encoded bodies\n* Set request device, headers and cookies\n* Returns **response body, headers, final URL \u0026 status code**\n* TypeScript typings\n\n-----------\n\n\u003cp align=\"center\"\u003e\n    Do you like this project? Please let me know,\n    \u003ca target=\"_blank\" href=\"https://github.com/scrapingapi/scraper/stargazers\"\u003e\u003cb\u003e⭐ Give it a Star :)\u003c/b\u003e\u003c/a\u003e\n\u003c/p\u003e\n\n------------\n\n## Get started in 5 minutes\n\n1. **Install** the package\n    ```console\n    npm install --save scrapingapi\n    ```\n    If you prefer Yarn:\n    ```console\n    yarn add scrapingapi\n    ```\n\n2. Create your free [API Key](https://scrapingapi.io/?utm_source=github\u0026utm_medium=readme\u0026utm_campaign=getstarted)\n\n3. 
Make your first request (example below 👇)\n\n## Simple Usage Example\n\nHere is an example of scraping the **current Bitcoin price + search results** from Google Search.\n\n```javascript\nimport Scraper, { $ } from 'scrapingapi';\nconst page = new Scraper('API_KEY');\n\n// Scrape Google search results for \"bitcoin\"\npage.get(\"https://www.google.com/search?q=bitcoin\", { device: \"desktop\" }, {\n    // Extract the current bitcoin price\n    price: $(\"#search .obcontainer .card-section \u003e div:eq(1)\").filter(\"price\"),\n    // For each Google search result\n    results: $(\"h2:contains('Web results') + div\").each({\n        // We retrieve the URL\n        url: $(\"a[href]\").attr(\"href\").filter(\"url\"),\n        // ... And the title text\n        title: $(\"h3\")\n    })\n}).then( data =\u003e {\n\n    console.log(\"Here are the results:\", data);\n\n});\n```\n\nThe `Scraper.get` method sends a **GET request** to the provided URL, and automatically extracts the data you asked for: the price and the results.\n\n![Google Search Example](media/google-dom.jpg \"Google Search Example\")\n\nIn the data parameter, you will get a [TScrapeResult](src/types.ts#L107) object containing the scraping results.\n\n```json\n{\n    \"url\": \"https://www.google.com/search?q=bitcoin\",\n    \"status\": 200,\n    \"time\": 2.930,\n    \"bandwidth\": 26.33,\n    \"data\": {\n        \"price\": {\n            \"amount\": 49805.02,\n            \"currency\": \"EUR\"\n        },\n        \"results\": [{\n            \"url\": \"https://bitcoin.org/\",\n            \"title\": \"Bitcoin - Open source P2P money\"\n        }, {\n            \"url\": \"https://coinmarketcap.com/currencies/bitcoin/\",\n            \"title\": \"Bitcoin price today, BTC to USD live, marketcap and chart\"\n        }, {\n            \"url\": \"https://www.bitcoin.com/\",\n            \"title\": \"Bitcoin.com | Buy BTC, ETH \u0026 BCH | Wallet, news, markets ...\"\n        }, {\n            
\"url\": \"https://en.wikipedia.org/wiki/Bitcoin\",\n            \"title\": \"Bitcoin - Wikipedia\"\n        }]\n    }\n}\n```\n\n### Use TypeScript\n\nTake advantage of the power of TypeScript by typing your response data:\n\n```typescript\nimport Scraper, { $, TExtractedPrice } from 'scrapingapi';\nconst page = new Scraper('API_KEY');\n\ntype BitcoinGoogleResults = {\n    // Metadata generated by the price filter\n    price: TExtractedPrice,\n    // An array containing an information object for each Google search result\n    results: {\n        url: string,\n        title: string\n    }[]\n}\n\npage.get\u003cBitcoinGoogleResults\u003e(\"https://www.google.com/search?q=bitcoin\").then( ... );\n```\n\n-----------\n\n\u003cp align=\"center\"\u003e\n    Do you like this project? Please let me know,\n    \u003ca target=\"_blank\" href=\"https://github.com/scrapingapi/scraper/stargazers\"\u003e\u003cb\u003e⭐ Give it a Star :)\u003c/b\u003e\u003c/a\u003e\n\u003c/p\u003e\n\n------------\n\n# Documentation / Guide\n\nLet's say we want to scrape an Amazon product page to retrieve the following info:\n\n* Product info\n    * Title\n    * Current price\n    * Image URL\n* Reviews\n    * Average rating\n    * List of reviews\n\nReady? Let's go step by step:\n\n1. [Make the **Request**](#request)\n    - [**Method**: GET, POST](#request-methods)\n    - [**Options**: device, cookies, body, withBody, withHeaders](#request-options)\n2. [**Extract** your data](#extractors)\n    - [Simple values](#value-extractor)\n    - [Filters and Validators](#item-extractor)\n    - [Optional values](#item-extractor)\n3. [**Iterate** through lists](#response)\n4. [Handle the **Response**](#response)\n5. [Another **Example**](#another-example)\n\n## 1. 
Make the Request\n\n### 1.1 Request Methods\n\nThis SDK provides one method per supported HTTP method:\n\n* GET: [See the definition](src/index.ts#66)\n    ```typescript \n    page.get( url, options, extractor );\n    ```\n* POST: [See the definition](src/index.ts#74)\n    ```typescript\n    page.post( url, body, bodyType, options, extractor );\n    ```\n* Bulk requests: \n    With the `scrape` method, you can also send up to **3 requests per call** if each of them points to a different domain name.\n    [See the definition](src/index.ts#38)\n    ```typescript\n    page.scrape( requests );\n    ```\n\n\u003cdetails\u003e\u003csummary\u003eShow Example\u003c/summary\u003e\n\u003cp\u003e\n\nFor our example, we only need to make a GET request.\n\n```typescript\npage.get( \"https://www.amazon.com/dp/B08L76BSZ5\", \u003coptions\u003e, \u003cextractors\u003e );\n```\n\n\u003c/p\u003e\n\u003c/details\u003e\n\n### 1.2 Request Options\n\n```typescript \npage.get( url, options, extractor );\n               ^^^^^^^\n```\n\nDepending on your needs, you can change some settings for your request:\n\n* **device** (string): Which user-agent you want to use for your request: `desktop`, `mobile` or `tablet`\n    ```json\n    { \"device\": \"mobile\" }\n    ```\n* **cookies** (string): The cookie string you want to pass to the request. Example:\n    ```json\n    { \"cookies\": \"sessionId=34; userId=87;\" }\n    ```\n* **withBody** (boolean): If you want to get the page HTML in the response. Default: `false`\n    ```json\n    { \"withBody\": true }\n    ```\n* **withHeaders** (boolean): If you want to retrieve the response headers. Default: `false`\n    ```json\n    { \"withHeaders\": true }\n    ```\n\nFor POST requests only:\n\n* **body** (object): The data to send in your POST request. 
Must be combined with bodyType.\n    ```json\n    { \"body\": { \"name\": \"bob\", \"age\": 25 } }\n    ```\n* **bodyType** (string): The format in which to POST your data: `form` or `json`\n    ```json\n    { \"bodyType\": \"form\" }\n    ```\n\n#### Practical Example\n\n\u003cdetails\u003e\u003csummary\u003eShow the Example\u003c/summary\u003e\n\u003cp\u003e\n\nHere, we will simulate a mobile device, because the mobile version of Amazon is easier to scrape given that there are fewer elements on the page. We will also retrieve the response headers.\n\n```typescript\npage.get(\"https://www.amazon.com/dp/B08L76BSZ5\", { device: 'mobile', withHeaders: true }, \u003cextractors\u003e);\n```\n\n\u003c/p\u003e\n\u003c/details\u003e\n\n## 2. Extract your data\n\nWe're now at the most interesting part: how to extract \u0026 filter values, and how to iterate through items.\n\n```typescript \npage.get( url, options, extractor );\n                        ^^^^^^^^^\n```\n\n### 2.1 Extract a value\n\nLet's start with the basics: extracting a single piece of information from the webpage.\n\nExtractors are simple JavaScript objects, where you can associate a `key` (the name of your data) with a `value selector`.\nThe following example will extract the text content of the element that matches the given selector:\n\n```typescript\n{\n    \u003ckey\u003e: $( \u003cselector\u003e )\n}\n```\n\nHere you have two elements:\n\n1. The **Key**: You can choose any name for the key, but it should not:\n\n* Start with a `$`\n* Be a reserved key: `select` is the one and only reserved key for the moment\n\n2. The **Selector** of the element which contains the information you want to extract. \n    To create a value selector, you will use the `$()` function. 
If you've already used jQuery, it should look a bit familiar :)\n    The argument you put in the `$()` function is a CSS-like / jQuery-like selector that matches the element whose value you want to extract.\n    \u003cdetails\u003e\u003csummary\u003eShow examples\u003c/summary\u003e\n    \u003cp\u003e\n\n    - `$(\"h3\")`: Simply matches all `h3` elements\n        - Matches: \n            ```html\n            \u003ch3\u003eThis is a title\u003c/h3\u003e\n            ```\n        - Does not match, because it's not an `h3` element:\n            ```html\n            \u003cp\u003eHello\u003c/p\u003e\n            ```\n    - `$(\"a.myLink[href]\")`: Matches `a` elements having the class `myLink`, and where the `href` attribute is defined\n        - Matches: \n            ```html\n            \u003ca class=\"myLink anotherclass\" href=\"https://scrapingapi.io\"\u003eLink Text\u003c/a\u003e\n            ```\n        - Does not match, because it doesn't contain the `myLink` class:\n            ```html\n            \u003ca class=\"thisClassIsAlone\" href=\"https://scrapingapi.io\"\u003eLink Text\u003c/a\u003e\n            ```\n    - `$(\"h2:contains('Scraping API') + div\")`: Matches `div` elements that are next to `h2` elements whose content is equal to `Scraping API`\n        - Matches: \n            ```html\n            \u003ch2\u003eScraping API\u003c/h2\u003e\n            \u003cdiv\u003eis cool\u003c/div\u003e\n            ```\n        - Does not match, because the `div` element is not next to the `h2` element:\n            ```html\n            \u003ch2\u003eScraping API\u003c/h2\u003e\n            \u003cp\u003eis maybe not\u003c/p\u003e\n            \u003cdiv\u003ewell configured\u003c/div\u003e\n            ```\n    Don't hesitate to go deeper by checking these references:\n    * [CSS selectors](https://www.w3schools.com/cssref/css_selectors.asp)\n    * [jQuery selectors](https://www.w3schools.com/jquery/jquery_ref_selectors.asp) \n\n    \u003c/p\u003e\n   
 \u003c/details\u003e \n\nBut instead of extracting the text content of the element, you can also extract the HTML content.\nFor that, simply use the `.html()` method:\n\n```typescript\n{\n    \u003ckey\u003e: $( \u003cselector\u003e ).html()\n                          ^^^^^^^\n}\n```\n\nIt's also possible to extract any other [HTML attribute](https://www.w3schools.com/tags/ref_attributes.asp): `href`, `class`, `src`, etc.\n\n```typescript\n{\n    \u003ckey\u003e: $( \u003cselector\u003e ).attr( \u003cattribute\u003e )\n                           ^^^^^^^^^^^^^^^^^^^\n}\n```\n\n#### Practical Example\n\n\u003cdetails\u003e\u003csummary\u003eShow the Example\u003c/summary\u003e\n\u003cp\u003e\n\nLet's start by extracting the product info:\n\n* Title\n* Current price\n* Image URL\n* Rating\n\n```typescript\npage.get(\"https://www.amazon.com/dp/B08L76BSZ5\", { device: 'mobile', withHeaders: true }, {\n\n    title: $(\"#title\"),\n    price: $(\"#corePrice_feature_div .a-offscreen:first\"),\n    image: $(\"#main-image\").attr(\"src\"),\n    reviews: {\n        rating: $(\".cr-widget-Acr [data-hook='average-stars-rating-text']\")\n    }\n\n});\n```\n\nPretty easy, isn't it? 🙂\nWith this code, you will get the following data:\n\n```json\n{\n    \"title\": \"sportbull Unisex 3D Printed Graphics Novelty Casual Short Sleeve T-Shirts Tees\",\n    \"price\": \"$9.99\",\n    \"image\": \"https://m.media-amazon.com/images/I/71c3pFtZywL._AC_AC_SY350_QL65_.jpg\",\n    \"reviews\": {\n        \"rating\": \"4.4 out of 5\"\n    }\n}\n```\n\nThat's cool, but here we have two problems:\n\n* The price is a string, and we need to parse it if we want to separate the price amount from the currency. \n    In a perfect world, we could simply write\n    ```typescript\n    const amount = parseFloat( data.price.substring(1) );\n    ``` \n    to get the amount.\n    Yes, but depending on many factors, the price format could vary: `9.99 USD`, `9.99 dollars incl. 
taxes`, `$9.99 USD free shipping`, etc.\n    In addition, what guarantees that the price element always contains a price? For some reason, we could get another random value.\n    We want to build something robust, so we need to solve this issue.\n* Same issue with the image URL: we need to filter and validate it to be sure we have a URL in the correct form.\n\nThat's a great transition to see how you can filter the data you've extracted.\n\n\n\u003c/p\u003e\n\u003c/details\u003e\n\n### 2.2 Filter the data\n\nTo ensure that the data we've extracted matches what we're expecting, we can specify filters for each selector:\n\n```typescript\n$( \u003cselector\u003e ).attr( \u003cattribute\u003e ).filter( \u003cfilter name\u003e )\n                                   ^^^^^^^^^^^^^^^^^^^^^^^^\n```\n\nFor the moment, we only support two filters:\n\n* **url**: Checks if the value is a URL. If the URL is relative, it will be transformed into an absolute URL.\n* **price**: Powered by the [price-extract](https://github.com/scrapingapi/price-extract) package, this filter ensures that the value expresses a price, autodetects the currency ISO code, and separates the amount from the currency.\n    It will give you an object with the price info:\n    ```json\n    { \"amount\": 9.99, \"currency\": \"USD\" }\n    ```\n\n💡 **If you'd like me to add another filter, please don't hesitate to share your proposal by [submitting an issue](https://github.com/scrapingapi/price-extract/issues).** Thank you!\n\n\u003cdetails\u003e\u003csummary\u003eShow the Example\u003c/summary\u003e\n\u003cp\u003e\n\nComing back to our Amazon example, we will simply add filters on the `price` and `image` data:\n\n```typescript\npage.get(\"https://www.amazon.com/dp/B08L76BSZ5\", { device: 'mobile', withHeaders: true }, {\n\n    title: $(\"#title\"),\n    price: $(\"#corePrice_feature_div .a-offscreen:first\").filter(\"price\"),\n                                                         
^^^^^^^^^^^^^^^^\n    image: $(\"#main-image\").attr(\"src\").filter(\"url\"),\n                                       ^^^^^^^^^^^^^^\n    reviews: {\n        rating: $(\".cr-widget-Acr [data-hook='average-stars-rating-text']\")\n    }\n});\n```\n\nBy running this code, you will get the following data:\n\n```json\n{\n    \"title\": \"sportbull Unisex 3D Printed Graphics Novelty Casual Short Sleeve T-Shirts Tees\",\n    \"price\": { \"amount\": 9.99, \"currency\": \"USD\" },\n    \"image\": \"https://m.media-amazon.com/images/I/71c3pFtZywL._AC_AC_SY350_QL65_.jpg\",\n    \"reviews\": {\n        \"rating\": \"4.4 out of 5\"\n    }\n}\n```\n\nWe get clean price data, and we're certain that `image` is a URL.\n\n\u003c/p\u003e\n\u003c/details\u003e\n\n### 2.3 Optional values\n\nBy default, all the values you select are required. That means we absolutely want each value to be present in the item; otherwise, the item will be excluded from the response.\n\nBut you can of course make a value optional:\n\n```typescript\n$( \u003cselector\u003e ).attr( \u003cattribute\u003e ).optional()\n                                   ^^^^^^^^^^^\n```\n\nWhen a value is optional and it has not been found on the scraped page, you will get an undefined value.\n\nHere are the reasons why a value might not be found:\n\n* The selector does not match any element on the page\n* The attribute you want to retrieve does not exist\n* The value is empty\n* The filter has rejected the value\n\n#### Practical Example\n\n\u003cdetails\u003e\u003csummary\u003eShow the Example\u003c/summary\u003e\n\u003cp\u003e\n\nLet's consider that the average rating is not necessarily present on the page.\nEven if we're not able to get this info, we still want to retrieve all the other values.\nSo, we have to make the `reviews.rating` data optional.\n\n```typescript\npage.get(\"https://www.amazon.com/dp/B08L76BSZ5\", { device: 'mobile', withHeaders: true }, {\n\n    title: $(\"#title\"),\n    price: 
$(\"#corePrice_feature_div .a-offscreen:first\").filter(\"price\"),\n    image: $(\"#main-image\").attr(\"src\").filter(\"url\"),\n    reviews: {\n        rating: $(\".cr-widget-Acr [data-hook='average-stars-rating-text']\").optional()\n                                                                           ^^^^^^^^^^^\n    }\n\n});\n```\n\nIf `reviews.rating` has not been found, you will get the following data:\n\n```json\n{\n    \"title\": \"sportbull Unisex 3D Printed Graphics Novelty Casual Short Sleeve T-Shirts Tees\",\n    \"price\": { \"amount\": 9.99, \"currency\": \"USD\" },\n    \"image\": \"https://m.media-amazon.com/images/I/71c3pFtZywL._AC_AC_SY350_QL65_.jpg\",\n    \"reviews\": {}\n}\n```\n\n\u003c/p\u003e\n\u003c/details\u003e\n\n### What's next?\n\nNow you know how to extract high-quality data from webpages.\nBut what if we want to extract lists, like search results, product lists, blog articles, etc.?\nThat's the next topic 👇\n\n## 3. Iterate through lists\n\nThe scrapingapi SDK allows you to extract every item that matches a selector. 
\nAgain, it's highly inspired by the jQuery API:\n\n```typescript\n$( \u003citems selector\u003e ).each( \u003cvalues\u003e );\n```\n\nFirst, you have to provide the **items selector**, which will match all the DOM elements you want to iterate over.\nThen, you specify the values you want to extract for each iterated element, as we've seen previously.\n\n💡 All the selectors you provide to extract the `values` will be executed inside the `items selector`.\n\n### The `this` selector\n\nIn the `values`, if the selector is `\"this\"`, it refers to the items selector.\n\nFor example:\n\n```typescript\n$(\"\u003e ul.tags \u003e li\").each({\n    text: $(\"this\")\n})\n```\n\nThe `text` value will be the text content of every `\u003e ul.tags \u003e li` element.\n\n### Practical Example\n\n\u003cdetails\u003e\u003csummary\u003eShow the Example\u003c/summary\u003e\n\u003cp\u003e\n\nWe can now extract every review item:\n\n```typescript\npage.get(\"https://www.amazon.com/dp/B08L76BSZ5\", { device: 'mobile', withHeaders: true }, {\n\n    title: $(\"#title\"),\n    price: $(\"#corePrice_feature_div .a-offscreen:first\").filter(\"price\"),\n    image: $(\"#main-image\").attr(\"src\").filter(\"url\"),\n    reviews: {\n        rating: $(\".cr-widget-Acr [data-hook='average-stars-rating-text']\").optional(),\n        list: $(\"#cm-cr-dp-aw-review-list \u003e [data-hook='mobley-review-content']\").each({\n            author: $(\".a-profile-name\"),\n            title: $(\"[data-hook='review-title']\")\n        })\n    }\n});\n```\n\nYou will get the following result:\n\n```json\n{\n    \"title\": \"sportbull Unisex 3D Printed Graphics Novelty Casual Short Sleeve T-Shirts Tees\",\n    \"price\": { \"amount\": 9.99, \"currency\": \"USD\" },\n    \"image\": \"https://m.media-amazon.com/images/I/71c3pFtZywL._AC_AC_SY350_QL65_.jpg\",\n    \"reviews\": {\n        \"rating\": \"4.4 out of 5\",\n        \"list\": [\n            { \"author\": \"Jon\", \"title\": \"Great 
shirt; very trippy\" },\n            { \"author\": \"LK\", \"title\": \"Birthday Gift for Bartender Son\" },\n            { \"author\": \"Yessica suazo\", \"title\": \"Decepcionada del producto que recibi\" },\n            { \"author\": \"Avid Reader\", \"title\": \"Worth it.\" },\n            { \"author\": \"Nancy K\", \"title\": \"Husband loves it!\" },\n            { \"author\": \"Mychelle\", \"title\": \"Gets Noticed\" },\n            { \"author\": \"Suzy M. Lewis\", \"title\": \"Wish I had bought a size up\" },\n            { \"author\": \"Devann Shultz\", \"title\": \"Great buy!\" }\n        ]\n    }\n}\n```\n\n\u003c/p\u003e\n\u003c/details\u003e\n\n## 4. Handle the Response\n\nEvery time you launch a request, you will receive a response following this format:\n\n```typescript\ntype Response = {\n    // The final URL after all the redirections\n    url: string,\n    // The scraped page status code\n    status: number,\n    // The scraped page headers (must provide the withHeaders option)\n    headers?: { [key: string]: string },\n    // The page HTML (when the withBody option is true)\n    body?: string,\n    // The extracted data, if you provided extractors\n    data?: object;\n    // The time, in seconds, your request took to be resolved from our server\n    // The communication delays between your app and our servers are ignored\n    time: number,\n    // The used bandwidth, in kb\n    bandwidth: number\n}\n```\n\n### Optimize the response time\n\nEvery option uses additional CPU resources, slows down communication inside scrapingapi's network, and increases your response size.\n\nThat's why it's better to use as few options as possible to make your responses faster.\n\n-----------\n\n\u003cp align=\"center\"\u003e\n    Do you like this project? 
Please let me know,\n    \u003ca target=\"_blank\" href=\"https://github.com/scrapingapi/scraper/stargazers\"\u003e\u003cb\u003e⭐ Give it a Star :)\u003c/b\u003e\u003c/a\u003e\n\u003c/p\u003e\n\n------------\n\n## Another example\n\n![Space cat](https://wallpapercave.com/wp/wp4014371.jpg)\n\nConsider that `http://example.com/products` responds with a webpage containing the following HTML:\n\n```html\n\u003ch2\u003eSpace Cat Holograms to motivate you while programming\u003c/h2\u003e\n\u003cp\u003eFree shipping to all the Milky Way.\u003c/p\u003e\n\u003csection id=\"products\"\u003e\n\n    \u003carticle class=\"product\"\u003e\n        \u003cimg src=\"https://wallpapercave.com/wp/wp4014371.jpg\" /\u003e\n        \u003ch3\u003eSandwich cat lost on a burger rocket\u003c/h3\u003e\n        \u003cstrong class=\"red price\"\u003e123.45 $\u003c/strong\u003e\n        \u003cul class=\"tags\"\u003e\n            \u003cli\u003esandwich\u003c/li\u003e\n            \u003cli\u003eburger\u003c/li\u003e\n            \u003cli\u003erocket\u003c/li\u003e\n        \u003c/ul\u003e\n    \u003c/article\u003e\n\n    \u003carticle class=\"product\"\u003e\n        \u003cimg src=\"https://wallpapercave.com/wp/wp4575175.jpg\" /\u003e\n        \u003ch3\u003eAliens can't sleep because of this cute DJ\u003c/h3\u003e\n        \u003cul class=\"tags\"\u003e\n            \u003cli\u003ealiens\u003c/li\u003e\n            \u003cli\u003esleep\u003c/li\u003e\n            \u003cli\u003ecute\u003c/li\u003e\n            \u003cli\u003edj\u003c/li\u003e\n            \u003cli\u003emusic\u003c/li\u003e\n        \u003c/ul\u003e\n    \u003c/article\u003e\n\n    \u003carticle class=\"product\"\u003e\n        \u003cimg src=\"https://wallpapercave.com/wp/wp4575192.jpg\" /\u003e\n        \u003ch3\u003eTravelling at the speed of light with a radioactive spaceship\u003c/h3\u003e\n        \u003cp class=\"details\"\u003e\n            Warning: Contains Plutonium.\n        \u003c/p\u003e\n        
\u003cstrong class=\"red price\"\u003e456.78 $\u003c/strong\u003e\n        \u003cul class=\"tags\"\u003e\n            \u003cli\u003epizza\u003c/li\u003e\n            \u003cli\u003eslice\u003c/li\u003e\n            \u003cli\u003espaceship\u003c/li\u003e\n        \u003c/ul\u003e\n    \u003c/article\u003e\n\n    \u003carticle class=\"product\"\u003e\n        \u003cimg src=\"https://wallpapercave.com/wp/wp4575163.jpg\" /\u003e\n        \u003ch3\u003eGentleman dropped his litter into a black hole\u003c/h3\u003e\n        \u003cp class=\"details\"\u003e\n            Since he found this calm planet.\n        \u003c/p\u003e\n        \u003cstrong class=\"red price\"\u003eundefined\u003c/strong\u003e\n        \u003cul class=\"tags\"\u003e\n            \u003cli\u003eluxury\u003c/li\u003e\n            \u003cli\u003elitter\u003c/li\u003e\n        \u003c/ul\u003e\n    \u003c/article\u003e\n\u003c/section\u003e\n```\n\nLet's extract the product list: \n\n```typescript\ntype Product = {\n    name: string,\n    image: string,\n    price: { amount: number, currency: string },\n    tags: { text: string }[],\n    description?: string\n}\n\nscraper.get\u003cProduct[]\u003e(\"http://example.com/products\", {}, $(\"#products \u003e article.product\").each({\n\n    name: $(\"\u003e h3\"),\n    image: $(\"\u003e img\").attr(\"src\").filter(\"url\"),\n    price: $(\"\u003e .price\").filter(\"price\"),\n    tags: $(\"\u003e ul.tags \u003e li\").each({\n        text: $(\"this\")\n    }),\n    description: $(\"\u003e .details\").optional()\n\n}));\n```\n\nHere is the response:\n\n```json\n{\n    \"url\": \"http://example.com/products\",\n    \"status\": 200,\n    \"data\": [{\n        \"name\": \"Sandwich cat lost on a burger rocket\",\n        \"image\": \"https://wallpapercave.com/wp/wp4014371.jpg\",\n        \"price\": { \"amount\": 123.45, \"currency\": \"USD\" },\n        \"tags\": [\n            { \"text\": \"sandwich\" },\n            { \"text\": \"burger\" },\n            { \"text\": 
\"rocket\" }\n        ]\n    },{\n        \"name\": \"Travelling at the speed of light with a radioactive spaceship\",\n        \"image\": \"https://wallpapercave.com/wp/wp4575192.jpg\",\n        \"price\": { \"amount\": 456.78, \"currency\": \"USD\" },\n        \"tags\": [\n            { \"text\": \"pizza\" },\n            { \"text\": \"slice\" },\n            { \"text\": \"spaceship\" }\n        ]\n    }]\n}\n```\n\nDid you notice? In the request, the `price` data was marked as required, but for two of the HTML elements we iterated with `each`, **the extractor wasn't able to extract the price**.\n\n* `Aliens can't sleep because of this cute DJ` doesn't contain any element matching `\u003e .price`\n* `Gentleman dropped his litter into a black hole` contains a `.price` element, but its text content doesn't represent a price\n\n-----------\n\n# About\n\n## Need any additional information or help?\n\n* Check whether a related issue [has already been created](https://github.com/scrapingapi/scraper/issues)\n* If not, feel free to [create a new issue](https://github.com/scrapingapi/scraper/issues/new)\n* For more personal questions, or for professional inquiries: \n    \u003cdetails\u003e\n    \u003csummary\u003eSend me an email\u003c/summary\u003e\n\n    `contact@gaetan-legac.fr`\n    \u003c/details\u003e\n\n## Credits\n\nFor any complaints about abused kittens that have been sent into deep space, see [WallpaperCave](https://wallpapercave.com/space-cat-wallpapers).\n","funding_links":[],"categories":["TypeScript"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcapturr%2Fscraper","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcapturr%2Fscraper","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcapturr%2Fscraper/lists"}