{"id":21455759,"url":"https://github.com/luminati-io/amazon-scraper","last_synced_at":"2025-04-22T12:50:17.007Z","repository":{"id":261020975,"uuid":"882620696","full_name":"luminati-io/Amazon-scraper","owner":"luminati-io","description":"Extract Amazon data with the #1 Amazon Scraper API, including search results, product details, offers, reviews, Q\u0026A, bestsellers, and seller information. Start your free trial now!","archived":false,"fork":false,"pushed_at":"2024-11-06T13:12:03.000Z","size":7449,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-03-29T15:22:37.328Z","etag":null,"topics":["amazon","amazon-api","amazon-data","amazon-dataset","amazon-product-scraper","amazon-reviews","amazon-scraper","amazon-scraping","datasets","e-commerce-scraper","price-scraper","python","scraping-amazon","web-scraper","web-scraping"],"latest_commit_sha":null,"homepage":"https://brightdata.com/products/web-scraper/amazon","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/luminati-io.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-11-03T10:01:15.000Z","updated_at":"2024-11-07T12:48:43.000Z","dependencies_parsed_at":"2025-01-23T13:13:14.441Z","dependency_job_id":null,"html_url":"https://github.com/luminati-io/Amazon-scraper","commit_stats":null,"previous_names":["luminati-io/amazon-scraper"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/luminati-io%2FAmazon-scraper
","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/luminati-io%2FAmazon-scraper/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/luminati-io%2FAmazon-scraper/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/luminati-io%2FAmazon-scraper/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/luminati-io","download_url":"https://codeload.github.com/luminati-io/Amazon-scraper/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":250243742,"owners_count":21398395,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["amazon","amazon-api","amazon-data","amazon-dataset","amazon-product-scraper","amazon-reviews","amazon-scraper","amazon-scraping","datasets","e-commerce-scraper","price-scraper","python","scraping-amazon","web-scraper","web-scraping"],"created_at":"2024-11-23T05:13:19.972Z","updated_at":"2025-04-22T12:50:16.960Z","avatar_url":"https://github.com/luminati-io.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Amazon Scraper\r\n\r\n[![Promo](https://github.com/luminati-io/Amazon-scraper/blob/main/images/Proxies%20and%20scrapers%20GitHub%20bonus%20banner.png)](https://brightdata.com/products/web-scraper/amazon?promo=github15) \r\n\r\n## Table of Contents\r\n\r\n- [Free Amazon Scraper](#free-amazon-scraper)\r\n   - [Prerequisites](#prerequisites)\r\n   - [Quick Setup](#quick-setup)\r\n   - [How to Scrape Amazon Data](#how-to-scrape-amazon-data)\r\n   - 
[Output](#output)\r\n- [Challenges When Scraping Amazon Data](#challenges-when-scraping-amazon-data)\r\n- [Solution: Bright Data Amazon Scraper API](#solution-bright-data-amazon-scraper-api)\r\n- [Amazon Scraper API in Action](#amazon-scraper-api-in-action)\r\n   - [Customize Data Collection with API Parameters](#customize-data-collection-with-api-parameters)\r\n   - [Amazon Product Data](#amazon-product-data)\r\n   - [Amazon Reviews Data](#amazon-reviews-data)\r\n   - [Amazon Products Search](#amazon-products-search)\r\n   - [Amazon Sellers Info](#amazon-sellers-info)\r\n   - [Amazon Products by Best Sellers](#amazon-products-by-best-sellers)\r\n   - [Amazon Products by Category URL](#amazon-products-by-category-url)\r\n   - [Amazon Products by Keyword](#amazon-products-by-keyword)\r\n   - [Amazon Products Global Dataset](#amazon-products-global-dataset)\r\n   - [Amazon Products Global Dataset - Discover by Category URL](#amazon-products-global-dataset---discover-by-category-url)\r\n   - [Amazon Products Global Dataset - Discover by Keywords](#amazon-products-global-dataset---discover-by-keywords)\r\n\r\n\r\n## Free Amazon Scraper\r\nUse this free tool to extract Amazon product data directly from search results pages. Easily retrieve product titles, prices, ratings, reviews, and more with just a few simple steps.\r\n\r\n### Prerequisites\r\n- Python 3.11 or higher.\r\n- Install the necessary dependencies (see steps below).\r\n\r\n### Quick Setup\r\n1. Open your terminal and navigate to this project’s directory.\r\n2. Run the following command to install dependencies:\r\n   \r\n    ```bash\r\n    pip install -r requirements.txt\r\n    ```\r\n\r\n### How to Scrape Amazon Data\r\nTo start scraping Amazon data, simply provide a search query. 
You can also specify the Amazon domain and the number of pages you want to scrape.\r\n\r\n#### Command:\r\n```bash\r\npython main.py \"\u003cyour_search_query\u003e\" --domain=\"\u003camazon_domain\u003e\" --pages=\u003cnumber_of_pages\u003e\r\n```\r\n- `\u003cyour_search_query\u003e`: The search keywords (e.g., \"coffee maker\").\r\n- `\u003camazon_domain\u003e`: The Amazon domain you want to scrape (default: `com` for Amazon US).\r\n- `\u003cnumber_of_pages\u003e`: Number of pages to scrape (optional, defaults to scraping all available pages).\r\n\r\n#### Example:\r\nTo scrape the first 3 pages of results for \"coffee maker\" on the Amazon US domain, run:\r\n```bash\r\npython main.py \"coffee maker\" --domain=\"com\" --pages=3\r\n```\r\n### Output\r\nAfter scraping, the extracted data will be saved as `amazon_data.csv` in the project directory. The CSV file will include the following details:\r\n- **Name:** Product title.\r\n- **Current Price:** Product price (empty if out of stock).\r\n- **Rating:** Average customer rating.\r\n- **Reviews:** Total number of customer reviews.\r\n- **ASIN:** Amazon Standard Identification Number.\r\n- **Link:** Direct URL to the product page on Amazon.\r\n\r\nHere's how the data will look:\r\n\r\n\u003cimg width=\"700\" alt=\"bright-data-amazon_csv_data\" src=\"https://github.com/luminati-io/Amazon-scraper/blob/main/images/bright-data-amazon_csv_data.png\"\u003e\r\n\r\n## Challenges When Scraping Amazon Data\r\nScraping Amazon data isn't always straightforward. Here are a few challenges that you might encounter:\r\n1. **Advanced Anti-Bot Measures:** Amazon uses CAPTCHAs, invisible bot detection techniques, and behavioral analysis (like tracking mouse movements) to block bots.\r\n2. **Frequent Page Structure Updates:** Amazon frequently changes its HTML structure, IDs, and class names, so scrapers need regular updates to keep up with the new page layout.\r\n3. 
**High Resource Consumption:** Scraping JavaScript-heavy pages with tools like Playwright or Selenium can consume significant system resources. Handling dynamic content and running multiple browser instances can slow down performance, especially when scraping large amounts of data.\r\n\r\nBelow is an example of what happens when Amazon detects automated scraping attempts:\r\n\r\n\u003cimg src=\"https://github.com/luminati-io/Amazon-scraper/blob/main/images/Amazon%20Blocked.png\" alt=\"Amazon Blocked\" width=\"700\"/\u003e\r\n\r\nAs shown above, Amazon blocked the request to prevent further data scraping — a common issue that many scrapers encounter.\r\n\r\n## Solution: Bright Data Amazon Scraper API\r\nThe [Bright Data Amazon Scraper API](https://brightdata.com/products/web-scraper/amazon) is the ultimate solution for scraping Amazon product data at scale. Here’s why:\r\n\r\n- **No Infrastructure Management**: No need to handle proxies or unblocking systems.\r\n- **Geo-Location Scraping**: Scrape from any geographical region.\r\n- **Global IP Coverage**: Access [over 72 million real user IPs](https://brightdata.com/proxy-types/residential-proxies) in [195 countries](https://brightdata.com/locations) with 99.99% uptime.\r\n- **Flexible Data Delivery**: Get data via Amazon S3, Google Cloud, Azure, Snowflake, or SFTP in formats like JSON, NDJSON, CSV, and `.gz`.\r\n- **Privacy Compliance**: Fully complies with GDPR, CCPA, and other data protection laws.\r\n- **24/7 Support**: Dedicated support team is available around the clock to assist with any API-related questions or issues.\r\n\r\nYou also get **20 free API calls** to test the product and see how it fits your needs.\r\n\r\n## Amazon Scraper API in Action\r\n\r\n\u003e For a detailed guide on setting up the Amazon Scraper API, check our [Step-by-Step Setup Guide](https://github.com/luminati-io/Amazon-scraper/blob/main/scraper_api_setup.md#amazon-reviews).\r\n\r\n### Customize Data Collection with API 
Parameters\r\n\r\nUse the following API parameters to customize your data collection:\r\n\r\n| **Parameter**       | **Type**   | **Description**                                                                                   | **Example**                                           |\r\n|---------------------|------------|---------------------------------------------------------------------------------------------------|-------------------------------------------------------|\r\n| `limit`             | `integer`  | Limit the number of results returned for each input.                                            | `limit=10`                                           |\r\n| `include_errors`    | `boolean`   | Include an error report in the output for troubleshooting.                                      | `include_errors=true`                                |\r\n| `notify`            | `url`      | URL where a notification is sent once the collection completes.                                  | `notify=https://notify-me.com/`                      |\r\n| `format`            | `enum`     | Format for data delivery. Supported formats: JSON, NDJSON, JSONL, CSV.                          
| `format=json`                                        |\r\n\r\n💡Additional delivery methods: You can choose to deliver the data via [webhook](https://docs.brightdata.com/scraping-automation/web-data-apis/web-scraper-api/overview#via-webhook) or through the [API](https://docs.brightdata.com/scraping-automation/web-data-apis/web-scraper-api/overview#via-api).\r\n\r\n### Amazon Product Data\r\nCollect detailed product data from Amazon by providing a product URL.\r\n\r\n\u003cimg width=\"700\" alt=\"bright-data-web-scraper-api-amazon-product-data\" src=\"https://github.com/luminati-io/Amazon-scraper/blob/main/images/bright-data-web-scraper-api-amazon-product-data.png\"\u003e\r\n\r\n#### Key Input Parameters:\r\n| Parameter | Type   | Description                    | Required |\r\n|-----------|--------|--------------------------------|----------|\r\n| `url`       | `string` | The Amazon product URL to scrape data | Yes      |\r\n\r\n#### Performance:\r\n- Average response time per input: 13 seconds\r\n\r\n#### Sample Output Data:\r\nHere’s an example of the output you will receive after scraping Amazon product data:\r\n```json\r\n{\r\n    \"url\": \"https://www.amazon.com/KitchenAid-Protective-Dishwasher-Stainless-8-72-Inch/dp/B07PZF3QS3\",\r\n    \"title\": \"KitchenAid All Purpose Kitchen Shears with Protective Sheath...\",\r\n    \"seller_name\": \"Amazon.com\",\r\n    \"brand\": \"KitchenAid\",\r\n    \"description\": \"These all-purpose shears from KitchenAid are a valuable addition...\",\r\n    \"initial_price\": 11.99,\r\n    \"final_price\": 8.99,\r\n    \"currency\": \"USD\",\r\n    \"availability\": \"In Stock\",\r\n    \"reviews_count\": 77557,\r\n    \"rating\": 4.8,\r\n    \"categories\": [\r\n        \"Home \u0026 Kitchen\",\r\n        \"Kitchen \u0026 Dining\",\r\n        \"Kitchen Utensils \u0026 Gadgets\",\r\n        \"Shears\"\r\n    ],\r\n    \"asin\": \"B07PZF3QS3\",\r\n    \"images\": [\r\n        
\"https://m.media-amazon.com/images/I/41E7ALk+uXL._AC_SL1200_.jpg\",\r\n        \"https://m.media-amazon.com/images/I/710B9HpzMPL._AC_SL1500_.jpg\"\r\n    ],\r\n    \"delivery\": [\r\n        \"FREE delivery Friday, October 25 on orders shipped by Amazon over $35\",\r\n        \"Or fastest Same-Day delivery Today 10 AM - 3 PM. Order within 4 hrs 46 mins\"\r\n    ]\r\n}\r\n```\r\n#### Code Example:\r\nBelow is a Python script that triggers the Amazon product data collection and stores the results in a JSON file:\r\n```python\r\nimport json\r\nimport requests\r\nimport time\r\n\r\n\r\ndef trigger_datasets(api_token, dataset_id, datasets):\r\n    headers = {\r\n        \"Authorization\": f\"Bearer {api_token}\",\r\n        \"Content-Type\": \"application/json\",\r\n    }\r\n\r\n    trigger_url = (\r\n        f\"https://api.brightdata.com/datasets/v3/trigger?dataset_id={dataset_id}\"\r\n    )\r\n\r\n    # Sending API request to trigger dataset collection\r\n    response = requests.post(trigger_url, headers=headers, data=json.dumps(datasets))\r\n\r\n    if response.status_code == 200:\r\n        print(\"Data collection triggered successfully!\")\r\n        snapshot_id = response.json().get(\"snapshot_id\")\r\n        return snapshot_id if snapshot_id else print(\"No snapshot ID returned.\")\r\n    else:\r\n        print(f\"Error: {response.status_code} - {response.text}\")\r\n        return None\r\n\r\n\r\ndef get_snapshot_data(api_token, snapshot_id):\r\n    headers = {\"Authorization\": f\"Bearer {api_token}\"}\r\n    snapshot_url = (\r\n        f\"https://api.brightdata.com/datasets/v3/snapshot/{snapshot_id}?format=json\"\r\n    )\r\n\r\n    # Polling until the snapshot data is ready\r\n    while True:\r\n        time.sleep(10)\r\n        response = requests.get(snapshot_url, headers=headers)\r\n\r\n        if response.status_code == 200:\r\n            return response.json()\r\n        elif response.status_code == 202:\r\n            print(\"Snapshot still 
processing... retrying.\")\r\n        else:\r\n            print(f\"Error: {response.status_code} - {response.text}\")\r\n            return None\r\n\r\n\r\ndef store_data(data, filename=\"amazon_products_data.json\"):\r\n    if data:\r\n        with open(filename, \"w\") as file:\r\n            json.dump(data, file, indent=4)\r\n        print(f\"Data saved in {filename}.\")\r\n    else:\r\n        print(\"No data to store.\")\r\n\r\n\r\nif __name__ == \"__main__\":\r\n    API_TOKEN = \"YOUR_API_TOKEN\"\r\n    DATASET_ID = \"gd_l7q7dkf244hwjntr0\"\r\n\r\n    datasets = [\r\n        {\r\n            \"url\": \"https://www.amazon.com/Quencher-FlowState-Stainless-Insulated-Smoothie/dp/B0CRMZHDG8\"\r\n        },\r\n        {\r\n            \"url\": \"https://www.amazon.com/KitchenAid-Protective-Dishwasher-Stainless-8-72-Inch/dp/B07PZF3QS3\"\r\n        },\r\n        {\r\n            \"url\": \"https://www.amazon.com/TruSkin-Naturals-Vitamin-Topical-Hyaluronic/dp/B01M4MCUAF\"\r\n        },\r\n    ]\r\n\r\n    # Trigger dataset collection\r\n    snapshot_id = trigger_datasets(API_TOKEN, DATASET_ID, datasets)\r\n\r\n    if snapshot_id:\r\n        # Retrieve the data once the snapshot is ready\r\n        data = get_snapshot_data(API_TOKEN, snapshot_id)\r\n        if data:\r\n            store_data(data)\r\n```\r\nYou can view the full output by downloading [this sample JSON file](https://github.com/luminati-io/Amazon-scraper/blob/main/output_data/amazon_products_data.json).\r\n\r\n### Amazon Reviews Data\r\nCollect Amazon reviews by providing the product URL along with specific parameters like time frames, keywords, and the number of reviews to scrape.\r\n\r\n\u003cimg width=\"700\" alt=\"bright-data-web-scraper-api-amazon-product-reviews\" src=\"https://github.com/luminati-io/Amazon-scraper/blob/main/images/bright-data-web-scraper-api-amazon-product-reviews.png\"\u003e\r\n\r\n\r\n#### Key Input Parameters:\r\n| **Parameter**       | **Type**  | **Description**              
                                                   | **Required** |\r\n|---------------------|-----------|---------------------------------------------------------------------------------|--------------|\r\n| `url`               | `string`  | The Amazon product URL from which to scrape reviews.                             | Yes          |\r\n| `days_range`        | `number`  | The number of past days to consider when collecting reviews (leave blank for no limit). | No           |\r\n| `keyword`           | `string`  | Filter reviews by a specific keyword.                            | No           |\r\n| `num_of_reviews`    | `number`  | The number of reviews to scrape (if not provided, it will scrape all available reviews). | No           |\r\n\r\n#### Performance:\r\n- Average response time per input: 1 minute 1 second\r\n\r\n#### Sample Output Data:\r\nHere’s an example of the output you’ll receive when scraping Amazon reviews:\r\n```json\r\n{\r\n    \"url\": \"https://www.amazon.com/RORSOU-R10-Headphones-Microphone-Lightweight/dp/B094NC89P9/\",\r\n    \"product_name\": \"RORSOU R10 On-Ear Headphones with Microphone...\",\r\n    \"product_rating\": 4.5,\r\n    \"product_rating_object\": {\r\n        \"one_star\": 386,\r\n        \"two_star\": 237,\r\n        \"three_star\": 584,\r\n        \"four_star\": 1493,\r\n        \"five_star\": 7630\r\n    },\r\n    \"rating\": 5,\r\n    \"author_name\": \"Amazon Customer\",\r\n    \"review_header\": \"Great Sound For the Price!\",\r\n    \"review_text\": \"I bought these headphones twice...\",\r\n    \"badge\": \"Verified Purchase\",\r\n    \"review_posted_date\": \"September 7, 2024\",\r\n    \"helpful_count\": 3\r\n}\r\n```\r\n#### Code Example:\r\nBelow is a Python script that triggers the Amazon review data collection and stores the results in a JSON file:\r\n```python\r\nimport json\r\nimport requests\r\nimport time\r\n\r\n\r\ndef trigger_datasets(api_token, dataset_id, datasets):\r\n    headers = {\r\n        
\"Authorization\": f\"Bearer {api_token}\",\r\n        \"Content-Type\": \"application/json\",\r\n    }\r\n\r\n    trigger_url = (\r\n        f\"https://api.brightdata.com/datasets/v3/trigger?dataset_id={dataset_id}\"\r\n    )\r\n\r\n    # Sending API request to trigger dataset collection\r\n    response = requests.post(trigger_url, headers=headers, data=json.dumps(datasets))\r\n\r\n    if response.status_code == 200:\r\n        print(\"Data collection triggered successfully!\")\r\n        snapshot_id = response.json().get(\"snapshot_id\")\r\n        return snapshot_id if snapshot_id else print(\"No snapshot ID returned.\")\r\n    else:\r\n        print(f\"Error: {response.status_code} - {response.text}\")\r\n        return None\r\n\r\n\r\ndef get_snapshot_data(api_token, snapshot_id):\r\n    headers = {\"Authorization\": f\"Bearer {api_token}\"}\r\n    snapshot_url = (\r\n        f\"https://api.brightdata.com/datasets/v3/snapshot/{snapshot_id}?format=json\"\r\n    )\r\n\r\n    # Polling until the snapshot data is ready\r\n    while True:\r\n        time.sleep(10)\r\n        response = requests.get(snapshot_url, headers=headers)\r\n\r\n        if response.status_code == 200:\r\n            return response.json()\r\n        elif response.status_code == 202:\r\n            print(\"Snapshot still processing... 
retrying.\")\r\n        else:\r\n            print(f\"Error: {response.status_code} - {response.text}\")\r\n            return None\r\n\r\n\r\ndef store_data(data, filename=\"amazon_reviews_data.json\"):\r\n    if data:\r\n        with open(filename, \"w\") as file:\r\n            json.dump(data, file, indent=4)\r\n        print(f\"Data saved in {filename}.\")\r\n    else:\r\n        print(\"No data to store.\")\r\n\r\n\r\nif __name__ == \"__main__\":\r\n    API_TOKEN = \"YOUR_API_TOKEN\"\r\n    DATASET_ID = \"gd_le8e811kzy4ggddlq\"\r\n\r\n    datasets = [\r\n        {\r\n            \"url\": \"https://www.amazon.com/RORSOU-R10-Headphones-Microphone-Lightweight/dp/B094NC89P9/\",\r\n            \"days_range\": 0,\r\n            \"num_of_reviews\": 4,\r\n            \"keyword\": \"great\",\r\n        },\r\n        {\r\n            \"url\": \"https://www.amazon.com/Solar-Eclipse-Glasses-Certified-Viewing/dp/B08GB3QC1H\",\r\n            \"days_range\": 0,\r\n            \"num_of_reviews\": 4,\r\n            \"keyword\": \"\",\r\n        },\r\n    ]\r\n\r\n    # Trigger dataset collection\r\n    snapshot_id = trigger_datasets(API_TOKEN, DATASET_ID, datasets)\r\n\r\n    if snapshot_id:\r\n        # Retrieve the data once the snapshot is ready\r\n        data = get_snapshot_data(API_TOKEN, snapshot_id)\r\n        if data:\r\n            store_data(data)\r\n```\r\nYou can view the full output by downloading [this sample JSON file](https://github.com/luminati-io/Amazon-scraper/blob/main/output_data/amazon_reviews_data.json).\r\n\r\n### Amazon Products Search\r\nDiscover Amazon products by providing a keyword for your search.\r\n\r\n\u003cimg width=\"700\" alt=\"bright-data-web-scraper-api-keyword-search\" src=\"https://github.com/luminati-io/Amazon-scraper/blob/main/images/bright-data-web-scraper-api-keyword-search.png\"\u003e\r\n\r\n#### Key Input Parameters:\r\n| Parameter         | Type    | Description                                 | Required 
|\r\n|-------------------|---------|---------------------------------------------|----------|\r\n| `keyword`         | string  | The keyword used to search for products      | Yes      |\r\n| `url`             | string  | The domain URL to search within              | Yes      |\r\n| `pages_to_search` | number  | The number of pages to search through        | No       |\r\n\r\n#### Performance:\r\n- Average response time per input: 1 second\r\n\r\n#### Sample Output Data:\r\nHere’s an example of the output you’ll receive after performing a keyword search for products on Amazon:\r\n```json\r\n{\r\n    \"asin\": \"B08H75RTZ8\",\r\n    \"url\": \"https://www.amazon.com/Microsoft-Xbox-Gaming-Console-video-game/dp/B08H75RTZ8/ref=sr_1_1\",\r\n    \"name\": \"Xbox Series X 1TB SSD Console - Includes Xbox Wireless Controller...\",\r\n    \"sponsored\": \"false\",\r\n    \"initial_price\": 479,\r\n    \"final_price\": 479,\r\n    \"currency\": \"USD\",\r\n    \"sold\": 2000,\r\n    \"rating\": 4.8,\r\n    \"num_ratings\": 28675,\r\n    \"variations\": null,\r\n    \"badge\": null,\r\n    \"brand\": null,\r\n    \"delivery\": [\"FREE delivery\"],\r\n    \"keyword\": \"X-box\",\r\n    \"image\": \"https://m.media-amazon.com/images/I/616klipzdtL._AC_UY218_.jpg\",\r\n    \"domain\": \"https://www.amazon.com/\",\r\n    \"bought_past_month\": 2000,\r\n    \"page_number\": 1,\r\n    \"rank_on_page\": 1,\r\n    \"timestamp\": \"2024-10-20T10:39:37.679Z\",\r\n    \"input\": {\r\n        \"keyword\": \"X-box\",\r\n        \"url\": \"https://www.amazon.com\",\r\n        \"pages_to_search\": 1\r\n    }\r\n}\r\n```\r\n#### Code Example:\r\nBelow is a Python script that triggers an Amazon product search based on a keyword and stores the results in a JSON file:\r\n```python\r\nimport json\r\nimport requests\r\nimport time\r\n\r\n\r\ndef trigger_datasets(api_token, dataset_id, datasets):\r\n    headers = {\r\n        \"Authorization\": f\"Bearer {api_token}\",\r\n        \"Content-Type\": 
\"application/json\",\r\n    }\r\n\r\n    trigger_url = (\r\n        f\"https://api.brightdata.com/datasets/v3/trigger?dataset_id={dataset_id}\"\r\n    )\r\n\r\n    # Sending API request to trigger dataset collection\r\n    response = requests.post(trigger_url, headers=headers, data=json.dumps(datasets))\r\n\r\n    if response.status_code == 200:\r\n        print(\"Data collection triggered successfully!\")\r\n        snapshot_id = response.json().get(\"snapshot_id\")\r\n        return snapshot_id if snapshot_id else print(\"No snapshot ID returned.\")\r\n    else:\r\n        print(f\"Error: {response.status_code} - {response.text}\")\r\n        return None\r\n\r\n\r\ndef get_snapshot_data(api_token, snapshot_id):\r\n    headers = {\"Authorization\": f\"Bearer {api_token}\"}\r\n    snapshot_url = (\r\n        f\"https://api.brightdata.com/datasets/v3/snapshot/{snapshot_id}?format=json\"\r\n    )\r\n\r\n    # Polling until the snapshot data is ready\r\n    while True:\r\n        response = requests.get(snapshot_url, headers=headers)\r\n\r\n        if response.status_code == 200:\r\n            return response.json()\r\n        elif response.status_code == 202:\r\n            print(\"Snapshot still processing... 
retrying.\")\r\n        else:\r\n            print(f\"Error: {response.status_code} - {response.text}\")\r\n            return None\r\n        time.sleep(10)\r\n\r\n\r\ndef store_data(data, filename=\"amazon_keywords_data.json\"):\r\n    if data:\r\n        with open(filename, \"w\") as file:\r\n            json.dump(data, file, indent=4)\r\n        print(f\"Data saved in {filename}.\")\r\n    else:\r\n        print(\"No data to store.\")\r\n\r\n\r\nif __name__ == \"__main__\":\r\n    API_TOKEN = \"YOUR_API_TOKEN\"\r\n    DATASET_ID = \"gd_lwdb4vjm1ehb499uxs\"\r\n\r\n    datasets = [\r\n        {\"keyword\": \"X-box\", \"url\": \"https://www.amazon.com\", \"pages_to_search\": 1},\r\n        {\"keyword\": \"PS5\", \"url\": \"https://www.amazon.de\"},\r\n        {\r\n            \"keyword\": \"car cleaning kit\",\r\n            \"url\": \"https://www.amazon.es\",\r\n            \"pages_to_search\": 4,\r\n        },\r\n    ]\r\n\r\n    # Trigger dataset collection\r\n    snapshot_id = trigger_datasets(API_TOKEN, DATASET_ID, datasets)\r\n\r\n    if snapshot_id:\r\n        # Retrieve the data once the snapshot is ready\r\n        data = get_snapshot_data(API_TOKEN, snapshot_id)\r\n        if data:\r\n            store_data(data)\r\n```\r\nYou can view the full output by downloading [this sample JSON file](https://github.com/luminati-io/Amazon-scraper/blob/main/output_data/amazon_keywords_data.json).\r\n\r\n### Amazon Sellers Info\r\nDiscover detailed information about Amazon sellers by providing their specific seller URL.\r\n\r\n\u003cimg width=\"700\" alt=\"bright-data-web-scraper-api-seller-info\" src=\"https://github.com/luminati-io/Amazon-scraper/blob/main/images/bright-data-web-scraper-api-seller-info.png\"\u003e\r\n\r\n\r\n#### Key Input Parameters:\r\n| **Parameter** | **Type**  | **Description**                    | **Required** |\r\n|---------------|-----------|------------------------------------|--------------|\r\n| `url`         | `string`  | The Amazon 
seller URL              | Yes          |\r\n\r\n#### Performance:\r\n- Average response time per input: 1 second\r\n\r\n#### Sample Output Data:\r\nBelow is an example of the output you will receive after scraping seller information:\r\n```json\r\n{\r\n    \"input\": {\r\n        \"url\": \"https://www.amazon.com/sp?seller=A33W53J5GVPZ8K\"\r\n    },\r\n    \"seller_id\": \"A33W53J5GVPZ8K\",\r\n    \"seller_name\": \"Peckomatic\",\r\n    \"description\": \"Peckomatic is committed to providing each customer with the highest standard of customer service.\",\r\n    \"detailed_info\": [\r\n        {\"title\": \"Business Name\"},\r\n        {\"title\": \"Business Address\"}\r\n    ],\r\n    \"stars\": \"4.5 out of 5 stars\",\r\n    \"feedbacks\": [\r\n        {\r\n            \"date\": \"By Kao y. on November 16, 2021.\",\r\n            \"stars\": \"5 out of 5 stars\",\r\n            \"text\": \"It say not to exceed 10lbs total but I did anyway. My bird was 8lbs + the 3lb box = 11lbs. Bird arrived in great condition.\"\r\n        },\r\n        {\r\n            \"date\": \"By JL on June 9, 2021.\",\r\n            \"stars\": \"1 out of 5 stars\",\r\n            \"text\": \"How this seller packages its items is not acceptable...\"\r\n        }\r\n    ],\r\n    \"rating_positive\": \"89%\",\r\n    \"feedbacks_percentages\": {\r\n        \"star_5\": \"80%\",\r\n        \"star_4\": \"9%\",\r\n        \"star_3\": \"7%\",\r\n        \"star_2\": \"0%\",\r\n        \"star_1\": \"5%\"\r\n    },\r\n    \"products_link\": \"https://www.amazon.com/s?me=A33W53J5GVPZ8K\",\r\n    \"buisness_name\": \"Francis Kunnumpurath\",\r\n    \"buisness_address\": \"2612 State Route 80, Lafayette, NY, 13084, US\",\r\n    \"rating_count_lifetime\": 44,\r\n    \"country\": \"US\"\r\n}\r\n```\r\n\r\n#### Code Example:\r\nHere’s a Python script that triggers the collection of Amazon seller data and stores the results in a JSON file:\r\n```python\r\nimport json\r\nimport requests\r\nimport 
time\r\n\r\n\r\ndef trigger_datasets(api_token, dataset_id, datasets):\r\n    headers = {\r\n        \"Authorization\": f\"Bearer {api_token}\",\r\n        \"Content-Type\": \"application/json\",\r\n    }\r\n\r\n    trigger_url = (\r\n        f\"https://api.brightdata.com/datasets/v3/trigger?dataset_id={dataset_id}\"\r\n    )\r\n\r\n    # Sending API request to trigger dataset collection\r\n    response = requests.post(trigger_url, headers=headers, data=json.dumps(datasets))\r\n\r\n    if response.status_code == 200:\r\n        print(\"Data collection triggered successfully!\")\r\n        snapshot_id = response.json().get(\"snapshot_id\")\r\n        return snapshot_id if snapshot_id else print(\"No snapshot ID returned.\")\r\n    else:\r\n        print(f\"Error: {response.status_code} - {response.text}\")\r\n        return None\r\n\r\n\r\ndef get_snapshot_data(api_token, snapshot_id):\r\n    headers = {\"Authorization\": f\"Bearer {api_token}\"}\r\n    snapshot_url = (\r\n        f\"https://api.brightdata.com/datasets/v3/snapshot/{snapshot_id}?format=json\"\r\n    )\r\n\r\n    # Polling until the snapshot data is ready\r\n    while True:\r\n        response = requests.get(snapshot_url, headers=headers)\r\n\r\n        if response.status_code == 200:\r\n            return response.json()\r\n        elif response.status_code == 202:\r\n            print(\"Snapshot still processing... 
retrying.\")\r\n        else:\r\n            print(f\"Error: {response.status_code} - {response.text}\")\r\n            return None\r\n        time.sleep(10)\r\n\r\n\r\ndef store_data(data, filename=\"amazon_seller_data.json\"):\r\n    if data:\r\n        with open(filename, \"w\") as file:\r\n            json.dump(data, file, indent=4)\r\n        print(f\"Data saved in {filename}.\")\r\n    else:\r\n        print(\"No data to store.\")\r\n\r\n\r\nif __name__ == \"__main__\":\r\n    API_TOKEN = \"YOUR_API_TOKEN\"\r\n    DATASET_ID = \"gd_lhotzucw1etoe5iw1k\"\r\n\r\n    # Define the dataset with seller URLs\r\n    datasets = [\r\n        {\"url\": \"https://www.amazon.com/sp?seller=A33W53J5GVPZ8K\"},\r\n        {\"url\": \"https://www.amazon.com/sp?seller=A33YXLPENB0JBD\"},\r\n        {\"url\": \"https://www.amazon.com/sp?seller=A33ZG27WW2U3E6\"},\r\n    ]\r\n\r\n    # Trigger dataset collection\r\n    snapshot_id = trigger_datasets(API_TOKEN, DATASET_ID, datasets)\r\n\r\n    if snapshot_id:\r\n        # Retrieve the data once the snapshot is ready\r\n        data = get_snapshot_data(API_TOKEN, snapshot_id)\r\n        if data:\r\n            store_data(data)\r\n```\r\n\r\nYou can view the full output by downloading [this sample JSON file](https://github.com/luminati-io/Amazon-scraper/blob/main/output_data/amazon_seller_data.json).\r\n\r\n### Amazon Products by Best Sellers\r\nDiscover top-selling products on Amazon by providing the URL for the Best Sellers category.\r\n\r\n\u003cimg width=\"700\" alt=\"bright-data-web-scraper-api-amazon-best-sellers\" src=\"https://github.com/luminati-io/Amazon-scraper/blob/main/images/bright-data-web-scraper-api-amazon-best-sellers.png\"\u003e\r\n\r\n\r\n#### Key Input Parameters:\r\n\r\n| Parameter       | Type     | Description                                    | Required |\r\n|-----------------|----------|------------------------------------------------|----------|\r\n| `category_url`  | `string` | The Best Sellers category URL from 
which to scrape | Yes      |\r\n\r\n#### Performance:\r\n- Average response time per input: 6 minutes 49 seconds\r\n\r\n#### Sample Output Data:\r\nHere’s an example of the output you will receive after scraping Amazon’s Best Sellers data:\r\n\r\n```json\r\n{\r\n    \"title\": \"Amazon Basics Multipurpose Copy Printer Paper, 8.5\\\" x 11\\\", 1 Ream, 500 Sheets, White\",\r\n    \"seller_name\": \"Amazon.com\",\r\n    \"brand\": \"Amazon Basics\",\r\n    \"initial_price\": 9.99,\r\n    \"final_price\": 7.41,\r\n    \"currency\": \"USD\",\r\n    \"availability\": \"In Stock\",\r\n    \"reviews_count\": 178695,\r\n    \"rating\": 4.8,\r\n    \"categories\": [\r\n        \"Office Products\",\r\n        \"Paper\",\r\n        \"Copy \u0026 Multipurpose Paper\"\r\n    ],\r\n    \"asin\": \"B01FV0F8H8\",\r\n    \"buybox_seller\": \"Amazon.com\",\r\n    \"discount\": \"-26%\",\r\n    \"root_bs_rank\": 1,\r\n    \"url\": \"https://www.amazon.com/AmazonBasics-Multipurpose-Copy-Printer-Paper/dp/B01FV0F8H8?th=1\u0026psc=1\",\r\n    \"image_url\": \"https://m.media-amazon.com/images/I/81x0cTHWQJL._AC_SL1500_.jpg\",\r\n    \"delivery\": [\r\n        \"FREE delivery Friday, October 25\",\r\n        \"Same-Day delivery Today 10 AM - 3 PM\"\r\n    ],\r\n    \"features\": [\r\n        \"1 ream (500 sheets) of 8.5 x 11 white copier and printer paper\",\r\n        \"Works with laser/inkjet printers, copiers, and fax machines\",\r\n        \"Smooth 20lb weight paper for consistent ink and toner distribution\"\r\n    ],\r\n    \"bought_past_month\": 100000,\r\n    \"root_bs_category\": \"Office Products\",\r\n    \"bs_category\": \"Copy \u0026 Multipurpose Paper\",\r\n    \"bs_rank\": 1,\r\n    \"amazon_choice\": true,\r\n    \"badge\": \"Amazon's Choice\",\r\n    \"seller_url\": \"https://www.amazon.com/sp?ie=UTF8\u0026seller=ATVPDKIKX0DER\u0026asin=B01FV0F8H8\",\r\n    \"timestamp\": \"2024-10-20T13:30:56.666Z\"\r\n}\r\n```\r\n#### Code Example:\r\nBelow is a Python script that 
triggers the collection of Amazon Best Sellers data and stores the results in a JSON file:\r\n```python\r\nimport json\r\nimport requests\r\nimport time\r\n\r\n\r\ndef trigger_datasets(api_token, dataset_id, datasets):\r\n    headers = {\r\n        \"Authorization\": f\"Bearer {api_token}\",\r\n        \"Content-Type\": \"application/json\",\r\n    }\r\n\r\n    trigger_url = f\"https://api.brightdata.com/datasets/v3/trigger?dataset_id={dataset_id}\u0026type=discover_new\u0026discover_by=best_sellers_url\u0026limit_per_input=3\"\r\n\r\n    # Sending API request to trigger dataset collection\r\n    response = requests.post(trigger_url, headers=headers, data=json.dumps(datasets))\r\n\r\n    if response.status_code == 200:\r\n        print(\"Data collection triggered successfully!\")\r\n        snapshot_id = response.json().get(\"snapshot_id\")\r\n        return snapshot_id if snapshot_id else print(\"No snapshot ID returned.\")\r\n    else:\r\n        print(f\"Error: {response.status_code} - {response.text}\")\r\n        return None\r\n\r\n\r\ndef get_snapshot_data(api_token, snapshot_id):\r\n    headers = {\"Authorization\": f\"Bearer {api_token}\"}\r\n    snapshot_url = (\r\n        f\"https://api.brightdata.com/datasets/v3/snapshot/{snapshot_id}?format=json\"\r\n    )\r\n\r\n    # Polling until the snapshot data is ready\r\n    while True:\r\n        time.sleep(10)\r\n        response = requests.get(snapshot_url, headers=headers)\r\n\r\n        if response.status_code == 200:\r\n            return response.json()\r\n        elif response.status_code == 202:\r\n            print(\"Snapshot still processing... 
retrying.\")\r\n        else:\r\n            print(f\"Error: {response.status_code} - {response.text}\")\r\n            return None\r\n\r\n\r\ndef store_data(data, filename=\"amazon_bestsellers_data.json\"):\r\n    if data:\r\n        with open(filename, \"w\") as file:\r\n            json.dump(data, file, indent=4)\r\n        print(f\"Data saved in {filename}.\")\r\n    else:\r\n        print(\"No data to store.\")\r\n\r\n\r\nif __name__ == \"__main__\":\r\n    API_TOKEN = \"YOUR_API_TOKEN\"\r\n    DATASET_ID = \"gd_l7q7dkf244hwjntr0\"\r\n\r\n    datasets = [\r\n        {\r\n            \"category_url\": \"https://www.amazon.com/gp/bestsellers/office-products/ref=pd_zg_ts_office-products\"\r\n        },\r\n    ]\r\n\r\n    # Trigger dataset collection\r\n    snapshot_id = trigger_datasets(API_TOKEN, DATASET_ID, datasets)\r\n\r\n    if snapshot_id:\r\n        # Retrieve the data once the snapshot is ready\r\n        data = get_snapshot_data(API_TOKEN, snapshot_id)\r\n        if data:\r\n            store_data(data)\r\n```\r\nYou can view the full output by downloading [this sample JSON file](https://github.com/luminati-io/Amazon-scraper/blob/main/output_data/amazon_bestsellers.json).\r\n\r\n### Amazon Products by Category URL\r\nDiscover and collect Amazon product data by providing a specific category URL. 
Customize your search with sorting options and location-based filters.\r\n\r\n\u003cimg width=\"700\" alt=\"bright-data-web-scraper-api-discover-by-category-url\" src=\"https://github.com/luminati-io/Amazon-scraper/blob/main/images/bright-data-web-scraper-api-discover-by-category-url.png\"\u003e\r\n\r\n#### Key Input Parameters:\r\n| **Parameter** | **Type**  | **Description**                              | **Required** |\r\n|---------------|-----------|----------------------------------------------|--------------|\r\n| `url`         | `string`  | The category URL to scrape products from      | Yes          |\r\n| `sort_by`     | `string`  | Criteria for sorting the product results      | No           |\r\n| `zipcode`     | `string`  | Zip code for location-specific product results| No           |\r\n\r\n#### Performance:\r\n- Average response time per input: 16 minutes 16 seconds\r\n\r\n#### Sample Output Data:\r\nBelow is an example of the data you’ll receive after scraping products from a specified category:\r\n```json\r\n{\r\n    \"title\": \"Quilted Makeup Bag Floral Makeup Bag Cotton Makeup Bag\",\r\n    \"brand\": \"WYJ\",\r\n    \"price\": 9.99,\r\n    \"currency\": \"USD\",\r\n    \"availability\": \"In Stock\",\r\n    \"rating\": 5,\r\n    \"reviews_count\": 1,\r\n    \"categories\": [\r\n        \"Beauty \u0026 Personal Care\",\r\n        \"Cosmetic Bags\"\r\n    ],\r\n    \"asin\": \"B0DC3WX7RM\",\r\n    \"seller_name\": \"yisenshangmaoyouxiangongsi\",\r\n    \"number_of_sellers\": 1,\r\n    \"url\": \"https://www.amazon.com/WYJ-Quilted-Coquette-Aesthetic-Blue/dp/B0DC3WX7RM\",\r\n    \"image_url\": \"https://m.media-amazon.com/images/I/71SI04tB6QL._AC_SL1500_.jpg\",\r\n    \"product_dimensions\": \"8.7\\\"L x 2.8\\\"W x 5.1\\\"H\",\r\n    \"item_weight\": \"2.5 Ounces\",\r\n    \"variations\": [\r\n        {\r\n            \"name\": \"Pink\",\r\n            \"asin\": \"B0DC3RKYPF\",\r\n            \"price\": 9.99\r\n        },\r\n        {\r\n           
 \"name\": \"Blue\",\r\n            \"asin\": \"B0DC3WX7RM\",\r\n            \"price\": 9.99\r\n        },\r\n        {\r\n            \"name\": \"Purple\",\r\n            \"asin\": \"B0DC47CDDT\",\r\n            \"price\": 9.99\r\n        }\r\n    ],\r\n    \"badge\": \"#1 New Release\",\r\n    \"top_review\": \"I love everything about this bag! It's made well and a good size. Super cute!\"\r\n}\r\n```\r\n\r\n#### Code Example:\r\nBelow is a Python script that triggers the collection of products from a specified category URL and stores the data in a JSON file:\r\n```python\r\nimport json\r\nimport requests\r\nimport time\r\n\r\n\r\ndef trigger_datasets(api_token, dataset_id, datasets):\r\n    headers = {\r\n        \"Authorization\": f\"Bearer {api_token}\",\r\n        \"Content-Type\": \"application/json\",\r\n    }\r\n\r\n    trigger_url = f\"https://api.brightdata.com/datasets/v3/trigger?dataset_id={dataset_id}\u0026type=discover_new\u0026discover_by=category_url\u0026limit_per_input=4\"\r\n\r\n    # Sending API request to trigger dataset collection\r\n    response = requests.post(trigger_url, headers=headers, data=json.dumps(datasets))\r\n\r\n    if response.status_code == 200:\r\n        print(\"Data collection triggered successfully!\")\r\n        snapshot_id = response.json().get(\"snapshot_id\")\r\n        return snapshot_id if snapshot_id else print(\"No snapshot ID returned.\")\r\n    else:\r\n        print(f\"Error: {response.status_code} - {response.text}\")\r\n        return None\r\n\r\n\r\ndef get_snapshot_data(api_token, snapshot_id):\r\n    headers = {\"Authorization\": f\"Bearer {api_token}\"}\r\n    snapshot_url = (\r\n        f\"https://api.brightdata.com/datasets/v3/snapshot/{snapshot_id}?format=json\"\r\n    )\r\n\r\n    # Polling until the snapshot data is ready\r\n    while True:\r\n        response = requests.get(snapshot_url, headers=headers)\r\n\r\n        if response.status_code == 200:\r\n            return response.json()\r\n        
elif response.status_code == 202:\r\n            print(\"Snapshot still processing... retrying.\")\r\n        else:\r\n            print(f\"Error: {response.status_code} - {response.text}\")\r\n            return None\r\n        time.sleep(10)\r\n\r\n\r\ndef store_data(data, filename=\"amazon_discover_by_category_url.json\"):\r\n    if data:\r\n        with open(filename, \"w\") as file:\r\n            json.dump(data, file, indent=4)\r\n        print(f\"Data saved in {filename}.\")\r\n    else:\r\n        print(\"No data to store.\")\r\n\r\n\r\nif __name__ == \"__main__\":\r\n    API_TOKEN = \"YOUR_API_TOKEN\"\r\n    DATASET_ID = \"gd_l7q7dkf244hwjntr0\"\r\n\r\n    datasets = [\r\n        {\r\n            \"url\": \"https://www.amazon.com/s?i=luggage-intl-ship\",\r\n            \"sort_by\": \"Best Sellers\",\r\n            \"zipcode\": \"10001\",\r\n        },\r\n        {\r\n            \"url\": \"https://www.amazon.com/s?i=baby-products-intl-ship\",\r\n            \"sort_by\": \"Avg. Customer Review\",\r\n            \"zipcode\": \"\",\r\n        },\r\n        {\r\n            \"url\": \"https://www.amazon.com/s?rh=n%3A16225012011\u0026fs=true\u0026ref=lp_16225012011_sar\",\r\n            \"sort_by\": \"Price: Low to High\",\r\n            \"zipcode\": \"\",\r\n        },\r\n    ]\r\n\r\n    # Trigger dataset collection\r\n    snapshot_id = trigger_datasets(API_TOKEN, DATASET_ID, datasets)\r\n\r\n    if snapshot_id:\r\n        # Retrieve the data once the snapshot is ready\r\n        data = get_snapshot_data(API_TOKEN, snapshot_id)\r\n        if data:\r\n            store_data(data)\r\n```\r\n\r\nYou can view the full output by downloading [this sample JSON file](https://github.com/luminati-io/Amazon-scraper/blob/main/output_data/amazon_discover_by_category_url.json).\r\n\r\n### Amazon Products by Keyword\r\nDiscover products by using specific keywords.\r\n\r\n\u003cimg width=\"700\" alt=\"bright-data-web-scraper-api-discover-by-keyword\" 
src=\"https://github.com/luminati-io/Amazon-scraper/blob/main/images/bright-data-web-scraper-api-discover-by-keyword.png\"\u003e\r\n\r\n#### Key Input Parameters:\r\n| **Parameter** | **Type**  | **Description**                   | **Required** |\r\n|---------------|-----------|-----------------------------------|--------------|\r\n| `keyword`     | `string`  | The keyword to search for products | Yes          |\r\n\r\n#### Performance:\r\n- Average response time per input: 2 minutes 46 seconds\r\n\r\n#### Sample Output Data:\r\nHere’s an example of the output you will receive after searching for products using a keyword:\r\n\r\n```json\r\n{\r\n    \"title\": \"SYLVANIA ECO LED Light Bulb, A19 60W Equivalent, 750 Lumens, 2700K, Non-Dimmable, Frosted, Soft White - 8 Count (Pack of 1)\",\r\n    \"brand\": \"LEDVANCE\",\r\n    \"seller_name\": \"Amazon.com\",\r\n    \"initial_price\": 13.99,\r\n    \"final_price\": 12.12,\r\n    \"currency\": \"USD\",\r\n    \"discount\": \"-13%\",\r\n    \"rating\": 4.7,\r\n    \"reviews_count\": 48418,\r\n    \"availability\": \"In Stock\",\r\n    \"url\": \"https://www.amazon.com/Sylvania-40821-Equivalent-Efficient-Temperature/dp/B08FRSS4BF\",\r\n    \"image_url\": \"https://m.media-amazon.com/images/I/81wKhRO66oL._AC_SL1500_.jpg\",\r\n    \"delivery\": [\r\n        \"FREE delivery Friday, October 25 on orders shipped by Amazon over $35\",\r\n        \"Or Prime members get FREE delivery Tomorrow, October 21. Order within 8 hrs 8 mins. 
Join Prime\"\r\n    ],\r\n    \"features\": [\r\n        \"60W Incandescent Replacement Bulb - 750 Lumens\",\r\n        \"Long-lasting – 7 years lifespan\",\r\n        \"Energy-saving – Estimated energy cost of $1.08 per year\"\r\n    ],\r\n    \"discovery_input\": {\r\n        \"keyword\": \"light bulb\"\r\n    },\r\n    \"input\": {\r\n        \"url\": \"https://www.amazon.com/Sylvania-40821-Equivalent-Efficient-Temperature/dp/B08FRSS4BF\"\r\n    }\r\n}\r\n```\r\n#### Code Example:\r\nBelow is a Python script that triggers the collection of Amazon products based on a keyword and stores the results in a JSON file:\r\n```python\r\nimport json\r\nimport requests\r\nimport time\r\n\r\n\r\ndef trigger_datasets(\r\n    api_token, dataset_id, datasets, dataset_type=\"discover_new\", discover_by=\"keyword\"\r\n):\r\n    headers = {\r\n        \"Authorization\": f\"Bearer {api_token}\",\r\n        \"Content-Type\": \"application/json\",\r\n    }\r\n\r\n    trigger_url = f\"https://api.brightdata.com/datasets/v3/trigger?dataset_id={dataset_id}\u0026type={dataset_type}\u0026discover_by={discover_by}\"\r\n\r\n    # Sending API request to trigger dataset collection\r\n    response = requests.post(trigger_url, headers=headers, data=json.dumps(datasets))\r\n\r\n    if response.status_code == 200:\r\n        print(\"Data collection triggered successfully!\")\r\n        snapshot_id = response.json().get(\"snapshot_id\")\r\n        return snapshot_id if snapshot_id else print(\"No snapshot ID returned.\")\r\n    else:\r\n        print(f\"Error: {response.status_code} - {response.text}\")\r\n        return None\r\n\r\n\r\ndef get_snapshot_data(api_token, snapshot_id):\r\n    headers = {\"Authorization\": f\"Bearer {api_token}\"}\r\n    snapshot_url = (\r\n        f\"https://api.brightdata.com/datasets/v3/snapshot/{snapshot_id}?format=json\"\r\n    )\r\n\r\n    # Polling until the snapshot data is ready\r\n    while True:\r\n        response = requests.get(snapshot_url, 
headers=headers)\r\n\r\n        if response.status_code == 200:\r\n            return response.json()\r\n        elif response.status_code == 202:\r\n            print(\"Snapshot still processing... retrying.\")\r\n        else:\r\n            print(f\"Error: {response.status_code} - {response.text}\")\r\n            return None\r\n        time.sleep(10)\r\n\r\n\r\ndef store_data(data, filename=\"amazon_keyword_data.json\"):\r\n    if data:\r\n        with open(filename, \"w\") as file:\r\n            json.dump(data, file, indent=4)\r\n        print(f\"Data saved in {filename}.\")\r\n    else:\r\n        print(\"No data to store.\")\r\n\r\n\r\nif __name__ == \"__main__\":\r\n    API_TOKEN = \"API_TOKEN\"\r\n    DATASET_ID = \"gd_l7q7dkf244hwjntr0\"\r\n\r\n    # Define the dataset with keywords\r\n    datasets = [{\"keyword\": \"light bulb\"}, {\"keyword\": \"dog toys\"}]\r\n\r\n    # Trigger dataset collection\r\n    snapshot_id = trigger_datasets(API_TOKEN, DATASET_ID, datasets)\r\n\r\n    if snapshot_id:\r\n        # Retrieve the data once the snapshot is ready\r\n        data = get_snapshot_data(API_TOKEN, snapshot_id)\r\n        if data:\r\n            store_data(data)\r\n```\r\n\r\nYou can view the full output by downloading [this sample JSON file](https://github.com/luminati-io/Amazon-scraper/blob/main/output_data/amazon_keyword_data.json).\r\n\r\n### Amazon Products Global Dataset\r\nCollect product data across all major Amazon domains by providing a URL.\r\n\r\n\u003cimg width=\"700\" alt=\"bright-data-web-scraper-api-amazon-product-global-dataset\" src=\"https://github.com/luminati-io/Amazon-scraper/blob/main/images/bright-data-web-scraper-api-amazon-product-global-dataset.png\"\u003e\r\n\r\n\r\n#### Key Input Parameters:\r\n| **Parameter** | **Type**  | **Description**           | **Required** |\r\n|---------------|-----------|---------------------------|--------------|\r\n| `url`         | `string`  | The Amazon product URL     | Yes          
|\r\n\r\n#### Performance:\r\n- **Average response time per input**: Less than 1 second\r\n\r\n#### Sample Output Data:\r\nHere’s an example of the output you will receive after collecting product data:\r\n\r\n```json\r\n{\r\n    \"title\": \"Toys of Wood Oxford Wooden Stacking Rings – Learning to Count – Counting Game with 45 Rings – Wooden Toy for Ages 3 and Above\",\r\n    \"brand\": \"Toys of Wood Oxford\",\r\n    \"seller_name\": \"Toys of Wood Oxford\",\r\n    \"initial_price\": 23.99,\r\n    \"currency\": \"EUR\",\r\n    \"final_price\": 23.99,\r\n    \"availability\": \"Only 20 left in stock.\",\r\n    \"rating\": 4.5,\r\n    \"reviews_count\": 1677,\r\n    \"asin\": \"B078TNNZK3\",\r\n    \"url\": \"https://www.amazon.de/dp/B078TNNZK3?th=1\u0026psc=1\",\r\n    \"image_url\": \"https://m.media-amazon.com/images/I/815t1-d+7BL._AC_SL1500_.jpg\",\r\n    \"product_dimensions\": \"43.31 x 11.61 x 11.51 cm; 830 g\",\r\n    \"categories\": [\r\n        \"Toys\",\r\n        \"Baby \u0026 Toddler Toys\",\r\n        \"Early Development \u0026 Activity Toys\",\r\n        \"Sorting, Stacking \u0026 Plugging Toys\"\r\n    ],\r\n    \"delivery\": [\r\n        \"FREE delivery Friday, 25 October on eligible first order\",\r\n        \"Or fastest delivery Thursday, 24 October. Order within 4 hrs 40 mins\"\r\n    ],\r\n    \"features\": [\r\n        \"Sturdy and stable base plate with 9 pins and 45 beautiful large wooden rings and 10 removable square number plates in rainbow colours.\",\r\n        \"Great for learning counting, sorting, and matching colors and numbers, as well as practicing simple mathematics.\",\r\n        \"Made from sustainable wood with eco-friendly and non-toxic paints. 
Complies with EN71 / CPSA standards.\"\r\n    ],\r\n    \"top_review\": \"Sehr lehrreich\",\r\n    \"variations\": [\r\n        {\r\n            \"name\": \"Caterpillar Threading Toy\",\r\n            \"price\": 13.99,\r\n            \"currency\": \"EUR\"\r\n        },\r\n        {\r\n            \"name\": \"Pack of 15\",\r\n            \"price\": 16.99,\r\n            \"currency\": \"EUR\"\r\n        },\r\n        {\r\n            \"name\": \"Pack of 45\",\r\n            \"price\": 23.99,\r\n            \"currency\": \"EUR\"\r\n        }\r\n    ],\r\n    \"product_rating_object\": {\r\n        \"one_star\": 35,\r\n        \"two_star\": 0,\r\n        \"three_star\": 82,\r\n        \"four_star\": 227,\r\n        \"five_star\": 1308\r\n    }\r\n}\r\n```\r\n#### Code Example:\r\nBelow is a Python script that triggers the collection of products across all major Amazon domains and stores the results in a JSON file:\r\n```python\r\nimport json\r\nimport requests\r\nimport time\r\n\r\n\r\ndef trigger_datasets(\r\n    api_token, dataset_id, datasets, dataset_type=\"trigger\", discover_by=\"url\"\r\n):\r\n    headers = {\r\n        \"Authorization\": f\"Bearer {api_token}\",\r\n        \"Content-Type\": \"application/json\",\r\n    }\r\n\r\n    trigger_url = f\"https://api.brightdata.com/datasets/v3/trigger?dataset_id={dataset_id}\u0026type={dataset_type}\u0026discover_by={discover_by}\"\r\n\r\n    # Sending API request to trigger dataset collection\r\n    response = requests.post(trigger_url, headers=headers, data=json.dumps(datasets))\r\n\r\n    if response.status_code == 200:\r\n        print(\"Data collection triggered successfully!\")\r\n        snapshot_id = response.json().get(\"snapshot_id\")\r\n        return snapshot_id if snapshot_id else print(\"No snapshot ID returned.\")\r\n    else:\r\n        print(f\"Error: {response.status_code} - {response.text}\")\r\n        return None\r\n\r\n\r\ndef get_snapshot_data(api_token, snapshot_id):\r\n    headers 
= {\"Authorization\": f\"Bearer {api_token}\"}\r\n    snapshot_url = (\r\n        f\"https://api.brightdata.com/datasets/v3/snapshot/{snapshot_id}?format=json\"\r\n    )\r\n\r\n    # Polling until the snapshot data is ready\r\n    while True:\r\n        response = requests.get(snapshot_url, headers=headers)\r\n\r\n        if response.status_code == 200:\r\n            return response.json()\r\n        elif response.status_code == 202:\r\n            print(\"Snapshot still processing... retrying.\")\r\n        else:\r\n            print(f\"Error: {response.status_code} - {response.text}\")\r\n            return None\r\n        time.sleep(10)\r\n\r\n\r\ndef store_data(data, filename=\"amazon_products_global_dataset.json\"):\r\n    if data:\r\n        with open(filename, \"w\") as file:\r\n            json.dump(data, file, indent=4)\r\n        print(f\"Data saved in {filename}.\")\r\n    else:\r\n        print(\"No data to store.\")\r\n\r\n\r\nif __name__ == \"__main__\":\r\n    API_TOKEN = \"API_TOKEN\"\r\n    DATASET_ID = \"gd_lwhideng15g8jg63s7\"\r\n\r\n    # Define the dataset with URLs\r\n    datasets = [\r\n        {\"url\": \"https://www.amazon.com/dp/B0CHHSFMRL/\"},\r\n        {\r\n            \"url\": \"https://www.amazon.de/-/en/dp/B078TNNZK3/ref=sspa_dk_browse_2/?_encoding=UTF8\u0026ie=UTF8\u0026sp_csd=d2lkZ2V0TmFtZT1zcF9icm93c2VfdGhlbWF0aWM%3D\u0026pd_rd_w=fHlOu\u0026content-id=amzn1.sym.642a11a6-0e1e-47fa-93c2-5dc9d607a7a1\u0026pf_rd_p=642a11a6-0e1e-47fa-93c2-5dc9d607a7a1\u0026pf_rd_r=4JX920KFM8Q7PR83HJ7V\u0026pd_rd_wg=K1OVN\u0026pd_rd_r=be656f87-1a09-4144-b7cf-4e932d6a73c4\u0026ref_=sspa_dk_browse\u0026th=1\"\r\n        },\r\n        {\r\n            \"url\": 
\"https://www.amazon.co.jp/X-TRAK-Folding-Bicycle-Carbon-Adjustable/dp/B0CWV9YTLV/ref=sr_1_1_sspa?crid=3MKZ2ALHSLFOM\u0026dib=eyJ2IjoiMSJ9.YnBVPwJ7nLxlNGHktwDTFM5v2evnsXlnZTJHJKuG8dLeeRCILpy0Knr3ofiKpUGQYi6xR6y4tgdtal85DJ8u6DD_n9r1oVCXdVo0NFmNAfStU6E-MhBig5p_gZGjluAYv5HgUIoEPl0v3iMiRxZNRfivqB-utxOkPOOfXIBHLemry17XcltUDTQqtJv-kP-ZqdP29mjD2cRlbkALtHPKU44MvBC9WUrNcUHAMrlAxtTAByuriywMqz-w2P0HCeehcZTJ1EiLf2VR8cxCiwuaUbIOU3tr1kDN6D7yYPrgRn4.6AOdSmJsksZkqLg8kNM6EvWxIFOijCsP2zo5NLHn1P4\u0026dib_tag=se\u0026keywords=Bicycles\u0026qid=1716973495\u0026sprefix=%2Caps%2C851\u0026sr=8-1-spons\u0026sp_csd=d2lkZ2V0TmFtZT1zcF9hdGY\u0026psc=1\"\r\n        },\r\n        {\r\n            \"url\": \"https://www.amazon.in/Watches-Women%EF%BC%8CLadies-Stainless-Waterproof-Luminous/dp/B0D31HBWG1/ref=sr_1_2_sspa?dib=eyJ2IjoiMSJ9.1zFa2vTCZdD-bv6Knt_pWqvcRZPSSTPDwgMClRJNsWqdyGdCmryjEAfWpd-ZhwhC3vvNx9A0G2Gt1R952e7huzlukge2bmJETNf-kHBoWS5kV6g0pUVapEyDOEAGcw5ZvWlkeuLQ9oIwuhckRC6ARCt2yglYV-1HpP7lVGXotK6K6tjrdKxUSAOZJSXeOGP3dGuYPTjo9sllOrwA7FC2GG00aDcsSTzURENFj1c2rS-vNHkYmxOL1JYuwDWK2PJdMpsmkJw3jeMdgaiw7jG5ppMfAjwiETVldQzhHGVUFV8.manfNZwtTUhvDuSGdh32APM1_SmnNiKgOGabyA7rXBo\u0026dib_tag=se\u0026qid=1716973272\u0026rnid=2563505031\u0026s=watch\u0026sr=1-2-spons\u0026sp_csd=d2lkZ2V0TmFtZT1zcF9hdGZfYnJvd3Nl\u0026psc=1\"\r\n        },\r\n    ]\r\n\r\n    # Trigger dataset collection\r\n    snapshot_id = trigger_datasets(API_TOKEN, DATASET_ID, datasets)\r\n\r\n    if snapshot_id:\r\n        # Retrieve the data once the snapshot is ready\r\n        data = get_snapshot_data(API_TOKEN, snapshot_id)\r\n        if data:\r\n            store_data(data)\r\n```\r\n\r\nYou can view the full output by downloading [this sample JSON file](https://github.com/luminati-io/Amazon-scraper/blob/main/output_data/amazon_products_global_dataset.json).\r\n\r\n### Amazon Products Global Dataset - Discover by Category URL\r\nDiscover products by providing a specific category URL.\r\n\r\n\u003cimg width=\"700\" 
alt=\"bright-data-web-scraper-api-amazon-product-global-category-url\" src=\"https://github.com/luminati-io/Amazon-scraper/blob/main/images/bright-data-web-scraper-api-amazon-product-global-category-url.png\"\u003e\r\n\r\n\r\n#### Key Input Parameters:\r\n| **Parameter** | **Type** | **Description**                               | **Required** |\r\n|---------------|----------|-----------------------------------------------|--------------|\r\n| `url`         | `string` | The category URL from which to scrape products | Yes          |\r\n| `sort_by`     | `string` | Criteria for sorting the results               | No           |\r\n| `zipcode`     | `string` | Zip code for location-specific results         | No           |\r\n\r\n#### Performance:\r\n- Average response time per input: 3 minutes 57 seconds\r\n\r\n#### Sample Output Data:\r\nHere’s an example of the output you will receive after collecting product data:\r\n```json\r\n{\r\n    \"title\": \"De'Longhi Stilosa EC230.BK, Traditional Barista Pump Espresso Machine, Espresso and Cappuccino, 2 cups, Black\",\r\n    \"brand\": \"De'Longhi\",\r\n    \"seller_name\": \"Hughes Electrical\",\r\n    \"initial_price\": 104.99,\r\n    \"final_price\": 94,\r\n    \"currency\": \"GBP\",\r\n    \"availability\": \"Only 1 left in stock.\",\r\n    \"rating\": 3.9,\r\n    \"reviews_count\": 395,\r\n    \"asin\": \"B085J8LV4F\",\r\n    \"url\": \"https://www.amazon.co.uk/dp/B085J8LV4F?th=1\u0026psc=1\",\r\n    \"image_url\": \"https://m.media-amazon.com/images/I/715gqhkOEiL._AC_SL1500_.jpg\",\r\n    \"categories\": [\r\n        \"Cooking \u0026 Dining\",\r\n        \"Coffee, Tea \u0026 Espresso\",\r\n        \"Coffee Machines\",\r\n        \"Espresso \u0026 Cappuccino Machines\"\r\n    ],\r\n    \"delivery\": [\r\n        \"FREE delivery 25 - 28 October\",\r\n        \"Or fastest delivery Tomorrow, 22 October. 
Order within 3 hrs 59 mins\"\r\n    ],\r\n    \"features\": [\r\n        \"Unleash your inner barista and create all your coffee shop favourites at home\",\r\n        \"15-bar pump espresso maker with a stainless steel boiler for perfect coffee extraction\",\r\n        \"Steam arm to create frothy cappuccinos and smooth lattes\",\r\n        \"Combination of matt and glossy black finish with an anti-drip system\"\r\n    ],\r\n    \"input\": {\r\n        \"url\": \"https://www.amazon.co.uk/DeLonghi-EC230-BK-Traditional-Espresso-Cappuccino/dp/B085J8LV4F/ref=sr_1_4\"\r\n    },\r\n    \"discovery_input\": {\r\n        \"url\": \"https://www.amazon.co.uk/b/?_encoding=UTF8\u0026node=10706951\u0026ref_=Oct_d_odnav_d_13528598031_1\",\r\n        \"sort_by\": \"Best Sellers\",\r\n        \"zipcode\": \"\"\r\n    }\r\n}\r\n```\r\n#### Code Example:\r\nBelow is a Python script that triggers the collection of products by category URL and stores the results in a JSON file:\r\n```python\r\nimport json\r\nimport requests\r\nimport time\r\n\r\n\r\ndef trigger_datasets(api_token, dataset_id, datasets, dataset_type=\"discover_new\", discover_by=\"category_url\", limit_per_input=4):\r\n    headers = {\r\n        \"Authorization\": f\"Bearer {api_token}\",\r\n        \"Content-Type\": \"application/json\",\r\n    }\r\n\r\n    trigger_url = f\"https://api.brightdata.com/datasets/v3/trigger?dataset_id={dataset_id}\u0026type={dataset_type}\u0026discover_by={discover_by}\u0026limit_per_input={limit_per_input}\"\r\n\r\n    # Sending API request to trigger dataset collection\r\n    response = requests.post(\r\n        trigger_url, headers=headers, data=json.dumps(datasets))\r\n\r\n    if response.status_code == 200:\r\n        print(\"Data collection triggered successfully!\")\r\n        snapshot_id = response.json().get(\"snapshot_id\")\r\n        return snapshot_id if snapshot_id else print(\"No snapshot ID returned.\")\r\n    else:\r\n        print(f\"Error: 
{response.status_code} - {response.text}\")\r\n        return None\r\n\r\n\r\ndef get_snapshot_data(api_token, snapshot_id):\r\n    headers = {\"Authorization\": f\"Bearer {api_token}\"}\r\n    snapshot_url = (\r\n        f\"https://api.brightdata.com/datasets/v3/snapshot/{snapshot_id}?format=json\"\r\n    )\r\n\r\n    # Polling until the snapshot data is ready\r\n    while True:\r\n        response = requests.get(snapshot_url, headers=headers)\r\n\r\n        if response.status_code == 200:\r\n            return response.json()\r\n        elif response.status_code == 202:\r\n            print(\"Snapshot still processing... retrying.\")\r\n        else:\r\n            print(f\"Error: {response.status_code} - {response.text}\")\r\n            return None\r\n        time.sleep(10)\r\n\r\n\r\ndef store_data(data, filename=\"amazon_category_url_data.json\"):\r\n    if data:\r\n        with open(filename, \"w\") as file:\r\n            json.dump(data, file, indent=4)\r\n        print(f\"Data saved in {filename}.\")\r\n    else:\r\n        print(\"No data to store.\")\r\n\r\n\r\nif __name__ == \"__main__\":\r\n    API_TOKEN = \"API_TOKEN\"\r\n    DATASET_ID = \"gd_lwhideng15g8jg63s7\"\r\n\r\n    # Define the dataset with category URLs, sort_by, and zipcodes\r\n    datasets = [\r\n        {\"url\": \"https://www.amazon.com/s?i=luggage-intl-ship\",\r\n            \"sort_by\": \"Featured\", \"zipcode\": \"10001\"},\r\n        {\"url\": \"https://www.amazon.de/-/en/b/?node=1981001031\u0026ref_=Oct_d_odnav_d_355007011_2\u0026pd_rd_w=OjE3S\u0026content-id=amzn1.sym.0069bc39-a323-47d6-a8fb-7558e4a563e4\u0026pf_rd_p=0069bc39-a323-47d6-a8fb-7558e4a563e4\u0026pf_rd_r=6YXZ7HGFNNEAF0GSDPDH\u0026pd_rd_wg=0yR1G\u0026pd_rd_r=a95cb46c-78ef-4b7b-845d-49fe04556440\", \"sort_by\": \"Price: Low to High\", \"zipcode\": \"\"},\r\n        {\"url\": 
\"https://www.amazon.co.uk/b/?_encoding=UTF8\u0026node=10706951\u0026bbn=11052681\u0026ref_=Oct_d_odnav_d_13528598031_1\u0026pd_rd_w=LghVp\u0026content-id=amzn1.sym.7414f21e-2c95-4394-9a75-8c1b3641bcea\u0026pf_rd_p=7414f21e-2c95-4394-9a75-8c1b3641bcea\u0026pf_rd_r=EE0PQWMSY2J0G8M032EB\u0026pd_rd_wg=7snrU\u0026pd_rd_r=349e1e79-8bf8-4e00-947d-17eab2942b8d\", \"sort_by\": \"Best Sellers\", \"zipcode\": \"\"},\r\n        {\"url\": \"https://www.amazon.co.jp/-/en/b/?node=377403011\u0026ref_=Oct_d_odnav_d_15314601_0\u0026pd_rd_w=ajUV4\u0026content-id=amzn1.sym.0d505cca-fde9-497c-b5f8-e827c26fad17\u0026pf_rd_p=0d505cca-fde9-497c-b5f8-e827c26fad17\u0026pf_rd_r=92HSETNKKN3RTA615BV7\u0026pd_rd_wg=AwOOk\u0026pd_rd_r=629211d8-6768-478c-94a2-829a0a0ca2a6\", \"sort_by\": \"\", \"zipcode\": \"\"}\r\n    ]\r\n\r\n    # Trigger dataset collection\r\n    snapshot_id = trigger_datasets(API_TOKEN, DATASET_ID, datasets)\r\n\r\n    if snapshot_id:\r\n        # Retrieve the data once the snapshot is ready\r\n        data = get_snapshot_data(API_TOKEN, snapshot_id)\r\n        if data:\r\n            store_data(data)\r\n```\r\nYou can view the full output by downloading [this sample JSON file](https://github.com/luminati-io/Amazon-scraper/blob/main/output_data/amazon_product_global_category_url.json).\r\n\r\n### Amazon Products Global Dataset - Discover by Keywords\r\nDiscover products by using specific keywords across Amazon domains.\r\n\r\n\u003cimg width=\"700\" alt=\"bright-data-web-scraper-api-amazon_global_dataset_by_keyword\" src=\"https://github.com/luminati-io/Amazon-scraper/blob/main/images/bright-data-web-scraper-api-amazon_global_dataset_by_keyword.png\"\u003e\r\n\r\n#### Key Input Parameters:\r\n| **Parameter**      | **Type**   | **Description**                            | **Required** |\r\n|--------------------|------------|--------------------------------------------|--------------|\r\n| `keywords`         | `string`   | The keyword to search for products         | Yes     
     |\r\n| `domain`           | `string`   | Amazon domain to search within             | Yes          |\r\n| `pages_to_search`  | `number`   | Number of pages to search                  | No           |\r\n\r\n#### Performance:\r\n- Average response time per input: 56 seconds\r\n\r\n#### Sample Output Data:\r\nHere’s an example of the output you will receive after performing a keyword search for products:\r\n```json\r\n{\r\n    \"title\": \"Mitutoyo 500-197-30 Electronic Digital Caliper AOS Absolute Scale Digital Caliper, 0 to 8\\\"/0 to 200mm Measuring Range, 0.0005\\\"/0.01mm Resolution\",\r\n    \"brand\": \"Mitutoyo\",\r\n    \"seller_name\": \"Everly Home \u0026 Gift\",\r\n    \"initial_price\": 157.97,\r\n    \"final_price\": 137.77,\r\n    \"currency\": \"USD\",\r\n    \"availability\": \"In Stock\",\r\n    \"rating\": 4.8,\r\n    \"reviews_count\": 88,\r\n    \"asin\": \"B01N6C3EGR\",\r\n    \"url\": \"https://www.amazon.com/dp/B01N6C3EGR?th=1\u0026psc=1\",\r\n    \"image_url\": \"https://m.media-amazon.com/images/I/61Gigoh3LbL._SL1500_.jpg\",\r\n    \"categories\": [\r\n        \"Industrial \u0026 Scientific\",\r\n        \"Test, Measure \u0026 Inspect\",\r\n        \"Dimensional Measurement\",\r\n        \"Calipers\",\r\n        \"Digital Calipers\"\r\n    ],\r\n    \"delivery\": [\r\n        \"FREE delivery Saturday, October 26\",\r\n        \"Or Prime members get FREE delivery Tomorrow, October 22\"\r\n    ],\r\n    \"features\": [\r\n        \"Hardened stainless steel construction for protection of caliper components\",\r\n        \"Digital, single-value readout LCD display in metric units for readability\",\r\n        \"Measuring Range 0 to 8\\\"/0 to 200mm\",\r\n        \"Measurement Accuracy +/-0.001\",\r\n        \"Resolution 0.0005\\\"/0.01mm\"\r\n    ],\r\n    \"input\": {\r\n        \"url\": \"https://www.amazon.com/Mitutoyo-500-197-30-Electronic-Measuring-Resolution/dp/B01N6C3EGR\"\r\n    },\r\n    \"discovery_input\": {\r\n        
\"keywords\": \"Mitutoyo\",\r\n        \"domain\": \"https://www.amazon.com\",\r\n        \"pages_to_search\": 1\r\n    }\r\n}\r\n```\r\n#### Code Example:\r\nBelow is a Python script that triggers the collection of products by keyword search and stores the results in a JSON file:\r\n```python\r\nimport json\r\nimport requests\r\nimport time\r\n\r\n\r\ndef trigger_datasets(\r\n    api_token, dataset_id, datasets, dataset_type=\"discover_new\", discover_by=\"keywords\"\r\n):\r\n    headers = {\r\n        \"Authorization\": f\"Bearer {api_token}\",\r\n        \"Content-Type\": \"application/json\",\r\n    }\r\n\r\n    # Build the trigger URL; each replacement field stays on one line so the\r\n    # f-string also parses on Python versions older than 3.12\r\n    trigger_url = (\r\n        \"https://api.brightdata.com/datasets/v3/trigger\"\r\n        f\"?dataset_id={dataset_id}\u0026type={dataset_type}\u0026discover_by={discover_by}\"\r\n    )\r\n\r\n    # Send the API request that triggers dataset collection\r\n    response = requests.post(trigger_url, headers=headers, data=json.dumps(datasets))\r\n\r\n    if response.status_code == 200:\r\n        print(\"Data collection triggered successfully!\")\r\n        snapshot_id = response.json().get(\"snapshot_id\")\r\n        if not snapshot_id:\r\n            print(\"No snapshot ID returned.\")\r\n        return snapshot_id\r\n    else:\r\n        print(f\"Error: {response.status_code} - {response.text}\")\r\n        return None\r\n\r\n\r\ndef get_snapshot_data(api_token, snapshot_id):\r\n    headers = {\"Authorization\": f\"Bearer {api_token}\"}\r\n    snapshot_url = (\r\n        f\"https://api.brightdata.com/datasets/v3/snapshot/{snapshot_id}?format=json\"\r\n    )\r\n\r\n    # Poll every 10 seconds until the snapshot data is ready\r\n    while True:\r\n        response = requests.get(snapshot_url, headers=headers)\r\n\r\n        if response.status_code == 200:\r\n            return response.json()\r\n        elif response.status_code == 202:\r\n            print(\"Snapshot still processing... 
retrying.\")\r\n        else:\r\n            print(f\"Error: {response.status_code} - {response.text}\")\r\n            return None\r\n        time.sleep(10)\r\n\r\n\r\ndef store_data(data, filename=\"amazon_global_dataset_by_keyword.json\"):\r\n    if data:\r\n        with open(filename, \"w\") as file:\r\n            json.dump(data, file, indent=4)\r\n        print(f\"Data saved in {filename}.\")\r\n    else:\r\n        print(\"No data to store.\")\r\n\r\n\r\nif __name__ == \"__main__\":\r\n    API_TOKEN = \"YOUR_API_TOKEN\"\r\n    DATASET_ID = \"gd_lwhideng15g8jg63s7\"\r\n\r\n    # Define the dataset with keywords, domain, and pages_to_search\r\n    datasets = [\r\n        {\r\n            \"keywords\": \"Mitutoyo\",\r\n            \"domain\": \"https://www.amazon.com\",\r\n            \"pages_to_search\": 1,\r\n        },\r\n        {\r\n            \"keywords\": \"smart watch\",\r\n            \"domain\": \"https://www.amazon.co.uk\",\r\n            \"pages_to_search\": 2,\r\n        },\r\n        {\r\n            \"keywords\": \"football\",\r\n            \"domain\": \"https://www.amazon.in\",\r\n            \"pages_to_search\": 4,\r\n        },\r\n        {\r\n            \"keywords\": \"baby cloth\",\r\n            \"domain\": \"https://www.amazon.de\",\r\n            \"pages_to_search\": 3,\r\n        },\r\n    ]\r\n\r\n    # Trigger dataset collection\r\n    snapshot_id = trigger_datasets(API_TOKEN, DATASET_ID, datasets)\r\n\r\n    if snapshot_id:\r\n        # Retrieve the data once the snapshot is ready\r\n        data = get_snapshot_data(API_TOKEN, snapshot_id)\r\n        if data:\r\n            store_data(data)\r\n```\r\nYou can view the full output by downloading [this sample JSON 
file](https://github.com/luminati-io/Amazon-scraper/blob/main/output_data/amazon_global_dataset_by_keyword.json).\r\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fluminati-io%2Famazon-scraper","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fluminati-io%2Famazon-scraper","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fluminati-io%2Famazon-scraper/lists"}