https://github.com/oxylabs/ecommerce-category-scraper
AI-Powered E-commerce Category Scraper with AI Studio
- Host: GitHub
- URL: https://github.com/oxylabs/ecommerce-category-scraper
- Owner: oxylabs
- Created: 2025-09-22T13:33:52.000Z (9 months ago)
- Default Branch: main
- Last Pushed: 2025-10-01T10:30:13.000Z (9 months ago)
- Last Synced: 2025-10-01T11:32:36.985Z (9 months ago)
- Topics: ai-studio, data-extraction, ecommerce, low-code, price-comparison, web-scraping
- Language: Python
- Homepage: https://aistudio.oxylabs.io/
- Size: 963 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Ecommerce Category Scraper
## E-commerce Category Scraper
AI-Powered E-commerce Category Scraper with AI Studio
The E-commerce Category Scraper is an AI-powered, open-source tool built using Oxylabs AI Studio. It automates and streamlines data extraction from e-commerce websites, making it accessible to developers of all skill levels. This solution can also be adapted as a scalable price comparison tool, perfect for analyzing competitor pricing and market trends.
What problems does this tool solve?
- Scraping all products from e-commerce categories without writing custom code.
- Automatically locating e-commerce categories and scraping their products.
## Key features
- **Cost optimization**: AI Studio ensures task-based scalability.
- **Pre-built solution**: A ready-to-use open-source tool for faster adoption and seamless integration.
- **Low-code automation**: Create automated workflows for web scraping and parsing without needing advanced coding skills.
- **AI-powered capabilities**: Extract structured web data with minimal effort using Oxylabs' AI Studio apps.
- **Enterprise-ready infrastructure**: Handle CAPTCHAs, bypass IP blocks, and navigate dynamic content effortlessly.
- **Flexible scalability**: Perfect for small tasks using free AI Studio credits or scaling to enterprise-level projects.
## How it works
- **Browser Agent**: Locates the category on the website and collects all category pagination URLs.
- **AI-Scraper**: Extracts all product URLs from the category listing pages.
- **AI-Scraper**: Extracts structured product data such as pricing, titles, and stock availability, based on a user prompt or JSON schema.
- **Final Output**: Clean, structured datasets ready for use in analytics, reporting, or pricing workflows. Results can be saved to a JSON file or returned to the caller programmatically.
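The stages above can be pictured as a simple pipeline. The sketch below is not the repository's implementation: the three stage functions (`find_category_pages`, `extract_product_urls`, `extract_product`) are hypothetical stand-ins passed in as callables, so the control flow can be read and run without calling AI Studio.

```python
from typing import Callable, Dict, Iterable, List, Optional


def run_pipeline(
    start_url: str,
    find_category_pages: Callable[[str], Iterable[str]],   # stands in for the Browser Agent
    extract_product_urls: Callable[[str], Iterable[str]],  # stands in for AI-Scraper on listing pages
    extract_product: Callable[[str], Dict],                # stands in for AI-Scraper on product pages
    max_products: Optional[int] = None,
) -> List[Dict]:
    """Walk category pages, collect product URLs, then parse each product."""
    product_urls: List[str] = []
    for page_url in find_category_pages(start_url):
        for url in extract_product_urls(page_url):
            if url not in product_urls:  # skip duplicates across pages
                product_urls.append(url)

    # Apply the product limit before the (expensive) per-product parsing step.
    if max_products is not None:
        product_urls = product_urls[:max_products]

    return [extract_product(url) for url in product_urls]
```

Passing the stages in as arguments keeps the sketch testable with plain lambdas; in the real tool each stage is backed by an AI Studio app.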
## Prerequisites
Before you begin, make sure you have an Oxylabs AI Studio API key. You can obtain one from [Oxylabs AI Studio](https://aistudio.oxylabs.io/settings/api-key) (1000 credits free).
## Installation
- Open your terminal.
- Install the uv package manager:
```bash
# macOS and Linux
curl -LsSf https://astral.sh/uv/install.sh | sh
```
- Clone the repository:
```bash
git clone https://github.com/oxylabs/ecommerce-category-scraper.git
```
- Navigate to the repository:
```bash
cd ecommerce-category-scraper
```
- Install the dependencies:
```bash
uv sync
```
- Enable the virtual environment:
```bash
source .venv/bin/activate
```
## Running Tests
Both tests scrape books from `books.toscrape.com` (Sequential art category), extracting book name, price, UPC code, and availability.
- **Test 1:** Accepts a category URL directly and scrapes products from that specific category page.
Replace `<your-api-key>` with your actual API key.
```bash
python -m test.test_1 --oxylabs-ai-studio-api-key <your-api-key>
```
Results are saved to `test_1_results.json`. View with:
```bash
cat test_1_results.json | python -m json.tool
```
- **Test 2:** Accepts an ecommerce domain URL and automatically searches for and identifies category pages before scraping.
Replace `<your-api-key>` with your actual API key.
```bash
python -m test.test_2 --oxylabs-ai-studio-api-key <your-api-key>
```
Results are saved to `test_2_results.json`. View with:
```bash
cat test_2_results.json | python -m json.tool
```
**Note:** Modify the tests to try different domains and settings.
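The results files can also be inspected from Python instead of `json.tool`. This is a minimal sketch that assumes each file holds a JSON array of product objects; the actual layout depends on the parsing prompt or schema used, so check one file before relying on any field name. Both helper names are hypothetical.

```python
import json
from pathlib import Path


def load_results(path: str) -> list:
    """Load scraped results, assuming the file contains a JSON array."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if not isinstance(data, list):
        raise ValueError(f"expected a JSON array in {path}, got {type(data).__name__}")
    return data


def summarize(records: list) -> dict:
    """Count records and collect the field names that appear in them."""
    fields = set()
    for record in records:
        if isinstance(record, dict):
            fields.update(record.keys())
    return {"count": len(records), "fields": sorted(fields)}
```

For example, `summarize(load_results("test_1_results.json"))` reports how many products were scraped and which fields came back.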
## Python Interface
Use the `scrape_category` function to integrate the scraper into your own code.
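```python
import asyncio
from ecommerce_category_scraper.process import scrape_category

async def main():
    # scrape_category is awaited, so it has to run inside an async function
    result = await scrape_category(
        api_key="your-api-key",
        category_url="https://example.com/category",  # OR use ecommerce_domain_url
        parsing_prompt="Extract product name, price, and rating",
        max_products=50,
    )
    return result

results = asyncio.run(main())
```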
### Parameters
**Required:**
- `api_key` - Oxylabs AI Studio API key
**Category selection (choose one):**
- `category_url` - Direct category URL (starts gathering product URLs immediately)
- `ecommerce_domain_url` + `category_description_prompt` - Domain URL and description of category to search for
**Parsing (choose one):**
- `parsing_prompt` - Text description of data to extract
- `json_schema` - JSON schema for structured extraction (more reliable and deterministic)
**Optional:**
- `geo_location` - IP location in ISO2 format (e.g., `"US"`)
- `render_javascript` - Enable JavaScript rendering (default: `False`)
- `json_filepath` - Save results to file (if not provided, returns list)
- `max_pages` - Maximum category pages to scrape (default: all)
- `max_products` - Maximum products to scrape (default: all)
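The "choose one" rules above can be checked locally before any credits are spent. The helper below is a hypothetical convenience wrapper, not part of the package; it uses only the parameter names listed above. `BOOK_SCHEMA` is an illustrative JSON Schema for the book fields used in the tests, and passing the schema as a Python dict is an assumption (the function may expect a different type).

```python
from typing import Any, Dict, Optional

# Illustrative schema; the field names are assumptions, not taken from the repository.
BOOK_SCHEMA: Dict[str, Any] = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "price": {"type": "string"},
        "upc": {"type": "string"},
        "availability": {"type": "string"},
    },
    "required": ["name", "price"],
}


def build_scrape_kwargs(
    api_key: str,
    category_url: Optional[str] = None,
    ecommerce_domain_url: Optional[str] = None,
    category_description_prompt: Optional[str] = None,
    parsing_prompt: Optional[str] = None,
    json_schema: Optional[Dict[str, Any]] = None,
    **optional: Any,  # geo_location, render_javascript, json_filepath, max_pages, max_products
) -> Dict[str, Any]:
    """Validate the 'choose one' parameter groups and return keyword arguments."""
    if not api_key:
        raise ValueError("api_key is required")
    domain_mode = ecommerce_domain_url is not None
    if (category_url is not None) == domain_mode:
        raise ValueError("pass exactly one of category_url or ecommerce_domain_url")
    if domain_mode and not category_description_prompt:
        raise ValueError("ecommerce_domain_url needs category_description_prompt")
    if (parsing_prompt is None) == (json_schema is None):
        raise ValueError("pass exactly one of parsing_prompt or json_schema")

    candidates = {
        "api_key": api_key,
        "category_url": category_url,
        "ecommerce_domain_url": ecommerce_domain_url,
        "category_description_prompt": category_description_prompt,
        "parsing_prompt": parsing_prompt,
        "json_schema": json_schema,
        **optional,
    }
    # Drop unset values so the scraper's own defaults apply.
    return {name: value for name, value in candidates.items() if value is not None}
```

The returned dict can then be unpacked into the call, for example `await scrape_category(**build_scrape_kwargs(api_key="your-api-key", ecommerce_domain_url="https://books.toscrape.com", category_description_prompt="Sequential art books", json_schema=BOOK_SCHEMA, max_pages=2))`.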
## Practical use cases
- **Price comparison tool**: Automate workflows to compare competitor prices by category or region.
- **Price monitoring**: Regularly track competitor pricing trends and fluctuations.
- **Market intelligence**: Collect data for competitive and industry analysis.
- **E-commerce scraping**: Extract essential product details for AI applications or business intelligence.
- **Product detail extraction**: Automate the retrieval of pricing, inventory, and product descriptions.
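As an illustration of the price monitoring use case, two scraped snapshots can be compared to find products whose price changed. This is a sketch under stated assumptions: the `upc` and `price` field names are hypothetical (they depend on your parsing prompt or schema), prices are assumed to arrive as strings, and a comma is treated as a thousands separator.

```python
import re
from decimal import Decimal
from typing import Dict, List, Optional


def parse_price(text: str) -> Optional[Decimal]:
    """Pull the first decimal number out of a price string, ignoring the currency symbol."""
    # Assumption: commas are thousands separators, so they are removed before matching.
    match = re.search(r"\d+(?:\.\d+)?", text.replace(",", ""))
    return Decimal(match.group()) if match else None


def price_changes(old: List[dict], new: List[dict], key: str = "upc") -> Dict[str, Decimal]:
    """Return price differences (new minus old) for products present in both snapshots."""
    old_prices = {r[key]: parse_price(str(r.get("price", ""))) for r in old if key in r}
    changes: Dict[str, Decimal] = {}
    for record in new:
        if key not in record:
            continue
        before = old_prices.get(record[key])
        after = parse_price(str(record.get("price", "")))
        # Report only products seen in both snapshots whose price actually moved.
        if before is not None and after is not None and after != before:
            changes[record[key]] = after - before
    return changes
```

`Decimal` is used instead of `float` so that differences between prices stay exact.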
## FAQ
- **Can I scrape any website using this tool?**
This tool can scrape most websites, provided you adhere to each website's legal and technical restrictions.
- **Is this tool free?**
Yes, the E-commerce Category Scraper is open-source and free to use. Smaller tasks are powered by AI Studio's free credits, while flexible plans allow scaling for larger workflows.
- **Do I need advanced coding skills to use this tool?**
Advanced coding skills are not required. An AI-powered code editor simplifies integration, making the tool accessible to engineers with basic coding experience.
- **Can I customize this scraper for my needs?**
Yes, the open-source solution can be fully customized to meet specific workflow or business requirements.
- **What are AI Studio free credits?**
AI Studio offers free credits for smaller tasks. For scaling beyond free credits, users can subscribe to flexible plans.
## Showcased at Oxycon 2025
This E-commerce Category Scraper was featured live at Oxycon 2025. The presentation demonstrated how AI Studio can be used to easily build a real-time price comparison tool and how developers can create scalable scraping workflows for various e-commerce tasks.
## Learn more
For a deeper dive into features, integrations, examples, and documentation, visit the [AI Studio](https://aistudio.oxylabs.io/) website.
## Contact us
If you have questions or need support, reach out to us at hello@oxylabs.io, through [live chat](https://oxylabs.drift.click/oxybot), or join our [Discord community](https://discord.com/invite/Pds3gBmKMH).