https://github.com/aniket-16-s/product-sraper

Scraping product information from well-known websites.

# ⚡ Multi-Web Product Scraper (Stealth Mode | Blazing Fast | NLP for Queries | Cache Management) ⚡

### Search Amazon, Flipkart, and Myntra, *all at once*, way before your coffee is ready.
### Built with async Playwright for stealth and speed.

---

## 🚀 What's Cookin'

- 🔍 **Multi-Site Support**: Scrapes products from Amazon, Flipkart, and Myntra (more coming soon).
- 🧠 **`Natural Language Processing`**: Uses the all-MiniLM-L6-v2 model to understand user queries.
- 🛢️ **`Cache Management`**: Uses aiosqlite to cache search results and avoid repeating expensive work.
- 📈 **150+ Products in 18 Seconds**: Yup, tested: ~152 products loaded in one go (18-20 sec).
- 🖼️ **`Photo Gallery`**: Product photos are saved in a dedicated folder.
- ⚡ **SuperFast Scraping**: Built with asyncio, aiohttp, and Playwright; scrapes *around 150 products within 20 sec* 💨
- 🕶️ **Headless & Stealth Mode**: Runs in the background and mimics human behavior to dodge bot detection.
- 🧠 **Why This Slaps**: Manually opening 3 sites, scrolling forever, and remembering deals? Nah. Just type what you want and get 150+ options in about 20 seconds.
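
The speed claim above rests on querying all three sites concurrently instead of one after another. Here is a minimal sketch of that fan-out pattern using only `asyncio`. It is not the project's actual code: `scrape_site`, `scrape_all`, and `SITE_DELAYS` are hypothetical names, and the scrapers are stubs that sleep instead of driving a Playwright browser.

```python
import asyncio
import time

# Hypothetical per-site delays (seconds) standing in for real page loads.
SITE_DELAYS = {"amazon": 0.3, "flipkart": 0.2, "myntra": 0.25}


async def scrape_site(site: str, query: str) -> list[dict]:
    """Stub scraper: sleeps instead of loading a page, then returns fake rows."""
    await asyncio.sleep(SITE_DELAYS[site])
    return [{"site": site, "title": f"{query} #{i}"} for i in range(1, 4)]


async def scrape_all(query: str) -> list[dict]:
    """Run every site scraper concurrently and flatten the results."""
    per_site = await asyncio.gather(
        *(scrape_site(site, query) for site in SITE_DELAYS),
        return_exceptions=True,  # one failing site should not sink the others
    )
    products: list[dict] = []
    for result in per_site:
        if isinstance(result, Exception):
            continue  # skip a site that raised
        products.extend(result)
    return products


if __name__ == "__main__":
    start = time.perf_counter()
    found = asyncio.run(scrape_all("running shoes"))
    elapsed = time.perf_counter() - start
    print(f"{len(found)} products in {elapsed:.2f}s")
```

Because the three coroutines wait at the same time, the total wall time tracks the slowest site rather than the sum of all three.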

---

## 💻 How To Run
Clone this repo:
```bash
git clone https://github.com/Aniket-16-S/product-Sraper.git
```
```bash
cd product-Sraper
```
Install dependencies:
```bash
pip install -r requirements.txt
```
```bash
playwright install
```
Run the async scraper:
```bash
python main_scraper.py
```
Enter your product keyword and let it cook 🔥 for about 20 seconds.

Note: on the first run, allow 40 to 50 seconds for initial setup.
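
Repeat searches do not have to pay the ~20 second cost again if results are cached. The sketch below illustrates the idea under several assumptions: it uses the standard-library `sqlite3` module in place of aiosqlite so it runs with no extra installs, and the table layout, the `SearchCache` class, the `normalize` helper, and the one-hour expiry are all hypothetical rather than taken from this repo.

```python
import json
import sqlite3
import time

# Hypothetical schema: one row per normalized query, results stored as JSON.
SCHEMA = """
CREATE TABLE IF NOT EXISTS search_cache (
    query      TEXT PRIMARY KEY,
    results    TEXT NOT NULL,
    fetched_at REAL NOT NULL
)
"""


def normalize(query: str) -> str:
    """Collapse case and whitespace so 'Red  Shoes' and 'red shoes' share a key."""
    return " ".join(query.lower().split())


class SearchCache:
    def __init__(self, path: str = ":memory:", max_age_s: float = 3600.0):
        self.db = sqlite3.connect(path)
        self.db.execute(SCHEMA)
        self.max_age_s = max_age_s

    def get(self, query: str):
        """Return cached results, or None if missing or older than max_age_s."""
        row = self.db.execute(
            "SELECT results, fetched_at FROM search_cache WHERE query = ?",
            (normalize(query),),
        ).fetchone()
        if row is None or time.time() - row[1] > self.max_age_s:
            return None
        return json.loads(row[0])

    def put(self, query: str, results: list) -> None:
        """Insert or overwrite the entry for this query."""
        self.db.execute(
            "INSERT OR REPLACE INTO search_cache VALUES (?, ?, ?)",
            (normalize(query), json.dumps(results), time.time()),
        )
        self.db.commit()


if __name__ == "__main__":
    cache = SearchCache()
    print(cache.get("red shoes"))  # None on a cold cache
    cache.put("red shoes", [{"site": "amazon", "title": "Red shoe"}])
    print(cache.get("  Red   SHOES "))  # same entry, thanks to normalize()
```

A scraper would call `get` first and only launch the browsers on a miss, then `put` the fresh results.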

---

## 📂 Output
#### Products from all 3 sites are shown in your terminal.

#### Images of all products are stored on disk and can be browsed in your file manager.

#### Clean, structured format that is easy to compare.

#### Great for research, trend tracking, or flexing that dev power.
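
To make "structured and easy to compare" concrete, here is one way cross-site results could be merged and printed cheapest-first. This is a sketch only: the `Product` fields and the `format_table` helper are hypothetical and may not match what the scraper actually prints.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Product:
    site: str
    title: str
    price: float  # assumed to be in a single currency


def format_table(products: list[Product]) -> str:
    """Render products cheapest-first as aligned text columns."""
    rows = sorted(products, key=lambda p: p.price)
    # Size the title column to the longest title (at least as wide as "TITLE").
    col = max((len(p.title) for p in rows), default=5) + 2
    lines = [f"{'SITE':<10}{'TITLE':<{col}}{'PRICE':>10}"]
    for p in rows:
        lines.append(f"{p.site:<10}{p.title:<{col}}{p.price:>10.2f}")
    return "\n".join(lines)


if __name__ == "__main__":
    items = [
        Product("amazon", "Running Shoe A", 1999.0),
        Product("myntra", "Running Shoe B", 1499.0),
        Product("flipkart", "Running Shoe C", 1799.0),
    ]
    print(format_table(items))
```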

---

## 🔮 Future Goals

#### 🛍️ Add support for more sites like Ajio, Meesho, Nykaa, etc.

#### 🎛️ GUI version for non-coders

---
## ⭐ Why This Could Blow Up
#### Tired of opening 10 tabs to compare products? This solves that.
#### Has advanced features like Natural Language Processing for query matching and Cache Management.
#### Get more than 100 products scraped to your terminal in 16-20 secs (assuming decent network connectivity).
#### It's async, so the sites are scraped concurrently instead of one by one.
#### With a clean UI or web version, this could go viral as a price-compare tool.
#### Dev-friendly = easy stars + forks on GitHub.
#### Great side-project flex for portfolios or hackathons.
---

## ⚠️ Legal Note
### This project is for educational use only. Respect each site's terms of service and robots.txt.

---

## 🤝 Contributions Welcome
Got new site targets or bug fixes? Open a PR or drop ideas in issues!
#### Let's build the ultimate cross-site scraper 🧃