https://github.com/aniket-16-s/product-sraper
Scraping product information from well-known websites.
- Host: GitHub
- URL: https://github.com/aniket-16-s/product-sraper
- Owner: Aniket-16-S
- Created: 2025-05-06T03:09:42.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2025-06-08T04:09:13.000Z (about 1 year ago)
- Last Synced: 2025-06-08T04:29:34.826Z (about 1 year ago)
- Topics: automation, chromedriver, ecommerce-website, selenium, selenium-python, selenium-webdriver, webscraping
- Language: Python
- Homepage:
- Size: 105 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# ⚡ Multi-Web Product Scraper (Stealth Mode | Blazing Fast | NLP for Queries | Cache Management) ⚡
### Search Amazon, Flipkart, and Myntra *all at once*, way before your coffee is ready.
### Built with async Playwright for stealth and speed.
---
## What's Cookin'
- **Multi-Site Support**: Scrapes product listings from Amazon, Flipkart, and Myntra (more coming soon).
- **`Natural Language Processing`**: Uses the all-MiniLM-L6-v2 model to understand user queries.
- **`Cache Management`**: Uses aiosqlite to cache search results and avoid repeated computation.
- **150+ Products in 18 Seconds**: Yup, tested: ~152 products loaded in one go (18-20 s).
- **`Photo Gallery`**: Product photos are saved in a dedicated folder.
- ⚡ **SuperFast Scraping**: Built with asyncio, aiohttp, and Playwright; scrapes *around 150 products within 20 seconds*.
- **Headless & Stealth Mode**: Runs in the background and mimics human behavior to dodge bot detection.
- **Why This Slaps**: Manually opening 3 sites, scrolling forever, and remembering deals? Nah. Just type what you want and get 150+ options in seconds.
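
The multi-site, asyncio-based approach listed above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `scrape_site`, `search_all`, the simulated delays, and the returned fields are all hypothetical stand-ins for real Playwright page work. It only shows why running the site scrapers concurrently with `asyncio.gather` takes about as long as the slowest site rather than the sum of all three.

```python
import asyncio
import time

# Hypothetical per-site delays (seconds) standing in for real page loads.
SIMULATED_DELAYS = {"amazon": 0.3, "flipkart": 0.2, "myntra": 0.1}


async def scrape_site(site: str, query: str) -> list[dict]:
    """Placeholder scraper: a real one would drive a Playwright page here."""
    await asyncio.sleep(SIMULATED_DELAYS[site])  # simulate network + rendering
    return [{"site": site, "title": f"{query} (sample from {site})"}]


async def search_all(query: str) -> list[dict]:
    """Run every site scraper concurrently and merge the results."""
    tasks = [scrape_site(site, query) for site in SIMULATED_DELAYS]
    # return_exceptions=True hands back a failure as a value instead of
    # raising, so one broken site does not discard the others' results.
    per_site = await asyncio.gather(*tasks, return_exceptions=True)
    products = []
    for result in per_site:
        if isinstance(result, Exception):
            continue  # skip the failed site, keep everything else
        products.extend(result)
    return products


if __name__ == "__main__":
    start = time.perf_counter()
    found = asyncio.run(search_all("running shoes"))
    elapsed = time.perf_counter() - start
    print(f"{len(found)} products in {elapsed:.2f}s")
```

With the delays above, the three simulated scrapers finish in roughly the time of the slowest one, which is the same effect the real scraper relies on for its 18-20 s runs.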
---
## How To Run
Clone this repo:
```bash
git clone https://github.com/Aniket-16-S/product-Sraper.git
```
```bash
cd product-Sraper
```
Install dependencies:
```bash
pip install -r requirements.txt
```
```bash
playwright install
```
Run the async scraper:
```bash
python main_scraper.py
```
Enter your product keyword and let it cook for about 20 seconds.

Note: Please wait 40 to 50 seconds for the initial setup on the first run.
---
## Output
#### Products from all 3 sites are shown in your terminal.
#### Images of all products are stored in a folder you can open in your file manager.
#### Clean, structured format that is easy to compare.
#### Great for research, trend tracking, or flexing that dev power.
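
As a rough idea of what "clean, structured" output could look like, here is a small sketch. The `Product` fields, the `format_table` helper, and the sample values are all made up for illustration; the project's real output format may differ.

```python
from dataclasses import dataclass


@dataclass
class Product:
    site: str
    title: str
    price: float  # assumed to be a plain number; currency handling is omitted


def format_table(products: list[Product]) -> str:
    """Render products as aligned text rows, cheapest first."""
    rows = sorted(products, key=lambda p: p.price)
    lines = [f"{'Site':<10}{'Price':>10}  Title"]
    for p in rows:
        lines.append(f"{p.site:<10}{p.price:>10.2f}  {p.title}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up sample entries, not real scraped data.
    sample = [
        Product("amazon", "Sample shoe A", 1999.0),
        Product("myntra", "Sample shoe B", 1499.0),
        Product("flipkart", "Sample shoe C", 1799.0),
    ]
    print(format_table(sample))
```

Sorting by price before printing is one simple way to make results from different sites directly comparable in the terminal.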
---
## Future Goals
#### Add support for more sites like Ajio, Meesho, Nykaa, etc.
#### GUI version for non-coders
---
## ⭐ Why This Could Blow Up
#### Tired of opening 10 tabs to compare products? This solves that.
#### Has advanced features like Natural Language Processing for query matching and Cache Management.
#### Get more than 100 products scraped to your terminal in 16-20 seconds (assuming decent network connectivity).
#### It's async, so the sites are scraped concurrently instead of one after another.
#### With a clean UI or web version, this could go viral as a price-compare tool.
#### Dev-friendly = easy stars + forks on GitHub.
#### Great side-project flex for portfolios or hackathons.
---
## ⚠️ Legal Note
### This project is for educational use only. Respect each site's terms of service and robots.txt.
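
One way to check a URL against a site's robots.txt is Python's standard-library `urllib.robotparser`. The sketch below is only an illustration and is not part of this project: the `is_allowed` helper, the `my-scraper` user agent, and the inlined robots.txt rules are hypothetical, and the rules are hard-coded so the example runs offline.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, inlined so the sketch runs offline.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /checkout/
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse rules from memory, no network
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "https://example.com/search/shoes",
        "https://example.com/checkout/cart",
    ):
        print(url, "->", is_allowed(SAMPLE_ROBOTS_TXT, "my-scraper", url))
```

For a live site you would instead point the parser at the real file with `set_url(...)` and `read()`. Passing a robots.txt check does not replace reading the site's terms of service.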
---
## Contributions Welcome
Got new site targets or bug fixes? Open a PR or drop ideas in issues!
#### Let's build the ultimate cross-site scraper