https://github.com/oussemabenhassena5/laptop-scraper

🕸️ Advanced web scraper for extracting comprehensive laptop product information from TunisiaNet using Python, Selenium, and multi-format data export.

# 🚀 TunisiaNet Laptop Scraper

## 📊 Advanced Web Scraping Solution for Tech Products

![Python](https://img.shields.io/badge/Python-3.8+-blue.svg)
![Selenium](https://img.shields.io/badge/Selenium-Powered-green.svg)
![Web Scraping](https://img.shields.io/badge/Web-Scraping-orange.svg)

## 🌟 Project Overview

`TunisiaNet Laptop Scraper` is an advanced web scraping tool designed to extract comprehensive product information from TunisiaNet's laptop catalog. This powerful Python script leverages Selenium to navigate, extract, and transform web data into multiple, easily consumable formats.

## ✨ Key Features

- πŸ” Comprehensive Product Scraping
- Extract detailed laptop information
- Navigate through multiple product pages
- Handle dynamic web content

- πŸ“¦ Multiple Output Formats
- JSON
- CSV
- Excel
- Markdown Report
- SQLite Database

- πŸ“ˆ Advanced Data Visualization
- Price distribution plot
- Detailed statistical analysis

- πŸ›‘οΈ Robust Error Handling
- Comprehensive logging
- Flexible data extraction
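
The output formats listed above can all be written from one list of product dictionaries. The following is a minimal sketch using only the standard library, not the code in `scraper.py`: the `export_products` helper, the `products.db` file name, and the `products` table name are assumptions for illustration. Excel export is left out because it needs a third-party package.

```python
import csv
import json
import sqlite3
from pathlib import Path

# Field names taken from the JSON sample in this README
FIELDS = ["title", "reference", "description", "price", "availability", "img_url"]


def export_products(products, out_dir="results"):
    """Write product dicts to JSON, CSV, and SQLite files (hypothetical helper)."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    # JSON: keep accented characters readable instead of escaping them
    with open(out / "products.json", "w", encoding="utf-8") as f:
        json.dump(products, f, ensure_ascii=False, indent=2)

    # CSV: one row per product, missing fields become empty strings
    with open(out / "products.csv", "w", encoding="utf-8", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS, restval="", extrasaction="ignore")
        writer.writeheader()
        writer.writerows(products)

    # SQLite: the file name and table name are assumptions
    columns = ", ".join(name + " TEXT" for name in FIELDS)
    placeholders = ", ".join("?" for _ in FIELDS)
    conn = sqlite3.connect(out / "products.db")
    try:
        with conn:  # commits when the block succeeds
            conn.execute("DROP TABLE IF EXISTS products")
            conn.execute(f"CREATE TABLE products ({columns})")
            conn.executemany(
                f"INSERT INTO products VALUES ({placeholders})",
                [tuple(p.get(name, "") for name in FIELDS) for p in products],
            )
    finally:
        conn.close()
    return out
```

Storing every column as `TEXT` keeps the sketch simple; a real schema would likely store the price as a number.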

## 🛠 Prerequisites

- Python 3.8+
- Chrome Browser
- Chrome WebDriver

## 🚀 Quick Setup

1. Clone the Repository
```bash
git clone https://github.com/oussemabenhassena5/Laptop-Scraper.git
cd Laptop-Scraper
```

2. Create Virtual Environment
```bash
python3 -m venv venv
source venv/bin/activate # On Windows, use `venv\Scripts\activate`
```

3. Install Dependencies
```bash
pip install -r requirements.txt
```

## 🖥️ Usage

Run the scraper:
```bash
python scraper.py
```
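
As a rough illustration of how Selenium can handle dynamically loaded product listings, the sketch below opens one page in headless Chrome, waits for the product cards to appear, and reads two fields from each. It is not taken from `scraper.py`: the `scrape_listing` function and the CSS selectors (`.product-card`, `.product-title`, `.product-price`) are placeholders and would need to be replaced with selectors matching the real page markup.

```python
def scrape_listing(url, timeout=10):
    """Open one listing page in headless Chrome and return a list of product dicts.

    The CSS selectors below are placeholders, not TunisiaNet's real markup.
    """
    # Imported here so this module still loads where Selenium is not installed
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = webdriver.ChromeOptions()
    options.add_argument("--headless=new")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        # Explicit wait: block until the product cards are present in the DOM
        cards = WebDriverWait(driver, timeout).until(
            EC.presence_of_all_elements_located((By.CSS_SELECTOR, ".product-card"))
        )
        return [
            {
                "title": card.find_element(By.CSS_SELECTOR, ".product-title").text.strip(),
                "price": card.find_element(By.CSS_SELECTOR, ".product-price").text.strip(),
            }
            for card in cards
        ]
    finally:
        driver.quit()  # always release the browser, even after an error
```

An explicit wait is used instead of a fixed `time.sleep` so the function returns as soon as the content is available and raises a timeout error if it never appears.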

## 📂 Project Structure
```
Laptop-Scraper/
│
├── scraper.py              # Main scraping script
├── requirements.txt        # Project dependencies
├── results/                # Output directory
│   ├── products.json
│   ├── products.csv
│   ├── products.xlsx
│   ├── products_report.md
│   └── price_distribution.png
│
└── logs/                   # Logging directory
    └── tunisianet_scraper_TIMESTAMP.log
```
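
The `tunisianet_scraper_TIMESTAMP.log` name suggests one log file per run. A minimal sketch of such a setup with the standard `logging` module follows; the `setup_logger` helper, the logger name, and the `%Y%m%d_%H%M%S` timestamp format are assumptions, not details confirmed by the project.

```python
import logging
from datetime import datetime
from pathlib import Path


def setup_logger(log_dir="logs"):
    """Create a logger that writes to a timestamped file and to the console."""
    Path(log_dir).mkdir(parents=True, exist_ok=True)
    timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")  # assumed format
    log_path = Path(log_dir) / f"tunisianet_scraper_{timestamp}.log"

    logger = logging.getLogger("tunisianet_scraper")
    logger.setLevel(logging.INFO)
    logger.handlers.clear()  # avoid duplicate output if called more than once

    formatter = logging.Formatter("%(asctime)s - %(levelname)s - %(message)s")
    for handler in (logging.FileHandler(log_path, encoding="utf-8"), logging.StreamHandler()):
        handler.setFormatter(formatter)
        logger.addHandler(handler)
    return logger, log_path
```

A call such as `logger, log_path = setup_logger()` followed by `logger.info("Scrape started")` would then write the message to both the console and the new file under `logs/`.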

## 🎯 Output Examples

### πŸ“Š JSON Sample
```json
{
"title": "Pc Portable HP 15-Fd0051nk / I3-N305 / 32 Go / 512 Go SSD / Gold",
"reference": "[A2AN9EA-32]",
"description": "Γ‰cran Full HD 15.6\" (1920 x 1080), antireflet - Processeur Intel Core i3-N305, (jusqu’à 3.8 GHz, 6 Mo de mΓ©moire cache) - MΓ©moire 32 Go DDR4 - Disque SSD NVMe M.2 512 Go - Carte graphique Intel UHD IntΓ©grΓ© - Wi-Fi 6 - Bluetooth 5.3 - Clavier complet gris clair avec pavΓ© numΓ©rique - CamΓ©ra HP True Vision HD 720p - Doubles haut-parleurs - 1x USB-C - 2x USB-A - 1x HDMI 1.4b - 1x prise combinΓ©e casque/microphone - FreeDOS - Couleur Gold - Garantie 1 an",
"price": "1 305,000 DT",
"availability": "En stock",
"img_url": "https://www.tunisianet.com.tn/401810-home/pc-portable-dell-vostro-3530-i3-1305u-24-go-512-go-ssd-noir.jpg"
}
```
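
The `price` field in the sample is a display string (`"1 305,000 DT"`), so it has to be converted to a number before any statistics or plots can be computed. The sketch below shows one way to do that; the `parse_price` helper is hypothetical, and it assumes that spaces group thousands and that the comma is the decimal separator.

```python
import re
from decimal import Decimal


def parse_price(raw):
    """Convert a price string such as '1 305,000 DT' to a Decimal amount.

    Assumes whitespace (including non-breaking spaces) groups thousands and
    the comma is the decimal separator. Returns None when no number is found.
    """
    cleaned = re.sub(r"\s+", "", raw)  # remove all whitespace
    match = re.search(r"\d+(?:,\d+)?", cleaned)
    if match is None:
        return None
    return Decimal(match.group().replace(",", "."))


print(parse_price("1 305,000 DT"))  # 1305.000
```

`Decimal` is used rather than `float` so that the three decimal places of the amount are preserved exactly.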

### 📈 Price Distribution Visualization
![Price Distribution](results/price_distribution.png)

### 📄 Markdown Report Snapshot
- **Total Products:** 714
- **Price Analysis:**
  - Minimum Price: 500 DT
  - Maximum Price: 13000 DT
  - Average Price: 1500 DT
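
A summary in this shape can be derived from a list of numeric prices. The following is a small sketch of that computation; the `build_report` helper is hypothetical, and the three prices passed to it are made-up illustration values, not scraped data.

```python
from statistics import mean


def build_report(prices):
    """Render a Markdown price summary for a non-empty list of prices in DT."""
    lines = [
        f"- **Total Products:** {len(prices)}",
        "- **Price Analysis:**",
        f"  - Minimum Price: {min(prices):.0f} DT",
        f"  - Maximum Price: {max(prices):.0f} DT",
        f"  - Average Price: {mean(prices):.0f} DT",
    ]
    return "\n".join(lines)


print(build_report([500, 1000, 3000]))
```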

## 🀝 Contributing

1. Fork the repository
2. Create your feature branch
3. Commit your changes
4. Push to the branch
5. Create a Pull Request

## ⚠️ Disclaimer

This tool is for educational purposes. Always respect website terms of service and robots.txt.

## 📜 License

MIT License

---

**Happy Scraping! 🕷️📊**