https://github.com/nikhleshshukla123/web-scraping-using-python
Scrapes multiple pages of Amazon search results using Python.
- Host: GitHub
- URL: https://github.com/nikhleshshukla123/web-scraping-using-python
- Owner: Nikhleshshukla123
- Created: 2025-10-03T13:50:10.000Z (4 months ago)
- Default Branch: main
- Last Pushed: 2025-10-03T14:09:17.000Z (4 months ago)
- Last Synced: 2025-10-03T16:10:05.041Z (4 months ago)
- Topics: beautifulsoup4, numpy, pandas, python
- Language: Jupyter Notebook
- Homepage:
- Size: 13.7 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
- Metadata Files:
  - Readme: README.md
README
# Amazon Product Scraper
## Project Overview
This project is a Python-based web scraper that collects product information from Amazon search results. Using **Requests** and **BeautifulSoup**, it extracts details such as product title, price, rating, number of reviews, and availability status, and saves the data into a CSV file for further analysis.
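As a rough illustration of that workflow, the sketch below fetches a single search-results page and extracts the fields named above. The `User-Agent` header and the CSS selectors are assumptions for illustration; Amazon's real markup changes frequently, so the notebook's actual selectors may differ.

```python
import requests
from bs4 import BeautifulSoup

# Browser-like headers are an assumption: Amazon usually rejects the
# default requests User-Agent string.
HEADERS = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"}

def scrape_page(url):
    """Fetch one Amazon search-results page and extract product fields."""
    response = requests.get(url, headers=HEADERS, timeout=10)
    response.raise_for_status()
    soup = BeautifulSoup(response.text, "html.parser")

    products = []
    # Illustrative selectors only; Amazon's real class names change over time.
    for card in soup.select("div.s-result-item"):
        title = card.select_one("h2 span")
        price = card.select_one("span.a-price span.a-offscreen")
        rating = card.select_one("span.a-icon-alt")
        products.append({
            "title": title.get_text(strip=True) if title else None,
            "price": price.get_text(strip=True) if price else None,
            "rating": rating.get_text(strip=True) if rating else None,
        })
    return products
```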
---
## Features
- Scrapes multiple pages of Amazon search results.
- Extracts product information:
- Product Title
- Price
- Rating
- Number of Reviews
- Availability
- Saves the collected data to a CSV file.
- Includes **error handling** and polite delays to avoid being blocked by Amazon (see the sketch below).
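A minimal sketch of that retry-and-delay pattern, using only `time` and `random` from the standard library; the retry count and delay window are illustrative choices, not values taken from the notebook.

```python
import random
import time

import requests

def polite_get(url, headers=None, retries=3):
    """GET a URL, retrying on failure with a randomized pause in between."""
    for attempt in range(retries):
        try:
            response = requests.get(url, headers=headers, timeout=10)
            response.raise_for_status()
            return response
        except requests.RequestException as exc:
            print(f"Attempt {attempt + 1} failed: {exc}")
        # Sleep 2-5 seconds so requests do not arrive at a fixed rhythm.
        time.sleep(random.uniform(2, 5))
    return None
```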
---
## Technologies Used
- Python 3.x
- Libraries:
- `requests` – for making HTTP requests
- `BeautifulSoup` – for parsing HTML
- `pandas` – for data storage and manipulation
  - `numpy` – for handling missing values (see the example after this list)
- `time` and `random` – for delays between requests
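As a concrete example of the `numpy` point above, fields that could not be scraped are typically stored as `np.nan` rather than empty strings, so pandas recognizes them as missing. This snippet is illustrative, not code from the notebook.

```python
import numpy as np
import pandas as pd

# A product whose price was not found gets np.nan instead of "",
# so pandas' isna()/fillna()/dropna() handle it correctly.
df = pd.DataFrame([
    {"title": "Example product", "price": np.nan},
    {"title": "Another product", "price": "$499.99"},
])
print(df["price"].isna())  # True for the missing price, False otherwise
```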
---
## Project Structure
```
amazon_scraper/
│
├── amazon_scraper.ipynb   # Jupyter notebook with the full scraping workflow
├── amazon_data.csv        # CSV output file with scraped data
├── README.md              # Project documentation
└── requirements.txt       # Required Python libraries
```
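The contents of `requirements.txt` are not reproduced in this README; judging from the Technologies section, it would list something like the following (`time` and `random` ship with Python and need no entry):

```text
requests
beautifulsoup4
pandas
numpy
```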
---
## Usage
1. Clone the repository, or open the notebook on Kaggle or in a local Jupyter environment.
2. Install the required libraries:

   ```bash
   pip install -r requirements.txt
   ```
3. Open the notebook and run the cells in order.
4. Scraped product data will be saved as:
   - `/kaggle/working/amazon_data.csv` (Kaggle)
   - `amazon_data.csv` (local)
5. You can modify the search term and the number of pages to scrape (a sketch of how these settings drive the loop follows this list):

   ```python
   BASE_URL = "https://www.amazon.com/s?k=playstation+5&crid=3G12O79UMR7B1&sprefix=playstation+5%2Caps%2C414&ref=nb_sb_noss_1"
   TOTAL_PAGES = 5
   ```
6. Output example: the CSV contains one row per product, with the fields listed under Features (title, price, rating, number of reviews, availability); an illustrative header row follows below.
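A plausible way the two settings from step 5 drive the multi-page loop and the final CSV write. Here `scrape_page` is the hypothetical parser sketched in the Project Overview, and the `&page=` query parameter is Amazon's usual pagination mechanism; the notebook's exact loop may differ.

```python
import pandas as pd

# Hypothetical driver loop: scrape_page() is the sketch from the
# Project Overview section, not necessarily the notebook's function.
rows = []
for page in range(1, TOTAL_PAGES + 1):
    rows.extend(scrape_page(f"{BASE_URL}&page={page}"))

# Save everything to the CSV path mentioned in step 4.
pd.DataFrame(rows).to_csv("amazon_data.csv", index=False)
```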
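Assuming the column names mirror the fields listed under Features (the notebook's exact names are not shown here), the first line of `amazon_data.csv` would look roughly like:

```text
title,price,rating,reviews,availability
```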