Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/kirolos00daniel/amazon-web-scraping
amazon jupiter-notebook python webscraping
Last synced: about 1 month ago
JSON representation
- Host: GitHub
- URL: https://github.com/kirolos00daniel/amazon-web-scraping
- Owner: Kirolos00Daniel
- License: mit
- Created: 2024-11-09T13:01:37.000Z (2 months ago)
- Default Branch: main
- Last Pushed: 2024-11-09T13:30:10.000Z (2 months ago)
- Last Synced: 2024-11-09T14:20:06.733Z (2 months ago)
- Topics: amazon, jupiter-notebook, python, webscraping
- Language: Jupyter Notebook
- Homepage:
- Size: 869 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Amazon Web Scraping Project
## Overview
This project involves web scraping Amazon's product listings for "PlayStation 4" to collect data on various listings, including product names, prices, ratings, and availability. The extracted data is stored in a CSV file for further analysis and research purposes.

## Table of Contents
- [Overview](#overview)
- [Getting Started](#getting-started)
- [Prerequisites](#prerequisites)
- [Installation](#installation)
- [Usage](#usage)
- [Project Structure](#project-structure)
- [Data Extraction](#data-extraction)
- [Visualizations](#visualizations)
- [Contributing](#contributing)
- [License](#license)
- [Contact](#contact)

## Getting Started
These instructions will guide you through setting up the project and running the code on your local machine.

## Prerequisites
- **Python 3.7+**
- **Jupyter Notebook** or any compatible IDE to run `.ipynb` files
- **Libraries**: `requests`, `beautifulsoup4` (BeautifulSoup), `pandas`
- Install these libraries using:
```bash
pip install requests beautifulsoup4 pandas
```

## Installation
1. **Clone the repository**:
```bash
git clone https://github.com/Kirolos00Daniel/amazon-web-scraping.git
```
2. **Navigate to the project directory**:
```bash
cd amazon-web-scraping
```
3. **Open the Jupyter Notebook**:
   - Use Jupyter Notebook to open the `Amazon Web Scraping.ipynb` or `Amazon Web Scraping Sample.ipynb` file.

## Usage
1. **Run the Notebook**:
- Execute the cells in the notebook to scrape the Amazon product page and extract data for "PlayStation 4" listings.
2. **Data Extraction**:
   - The scraped data will be saved in a CSV file named `amazon_data.csv`, which includes columns like product name, price, rating, and availability.

## Project Structure
- `Amazon Web Scraping.ipynb`: Main Jupyter Notebook for scraping data.
- `Amazon Web Scraping Sample.ipynb`: Sample notebook demonstrating the scraping process.
- `amazon_data.csv`: Output CSV file containing the scraped data.
- `image.png`: Screenshot of the Amazon search results page.
- ![Screenshot 2024-11-09 150014](https://github.com/user-attachments/assets/9a9030ea-edb6-44f2-9091-b329db8e93a1)

## Data Extraction
- **Data Fields**: The following fields are extracted for each product:
- **Product Name**
- **Price**
- **Rating**
  - **Availability**
- **CSV Output**: The extracted data is stored in `amazon_data.csv`, making it easy to analyze and visualize the information.
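The extraction-to-CSV step above can be sketched in plain Python. This is a minimal illustration, not the notebook's actual code: it uses only the standard library (`html.parser` and `csv`) instead of BeautifulSoup, and the sample HTML and class names (`product`, `name`, `price`, `rating`, `availability`) are invented for the example; Amazon's real markup differs and changes over time.

```python
import csv
from html.parser import HTMLParser

# Illustrative markup only -- the real Amazon page structure is different.
SAMPLE_HTML = """
<div class="product"><span class="name">PlayStation 4 Slim 1TB</span>
<span class="price">$299.99</span><span class="rating">4.7</span>
<span class="availability">In Stock</span></div>
<div class="product"><span class="name">PlayStation 4 Pro</span>
<span class="price">$399.99</span><span class="rating">4.6</span>
<span class="availability">In Stock</span></div>
"""

class ProductParser(HTMLParser):
    """Collects name/price/rating/availability spans into row dicts."""
    FIELDS = ("name", "price", "rating", "availability")

    def __init__(self):
        super().__init__()
        self.rows, self.current, self.field = [], {}, None

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class", "")
        if tag == "div" and cls == "product":
            self.current = {}          # start a new product row
        elif tag == "span" and cls in self.FIELDS:
            self.field = cls           # remember which field this span holds

    def handle_data(self, data):
        if self.field:
            self.current[self.field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self.field = None
        elif tag == "div" and self.current:
            self.rows.append(self.current)
            self.current = {}

def scrape_to_csv(html, path="amazon_data.csv"):
    """Parse product listings out of html and write them to a CSV file."""
    parser = ProductParser()
    parser.feed(html)
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=ProductParser.FIELDS)
        writer.writeheader()
        writer.writerows(parser.rows)
    return parser.rows

rows = scrape_to_csv(SAMPLE_HTML)
```

With BeautifulSoup (as the prerequisites suggest the notebook uses), the parser class collapses to a few `find_all` calls; the CSV-writing step stays the same.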
## Visualizations
The following visualizations were created to provide insights into the data:

### Distribution of Product Ratings
![output](https://github.com/user-attachments/assets/a640d24c-4a74-4ec2-af28-1ed0f744e593)
### Distribution of Product Prices
![output](https://github.com/user-attachments/assets/429c3d6b-e254-4802-9928-c8934d1a2294)
### Scatter Plot of Price vs Rating
![output](https://github.com/user-attachments/assets/d9a1dd5b-2449-481f-97cf-e19d9f58c217)
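The charts above presumably come from a plotting library such as matplotlib (the notebook's plotting code is not shown here). As a rough, dependency-free sketch of the first chart, the rating distribution can be tallied with `collections.Counter`; the ratings below are illustrative values, not the real scraped data:

```python
from collections import Counter

# Illustrative ratings -- the real values would be read from amazon_data.csv.
ratings = [4.7, 4.6, 4.7, 4.5, 4.6, 4.7, 4.3]

distribution = Counter(ratings)
for rating, count in sorted(distribution.items()):
    print(f"{rating}: {'#' * count}")  # simple text histogram
```

The same `Counter` (or a `pandas` `value_counts()` on the CSV column) feeds directly into a bar chart for the rating and price distributions shown above.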
## Contributing
1. **Fork the repository**
2. **Create a feature branch** (`git checkout -b feature/YourFeature`)
3. **Commit your changes** (`git commit -m 'Add feature'`)
4. **Push to the branch** (`git push origin feature/YourFeature`)
5. **Create a Pull Request**

## License
This project is licensed under the MIT License.

## Contact
For questions or support, please reach out to:
- **Name**: Kirolos Daniel
- **Email**: [email protected]