Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/thekartikeyamishra/book-scraper
A Python-based project that scrapes book details from the website Books to Scrape using Beautiful Soup and pandas. This project extracts book titles, prices, and availability statuses from the website and exports the data to a CSV file for further analysis or usage.
https://github.com/thekartikeyamishra/book-scraper
beautifulsoup beautifulsoup4 pandas python requests webscraper
Last synced: 29 days ago
JSON representation
A Python-based project that scrapes book details from the website Books to Scrape using Beautiful Soup and pandas. This project extracts book titles, prices, and availability statuses from the website and exports the data to a CSV file for further analysis or usage.
- Host: GitHub
- URL: https://github.com/thekartikeyamishra/book-scraper
- Owner: thekartikeyamishra
- Created: 2025-01-03T16:24:42.000Z (29 days ago)
- Default Branch: main
- Last Pushed: 2025-01-03T16:27:35.000Z (29 days ago)
- Last Synced: 2025-01-03T17:27:22.132Z (29 days ago)
- Topics: beautifulsoup, beautifulsoup4, pandas, python, requests, webscraper
- Language: Python
- Homepage:
- Size: 4.88 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# Web Scraper Project
A Python-based project that scrapes book details from the website [Books to Scrape](http://books.toscrape.com/) using Beautiful Soup and pandas. This project extracts book titles, prices, and availability statuses from the website and exports the data to a CSV file for further analysis or usage.
## Features
- Scrape book titles, prices, and availability status from a webpage.
- Use Beautiful Soup for parsing HTML content.
- Export the scraped data to a CSV file.
- Simple and efficient scraping workflow.## Requirements
- Python 3.7 or higher
- Libraries:
- `requests`
- `beautifulsoup4`
- `pandas`## Installation
1. Clone the repository:
```bash
git clone https://github.com/thekartikeyamishra/book-scraper.git
cd book-scraper
```2. Install the required dependencies:
```bash
pip install beautifulsoup4 pandas
```## Usage
1. Run the script in your Python environment (Google Colab is recommended for easy file downloads).
2. The script will scrape the following details from the [Books to Scrape](http://books.toscrape.com/) website:
- Book Title
- Price
- Availability3. The scraped data will be displayed as a DataFrame and saved as a CSV file named `books_data.csv`.
4. The CSV file can be downloaded directly to your local machine if using Google Colab.
## Example Output
**DataFrame Output:**
| Title | Price | Availability |
|--------------------------------------------|--------|----------------|
| A Light in the Attic | £51.77 | In stock |
| Tipping the Velvet | £53.74 | In stock |
| Soumission | £50.10 | In stock |**CSV File:**
The above data will be saved in a CSV file named `books_data.csv`.## Troubleshooting
- **Requests Error**: Ensure that the website is accessible and that you have an active internet connection.
- **Empty Data**: Check if the website structure has changed, which may require updating the HTML parsing logic.
- **Dependency Issues**: Ensure all required libraries are installed using `pip install`.## Contributing
Contributions are welcome! Feel free to fork the repository and submit a pull request for improvements or new features.
## Contact
For questions or feedback, please contact:
- **Author**: Kartikeya Mishra
- **GitHub**: [thekartikeyamishra](https://github.com/thekartikeyamishra)