https://github.com/itsrummmy/books-to-scrape-webscraping-project

Fully scrape website acquiring specific bits of information.

# Webscraping Project: Books To Scrape

Table of Contents

1. Project Overview
2. Tools Used
3. Deliverables
4. Outcomes
5. Conclusion

## Project Overview

The goal of this project is to develop a Python-based web scraping solution that extracts comprehensive data for every book listed on the target website, [Books to Scrape](https://books.toscrape.com/index.html).

## Tools Used

* **Python**: For scripting the scraper and processing the extracted data.
* **Libraries**:
  * `pandas`
  * `beautifulsoup4`
  * `requests`

## Deliverables
1) Automated data retrieval
2) A web scraping solution that extracts the following data for every book on the website and stores it in a single CSV file:
- `Book Name`
- `Book URL`
- `Stock Availability`
- `UPC`
- `Tax`
- `Number of Reviews`
- `Category Name`
- `Category Link`
- `Book Description`
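
As an illustration of how the fields above could be pulled from a single book page, here is a minimal sketch using Beautiful Soup. It is not the project's actual code: the function name `parse_book_page` is hypothetical, and the CSS selectors (`table tr`, `ul.breadcrumb`, `div.product_main`, `#product_description`) are assumptions about how the site's pages are marked up, so they may need adjusting.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup


def parse_book_page(html: str, page_url: str) -> dict:
    """Extract the deliverable fields from the HTML of one book detail page."""
    soup = BeautifulSoup(html, "html.parser")

    # Assumed markup: a product table whose rows pair a <th> label with a <td> value.
    info = {
        row.th.get_text(strip=True): row.td.get_text(strip=True)
        for row in soup.select("table tr")
        if row.th is not None and row.td is not None
    }

    # Assumed breadcrumb order: Home > Books > Category > Title (the title is not a link),
    # so the last link in the breadcrumb is the category.
    crumbs = soup.select("ul.breadcrumb li a")
    category = crumbs[-1] if crumbs else None

    title = soup.select_one("div.product_main h1")
    # Assumed: the description is the first <p> sibling after #product_description.
    description = soup.select_one("#product_description ~ p")

    return {
        "Book Name": title.get_text(strip=True) if title else None,
        "Book URL": page_url,
        "Stock Availability": info.get("Availability"),
        "UPC": info.get("UPC"),
        "Tax": info.get("Tax"),
        "Number of Reviews": info.get("Number of reviews"),
        "Category Name": category.get_text(strip=True) if category else None,
        # Resolve the relative href against the page URL to get an absolute link.
        "Category Link": urljoin(page_url, category["href"]) if category else None,
        "Book Description": description.get_text(strip=True) if description else None,
    }
```

Missing elements come back as `None` rather than raising, so one page with an empty description does not stop the whole run.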

## Outcomes

The scraper extracted and compiled a comprehensive dataset of all 1,000 books available on the target website, including key attributes such as name, price, availability, category, and detailed description. The data was saved to a single CSV file.
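
One way to reach every book and write the result to CSV is to follow the "next" link from listing page to listing page, collecting book URLs along the way. The sketch below is an assumption about how that could be done, not the project's actual code: the helper names (`fetch`, `parse_listing`, `collect_book_urls`, `save_rows`) are hypothetical, and the selectors `article.product_pod h3 a` and `li.next a` are assumed to match the site's listing pages.

```python
from urllib.parse import urljoin

import pandas as pd
import requests
from bs4 import BeautifulSoup

START_URL = "https://books.toscrape.com/index.html"


def fetch(url: str) -> str:
    """Download one page and return its HTML, raising on HTTP errors."""
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    # Assumption: the pages are UTF-8; set it explicitly so symbols decode correctly.
    response.encoding = "utf-8"
    return response.text


def parse_listing(html: str, page_url: str):
    """Return (absolute book URLs on this page, absolute next-page URL or None)."""
    soup = BeautifulSoup(html, "html.parser")
    book_urls = [
        urljoin(page_url, link["href"])
        for link in soup.select("article.product_pod h3 a")
    ]
    next_link = soup.select_one("li.next a")
    next_url = urljoin(page_url, next_link["href"]) if next_link else None
    return book_urls, next_url


def collect_book_urls(start_url: str, get_html=fetch) -> list:
    """Walk the listing pages from start_url until there is no next link."""
    urls, page_url = [], start_url
    while page_url:
        found, page_url = parse_listing(get_html(page_url), page_url)
        urls.extend(found)
    return urls


def save_rows(rows: list, path: str) -> None:
    """Write a list of per-book dicts to a single CSV file, one row per book."""
    pd.DataFrame(rows).to_csv(path, index=False)
```

A run would call `collect_book_urls(START_URL)`, build one dict of fields per book URL, and pass the list of dicts to `save_rows`. The `get_html` parameter exists so the crawl logic can be exercised with canned HTML instead of live requests.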

## Conclusion

This project demonstrated that comprehensive data can be scraped and extracted from a target website using Python and the Beautiful Soup library.
