Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/dms-codes/get_book_data_gramediacom

Web Scraping Book Data This Python script is used for web scraping book data from a specified URL and saving it to a CSV file.
https://github.com/dms-codes/get_book_data_gramediacom

gramedia python webscraper

Last synced: 2 days ago
JSON representation

Web Scraping Book Data This Python script is used for web scraping book data from a specified URL and saving it to a CSV file.

Host: GitHub
URL: https://github.com/dms-codes/get_book_data_gramediacom
Owner: dms-codes
Created: 2022-10-01T02:03:05.000Z (over 2 years ago)
Default Branch: main
Last Pushed: 2023-10-01T10:24:18.000Z (over 1 year ago)
Last Synced: 2023-10-01T11:39:22.467Z (over 1 year ago)
Topics: gramedia, python, webscraper
Language: Python
Homepage: https://github.com/dms-codes/get_book_data_gramediacom
Size: 4.88 KB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

Awesome Lists containing this project

README

# Web Scraping Book Data

This Python script is used for web scraping book data from a specified URL and saving it to a CSV file.

## Usage

1. Ensure you have the required libraries installed. You can install them using pip:

```bash
pip install selenium pandas
```

2. Configure the script with the necessary details:

- `FILENAME`: Name of the CSV file to store the scraped data.
- `COLUMNS`: Column names for the CSV file.
- `START_ROW_NUM`: The starting row number to begin data retrieval.
- `HEADER`: HTTP request headers for the web scraping requests.
- `url`: The URL of the book product page you want to scrape.

3. Run the script:

```bash
python script.py
```

4. The script will open a web browser, navigate to the specified URL, and scrape the book data, including title, author, URL, description, details, price, discount, discounted price, and image URL.

5. The scraped data will be saved to a CSV file named `data.csv` (you can change the file name in the script).

6. The script will handle multiple URLs by reading them from a CSV file and scraping data for each URL.

## Requirements

- Python 3.x
- Selenium
- Pandas

## License

This script is provided under the [MIT License](LICENSE).
```

You can customize the script and README.md file as needed, and use it to scrape book data from different URLs and store it in a CSV file.