Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/dms-codes/get_book_data_gramediacom
Web Scraping Book Data This Python script is used for web scraping book data from a specified URL and saving it to a CSV file.
https://github.com/dms-codes/get_book_data_gramediacom
gramedia python webscraper
Last synced: 2 days ago
JSON representation
Web Scraping Book Data This Python script is used for web scraping book data from a specified URL and saving it to a CSV file.
- Host: GitHub
- URL: https://github.com/dms-codes/get_book_data_gramediacom
- Owner: dms-codes
- Created: 2022-10-01T02:03:05.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2023-10-01T10:24:18.000Z (over 1 year ago)
- Last Synced: 2023-10-01T11:39:22.467Z (over 1 year ago)
- Topics: gramedia, python, webscraper
- Language: Python
- Homepage: https://github.com/dms-codes/get_book_data_gramediacom
- Size: 4.88 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# Web Scraping Book Data
This Python script is used for web scraping book data from a specified URL and saving it to a CSV file.
## Usage
1. Ensure you have the required libraries installed. You can install them using pip:
```bash
pip install selenium pandas
```2. Configure the script with the necessary details:
- `FILENAME`: Name of the CSV file to store the scraped data.
- `COLUMNS`: Column names for the CSV file.
- `START_ROW_NUM`: The starting row number to begin data retrieval.
- `HEADER`: HTTP request headers for the web scraping requests.
- `url`: The URL of the book product page you want to scrape.3. Run the script:
```bash
python script.py
```4. The script will open a web browser, navigate to the specified URL, and scrape the book data, including title, author, URL, description, details, price, discount, discounted price, and image URL.
5. The scraped data will be saved to a CSV file named `data.csv` (you can change the file name in the script).
6. The script will handle multiple URLs by reading them from a CSV file and scraping data for each URL.
## Requirements
- Python 3.x
- Selenium
- Pandas## License
This script is provided under the [MIT License](LICENSE).
```You can customize the script and README.md file as needed, and use it to scrape book data from different URLs and store it in a CSV file.