https://github.com/dms-codes/scrape_tripsantai
Trip Santai Tour Data Scraper This Python script is a web scraper designed to extract and collect information about tours from the Trip Santai website. It utilizes the requests library to fetch web pages, BeautifulSoup for parsing HTML, and writes the collected data to a CSV file.
- Host: GitHub
- URL: https://github.com/dms-codes/scrape_tripsantai
- Owner: dms-codes
- Created: 2023-10-12T03:49:34.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2023-10-12T03:55:46.000Z (over 1 year ago)
- Last Synced: 2023-10-12T20:46:32.740Z (over 1 year ago)
- Topics: beautifulsoup4, data, python, requests, scraper, webscraper
- Language: Python
- Homepage: https://github.com/dms-codes/scrape_tripsantai
- Size: 20.5 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Trip Santai Tour Data Scraper
This Python script is a web scraper designed to extract and collect information about tours from the [Trip Santai](https://www.tripsantai.com/) website. It utilizes the `requests` library to fetch web pages, `BeautifulSoup` for parsing HTML, and writes the collected data to a CSV file.
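As a rough illustration of the fetch-and-parse flow described above, the following is a minimal sketch and not the repository's actual code. The CSS selectors, the `TIMEOUT` value, and the function names `fetch_html`, `clean_text`, and `parse_tour` are all assumptions made for this example; the real page structure of the Trip Santai site may differ.

```python
import requests
from bs4 import BeautifulSoup

# Assumed values for illustration; the repository defines its own
# BASE_URL and TIMEOUT in constants.py.
BASE_URL = "https://www.tripsantai.com/"
TIMEOUT = 10  # seconds


def fetch_html(url: str, timeout: float = TIMEOUT) -> str:
    """Download a page and return its HTML, raising on HTTP errors."""
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()
    return response.text


def clean_text(element) -> str:
    """Return an element's text with whitespace collapsed, or '' if missing."""
    if element is None:
        return ""
    return " ".join(element.get_text().split())


def parse_tour(html: str) -> dict:
    """Extract a few fields from one tour page (selectors are hypothetical)."""
    soup = BeautifulSoup(html, "html.parser")
    return {
        "name": clean_text(soup.select_one("h1.tour-title")),
        "duration": clean_text(soup.select_one(".tour-duration")),
        "price": clean_text(soup.select_one(".tour-price")),
    }
```

Under these assumptions, `parse_tour(fetch_html(BASE_URL))` would return a dictionary for one page. Returning an empty string for a missing element keeps a single absent field from aborting the whole run.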
## Prerequisites
Before using this script, make sure you have the following Python libraries installed:
- `requests`
- `BeautifulSoup`

You can install these libraries using pip:
```bash
pip install requests beautifulsoup4
```
## Usage
1. Clone this repository to your local machine.
2. Set `BASE_URL` in `constants.py` to the URL of the Trip Santai page listing the tours you want to scrape.
3. Run the script:
```bash
python trip_santai_scraper.py
```
4. The script will fetch tour data, including tour name, category, destination, duration, pricing, itinerary, and inclusions/exclusions.
5. The collected data will be written to a CSV file named `data_tour_tripsantai.csv`.
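The CSV output step in the list above could look roughly like the sketch below. It uses only the standard-library `csv` module. The column names in `FIELDNAMES` and the helper name `write_tours` are assumptions based on the fields listed in step 4; the script's actual column headers may differ.

```python
import csv

OUTPUT_FILE = "data_tour_tripsantai.csv"  # file name given in the usage steps

# Assumed column names, derived from the fields listed in the usage steps.
FIELDNAMES = [
    "name", "category", "destination", "duration",
    "price", "itinerary", "inclusions", "exclusions",
]


def write_tours(tours, path=OUTPUT_FILE):
    """Write an iterable of tour dicts to CSV and return the row count.

    Missing fields are written as empty strings, and keys that are not
    in FIELDNAMES are ignored.
    """
    count = 0
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(
            f, fieldnames=FIELDNAMES, restval="", extrasaction="ignore"
        )
        writer.writeheader()
        for tour in tours:
            writer.writerow(tour)
            count += 1
    return count
```

Opening the file with `newline=""` follows the `csv` module's recommendation and avoids blank rows on Windows. `restval=""` lets partially scraped tours still be written.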
## Code Structure
- `constants.py`: Contains constants like `BASE_URL` and `TIMEOUT`.
- `utils.py`: Contains utility functions for extracting and cleaning text from HTML elements.
- `trip_santai_scraper.py`: The main script for scraping tour data.
- `requirements.txt`: Lists the required Python libraries.
## Contact
If you have any questions or suggestions, please feel free to contact us.
Happy web scraping!