https://github.com/giuseppegambino/Scraping-TripAdvisor-with-Python-2020
Python implementation of a TripAdvisor web scraper built with Selenium, targeting the new 2019 website
- Host: GitHub
- URL: https://github.com/giuseppegambino/Scraping-TripAdvisor-with-Python-2020
- Owner: giuseppegambino
- License: mit
- Created: 2019-05-07T14:49:35.000Z (over 5 years ago)
- Default Branch: master
- Last Pushed: 2022-07-06T18:55:22.000Z (over 2 years ago)
- Last Synced: 2024-08-01T16:16:54.701Z (4 months ago)
- Topics: python, selenium, tripadvisor, tripadvisor-scraper, tripadvisorreview, webscraper, webscraper-website, webscraping
- Language: Python
- Size: 1.83 MB
- Stars: 87
- Watchers: 7
- Forks: 46
- Open Issues: 2
Metadata Files:
- Readme: README.md
- License: LICENSE.md
README
# Scraping TripAdvisor with Python 2020 *
A Python implementation of a TripAdvisor web scraper built with Selenium, targeting the new 2020 version of the website.
There are two scripts:
- "restaurants_scraper.py" to scrape restaurants
- "things_to_do_scraper.py" to scrape hotels, attractions, and monuments
The Python code is commented; write to me if you have questions.
If you have a slow connection and run into errors, try increasing the number of seconds passed to the time.sleep() calls.
Features implemented:
- A click function that opens the "more" button of the reviews
- A click function that moves to the next page
- A CSV file with the date, the score, the title, and the full review text
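The three features above could be sketched roughly as follows. This is a minimal illustration, not the code from the repository's scripts: the two XPath selectors, the function names, and the `PAUSE_SECONDS` value are all hypothetical, and TripAdvisor's markup changes often enough that any selector needs checking against the live page. The `driver` argument is assumed to be a Selenium WebDriver; the string `"xpath"` is the value of Selenium's `By.XPATH` constant.

```python
import csv
import time

# Hypothetical selectors: placeholders only, not taken from the original scripts.
MORE_BUTTON_XPATH = "//span[contains(text(), 'More')]"
NEXT_PAGE_XPATH = "//a[contains(@class, 'next')]"

# Pause after each click; increase this on a slow connection.
PAUSE_SECONDS = 2


def expand_reviews(driver, pause=PAUSE_SECONDS):
    """Click the first "more" button, if any; return True when one was clicked."""
    buttons = driver.find_elements("xpath", MORE_BUTTON_XPATH)
    if not buttons:
        return False
    buttons[0].click()
    time.sleep(pause)  # give the page time to show the full text
    return True


def go_to_next_page(driver, pause=PAUSE_SECONDS):
    """Click the "next page" link; return False when there is no such link."""
    links = driver.find_elements("xpath", NEXT_PAGE_XPATH)
    if not links:
        return False
    links[0].click()
    time.sleep(pause)  # give the next page time to load
    return True


def append_reviews(csv_path, reviews):
    """Append (date, score, title, review) rows to the CSV file."""
    with open(csv_path, "a", newline="", encoding="utf-8") as handle:
        csv.writer(handle).writerows(reviews)
```

Keeping the pause as a parameter means a slow connection can be handled by changing one value instead of editing every time.sleep() call.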
How to use:
- First approach: download the Python file, open it, and edit the default fields (CSV file path, number of pages, TripAdvisor URL)
- Second approach: download the file and launch it directly from the terminal, passing:
  - the path of the CSV file where the reviews will be stored
  - the number of pages of the site that you want to scrape
  - the URL of the TripAdvisor page that you want to scrape
Code to paste into the terminal:
python3 path_to_downloaded_script/things_to_do_scraper.py desktop/reviews.csv 50 https://www.tripadvisor.com/Attraction_Review-g187791-d192285-Reviews-Colosseum-Rome_Lazio.html
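The two approaches can coexist in one script: use the three terminal arguments when they are given, and fall back to the editable defaults otherwise. The sketch below shows one way this might look. The function name `read_settings` and the default values are hypothetical and are not taken from the repository's scripts.

```python
import sys

# Hypothetical defaults standing in for the "default fields" mentioned above.
DEFAULT_CSV_PATH = "reviews.csv"
DEFAULT_NUM_PAGES = 10
DEFAULT_URL = (
    "https://www.tripadvisor.com/"
    "Attraction_Review-g187791-d192285-Reviews-Colosseum-Rome_Lazio.html"
)


def read_settings(argv):
    """Return (csv_path, num_pages, url).

    argv is expected to look like sys.argv: the script name followed by
    the CSV path, the number of pages, and the URL. If those three values
    are not all present, the defaults are returned instead.
    """
    if len(argv) == 4:
        csv_path, num_pages, url = argv[1], int(argv[2]), argv[3]
        return csv_path, num_pages, url
    return DEFAULT_CSV_PATH, DEFAULT_NUM_PAGES, DEFAULT_URL


# Example with an explicit argument list, mirroring the terminal command above.
settings = read_settings(
    ["things_to_do_scraper.py", "desktop/reviews.csv", "50", DEFAULT_URL]
)
```

In a real script the call would be `read_settings(sys.argv)`. Converting the page count with `int()` raises a `ValueError` on a non-numeric argument, so a typo fails before any scraping starts.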
What I used:
- Python 3.8.2
- Selenium 3.141.0
- Safari 14.0.1
- Visual Studio Code 1.51.1
- MacBook Pro 13" M1 2020 with macOS Big Sur 11.0.1
*This activity has been supported by a grant from the Project IDEHA - PON "Ricerca e Innovazione" 2014-2020 - Innovation for Data Elaboration in Heritage Areas, Azione II