https://github.com/devapurva/zomato-menu-scraper
This Python script extracts menu data from a Zomato restaurant page and saves it in both JSON and CSV formats.
Last synced: about 1 year ago
- Host: GitHub
- URL: https://github.com/devapurva/zomato-menu-scraper
- Owner: devapurva
- License: gpl-3.0
- Created: 2024-04-12T06:55:56.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2024-04-12T06:57:59.000Z (about 2 years ago)
- Last Synced: 2024-04-12T14:47:42.988Z (about 2 years ago)
- Language: Python
- Size: 16.6 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: ReadMe.md
- License: LICENSE
# Zomato Menu Scraper
This Python script extracts menu data from a Zomato restaurant page and saves it in both JSON and CSV formats.
## Requirements
- Python 3.x
- BeautifulSoup4
- Requests
- pandas
## How to Run
1. Clone or download the repository.
2. Navigate to the directory containing the script.
3. Install the required dependencies using `pip install -r requirements.txt`.
4. Run the script using `python zomato_menu_scraper.py`.
5. Enter the Zomato link when prompted (e.g., `https://www.zomato.com/mumbai/rasoi-dadar-east`).
6. The script will scrape the webpage, extract menu data, and save it in JSON and CSV formats.
## Script Details
### Libraries Used:
- `json`: For JSON serialization and deserialization.
- `requests`: For making HTTP requests to fetch webpage data.
- `pandas`: For handling data in tabular format.
- `csv`: For reading and writing CSV files.
- `BeautifulSoup`: For parsing HTML content.
- `re`: For regular expression-based pattern matching.
- `html`: For unescaping HTML entities.
### Functions:
- `save_json(name, data)`: Saves JSON data to a file.
- `extract_needed_data(json_data)`: Extracts relevant menu data from JSON.
- `json_to_csv(json_data, csv_filepath)`: Converts JSON data to CSV format.
- `get_menu(url, save=True)`: Scrapes Zomato webpage, extracts data, and saves it.
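To make the roles of these helpers concrete, here is a minimal sketch of how the three data-handling functions might fit together. Only the function names and signatures come from the list above. The bodies are assumptions: the nested layout (`menus`, `categories`, `items`) and every key name are hypothetical, and the CSV step uses the standard `csv` module rather than whatever the actual script does.

```python
import csv
import json


def extract_needed_data(json_data):
    """Flatten a nested menu structure into one dict per menu item.

    The nesting (menus -> categories -> items) and all key names are
    assumptions for illustration, not Zomato's actual schema.
    """
    rows = []
    for menu in json_data.get("menus", []):
        for category in menu.get("categories", []):
            for item in category.get("items", []):
                rows.append({
                    "menu": menu.get("name", ""),
                    "category": category.get("name", ""),
                    "item": item.get("name", ""),
                    "price": item.get("price", ""),
                    "description": item.get("desc", ""),
                    "dietary_slugs": ",".join(item.get("dietary_slugs", [])),
                })
    return rows


def save_json(name, data):
    """Write `data` to `<name>.json` and return the file path."""
    path = f"{name}.json"
    with open(path, "w", encoding="utf-8") as f:
        # ensure_ascii=False keeps non-ASCII dish names readable in the file.
        json.dump(data, f, ensure_ascii=False, indent=2)
    return path


def json_to_csv(json_data, csv_filepath):
    """Write a list of flat dicts to a CSV file, one row per dict."""
    if not json_data:
        raise ValueError("no rows to write")
    # Union of keys in first-seen order, so rows with missing fields still fit.
    fieldnames = list(dict.fromkeys(key for row in json_data for key in row))
    with open(csv_filepath, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(json_data)
```

In this sketch, `extract_needed_data` produces flat rows so that the same list can be passed to both `save_json` and `json_to_csv` without further reshaping.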
### Running the Script:
1. The user provides the Zomato link.
2. The script fetches the webpage.
3. HTML content is parsed to find menu data.
4. Menu data is extracted and saved in JSON and CSV formats.
5. Output files are named after the restaurant.
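The last step, naming the output files after the restaurant, can be sketched as follows. The helper names `safe_filename` and `output_paths` are hypothetical, and the sanitizing rule is an assumption. The actual script may derive the names differently.

```python
import re


def safe_filename(restaurant_name):
    """Turn a restaurant name into a filesystem-friendly base name."""
    # Replace each run of characters other than word characters or hyphens with "_".
    cleaned = re.sub(r"[^\w\-]+", "_", restaurant_name.strip())
    # Fall back to a generic name if nothing usable is left.
    return cleaned.strip("_") or "menu"


def output_paths(restaurant_name):
    """Return the (JSON path, CSV path) pair for a restaurant."""
    base = safe_filename(restaurant_name)
    return f"{base}.json", f"{base}.csv"
```

Sanitizing the name first avoids problems with characters such as `/` or `:` that are not valid in file names on some systems.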
## Additional Comments:
- The script uses a user-agent header to mimic a web browser's request.
- It uses regular expressions to locate JSON content embedded in the webpage, then extracts the menu data from that JSON.
- The extracted data includes menu names, categories, item names, prices, descriptions, and dietary slugs.
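The first two points can be illustrated with a short sketch. It assumes the page embeds its data in a `<script id="menu-data">` tag with HTML-escaped JSON. That marker is hypothetical and chosen only for illustration, since the real page layout is not described here. The `User-Agent` string is likewise an arbitrary example, and `fetch_page` and `extract_embedded_json` are illustrative names rather than functions from the script.

```python
import html
import json
import re

# A browser-like User-Agent; the exact string is an arbitrary example.
HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/120.0 Safari/537.36"
    )
}

# Hypothetical marker for the embedded JSON; the real page may use another one.
EMBEDDED_JSON = re.compile(
    r'<script[^>]*\bid="menu-data"[^>]*>(.*?)</script>',
    re.DOTALL,
)


def fetch_page(url):
    """Download a page with browser-like headers and return its HTML."""
    import requests  # imported lazily so the parsing helper works without it

    response = requests.get(url, headers=HEADERS, timeout=30)
    response.raise_for_status()
    return response.text


def extract_embedded_json(page_html):
    """Find the embedded JSON in the page and parse it into Python objects."""
    match = EMBEDDED_JSON.search(page_html)
    if match is None:
        raise ValueError("embedded JSON not found; the page layout may differ")
    # Entities such as &quot; must be unescaped before the text is valid JSON.
    return json.loads(html.unescape(match.group(1)))
```

Keeping the download and the parsing in separate functions means the regular expression can be tested against a saved HTML file without making a network request.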