Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/paritoshtripathi935/grabfood-scraper

web scraping of Food Grab website using XHR requests
https://github.com/paritoshtripathi935/grabfood-scraper

chromedriver food-delivery python selenium web webscraping xhr-requests

Last synced: 14 days ago
JSON representation

web scraping of Food Grab website using XHR requests

Host: GitHub
URL: https://github.com/paritoshtripathi935/grabfood-scraper
Owner: paritoshtripathi935
Created: 2022-07-23T06:59:26.000Z (over 2 years ago)
Default Branch: main
Last Pushed: 2022-08-01T08:09:58.000Z (over 2 years ago)
Last Synced: 2025-01-21T04:41:39.840Z (22 days ago)
Topics: chromedriver, food-delivery, python, selenium, web, webscraping, xhr-requests
Language: Python
Homepage:
Size: 25.4 KB
Stars: 3
Watchers: 2
Forks: 1
Open Issues: 0
Metadata Files:
- Readme: README.md

Awesome Lists containing this project

README

FoodGrab_Scrapping

Github top language

Github language count

Repository size

License

## :dart: About ##
Web scraping of Food Grab website using XHR requests

## :sparkles: Approach ##

### Approach - 1
- in devtools if go to the network tab and click on XHR request, so when i click loadmore button [Food Grab](https://food.grab.com/sg/en/restaurants) we can see this request sent to "https://portal.grab.com/foodweb/v2/search"

- so if we go to response section you can see
```
{"searchResult":{"searchID":"f621f57c6c324a03a9f28eb4231c8395"
```

- get restaurant_id from this response, each restaurant_id stored with id key, then when you click on one restaurants a request sent to https://portal.grab.com/foodweb/v2/merchants/{merchnt_id}

where 2-CYKCVZNZJTDFLE is ‍restaurant_id‍
```
0: {id: "SGDD00739", address: {name: "Lucky Saigon - North Canal Road"},…}
address: {name: "Lucky Saigon - North Canal Road"}
businessType: "FOOD"
chainID: "729_Lucky_Saigon"
chainName: "Lucky Saigon"
estimatedDeliveryFee: {currency: {code: "SGD", symbol: "SGD", exponent: 2}, price: 300, priceDisplay: "S$3.00",…}
estimatedDeliveryTime: 30
id: "SGDD00739"
latlng: {latitude: 1.2862877, longitude: 103.84841596}
```

- you can get latitude and longitude from here, make the https://portal.grab.com/foodweb/v2/search request and https://portal.grab.com/foodweb/v2/merchants/{id} with python, but insure all of the http headers must be same with http headers that in the chrome dev tools

but since use of selenium was requested i tried a diffrent way

### Approach - 2
- So since i have to capture a XHR(XMLHttpRequest) request, i have used selenium wire for this for capturing the XHR request, i have used chrome driver for this.
- Solution Desgin
```
1. Load the python libraries needed
2. def load_more - Load the food.grab.com page and automatically activate the "Load More" button until the page contains all the restaurants in the Singapore area
3. def capture_post_response - Use driver to make a POST request for the "grab_internal_post_api" and then decode the data and store it in json format in post_data.
4. def get_restaurant_latlng - remove all the extra and keep name and location only, then store it in a list of dictionaries.
```
- Given a base_url, capture all restaurants (based on user's submitted location, e.g., sg) latitude & longitude
by intercepting grab-foods internal POST request. self.grab_internal_post_api is found by manually inspecting all XHR made my grab-foods, using chrome dev tools.

- I think aprroach 1 will be easier but will have to pass recapta test i haven't thoufht about this yet.

- I have taken help for various resources since i have to use selenium wire for this and get data from XHR request which i haven't done yet.

## :rocket: Technologies ##

The following tools were used in this project:

- [Python](https://www.python.org/)
- [Selenium](https://pypi.org/project/selenium-wire/)
- [Chromedriver](https://chromedriver.chromium.org/)

## :white_check_mark: Requirements ##

Before starting :checkered_flag:, you need to have [Git](https://git-scm.com) and [python](https://python.org/) installed.

## :checkered_flag: Starting ##

```bash
# Clone this project
$ git clone https://github.com/{{YOUR_GITHUB_USERNAME}}/foodgrab_scrapping

# Access
$ cd foodgrab_scrapping

# Setup virtual environment
$ python3 -m venv venv

# Install dependencies
$ pip install -r requirements.txt

# Run the project
$ run XHR.py file

```

## :memo: License ##

This project is under license from MIT. For more details, see the [LICENSE](LICENSE.md) file.

Made with :heart: by Paritosh Tripathi