https://github.com/farukalamai/yelp-scraper-scrapy-python

Yelp Restaurant data scraping using python, scrapy spider
https://github.com/farukalamai/yelp-scraper-scrapy-python

ai-bot data-extraction data-mining data-scraper data-scraping python python-scraper scrapy scrapy-crawler scrapy-spider web-scraper web-scraping web-scraping-python web-scraping-software yelp yelp-api yelp-restaurants yelp-resturant-data-scraping yelp-scraper


Host: GitHub
URL: https://github.com/farukalamai/yelp-scraper-scrapy-python
Owner: farukalamai
License: mit
Created: 2023-07-09T04:58:24.000Z (almost 2 years ago)
Default Branch: main
Last Pushed: 2023-07-09T16:49:33.000Z (almost 2 years ago)
Last Synced: 2024-11-07T20:20:01.317Z (8 months ago)
Topics: ai-bot, data-extraction, data-mining, data-scraper, data-scraping, python, python-scraper, scrapy, scrapy-crawler, scrapy-spider, web-scraper, web-scraping, web-scraping-python, web-scraping-software, yelp, yelp-api, yelp-restaurants, yelp-resturant-data-scraping, yelp-scraper
Language: Python
Homepage: https://www.linkedin.com/in/farukalamai/
Size: 23.4 KB
Stars: 3
Watchers: 2
Forks: 2
Open Issues: 2
Metadata Files:
- Readme: README.md
- License: LICENSE

# Yelp Restaurant Data Scraping Using Python and a Scrapy Spider
![Top-10-Best-Restaurants-in-San-Francisco-CA-July-2023-Yelp](https://github.com/farukalampro/yelp-webscraper-using-scrapy-python-spider/assets/92469073/e3b0e25f-d55b-44b5-b496-828832240397)

## Deployment

#### 1. Clone Repository

```bash
git clone https://github.com/farukalampro/yelp-webscraper-using-scrapy-python.git
```
```bash
cd yelp-webscraper-using-scrapy-python
```
#### 2. Create Virtual Environment
```bash
python -m venv env
```
- For Windows:
```bash
.\env\Scripts\activate
```
- For macOS/Linux:
```bash
source env/bin/activate
```

#### 3. Install Required Packages

```bash
pip install -r requirements.txt
```

#### 4. Add Your Own Search URL from yelp.com

- Open **data.py** and add your Yelp search URL to the `start_urls` list.
- One sample URL is already included. You can add as many URLs as you want.
```python
start_urls = [
    # Sample URL; replace it with your own Yelp search link.
    'https://www.yelp.com/search?find_desc=Restaurants&find_loc=San+Francisco%2C+CA',
]
```
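
If you need several search URLs, they can be generated instead of typed by hand. The following is a minimal sketch, assuming the `find_desc` and `find_loc` query parameters seen in the sample URL; the helper name `build_search_url` and the `searches` list are hypothetical and not part of this repository.

```python
from urllib.parse import urlencode

YELP_SEARCH_BASE = "https://www.yelp.com/search"


def build_search_url(description: str, location: str) -> str:
    """Return a Yelp search URL for the given search term and location."""
    # urlencode escapes spaces as "+" and commas as "%2C",
    # which matches the format of the sample URL.
    query = urlencode({"find_desc": description, "find_loc": location})
    return f"{YELP_SEARCH_BASE}?{query}"


# Hypothetical list of searches; replace with your own.
searches = [
    ("Restaurants", "San Francisco, CA"),
    ("Restaurants", "Oakland, CA"),
]

start_urls = [build_search_url(desc, loc) for desc, loc in searches]
```

The resulting `start_urls` list can be pasted into **data.py** in place of the sample entry.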

#### 5. Run the Spider from the Terminal
```bash
scrapy crawl data -o sample_file.csv
```
- You can export the data as CSV, JSON, or XML. Scrapy picks the format from the output file's extension. The general form of the command is:
```bash
scrapy crawl "spider name" -o file_name.csv/json/xml
```
- Sample scraped restaurant data is available in the **Sample File** folder.
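
After a CSV export finishes, a quick look at the first few rows helps confirm that the spider collected data. This is a small sketch using only the standard library; the function name `preview_rows` is hypothetical, and the column names depend on what the spider yields, so none are assumed here.

```python
import csv
from itertools import islice


def preview_rows(path, limit=5):
    """Return the header names and the first `limit` rows of a CSV export."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        rows = list(islice(reader, limit))
        # fieldnames is None for an empty file, so fall back to an empty list.
        return reader.fieldnames or [], rows


# Example usage, assuming the export from step 5 exists:
# columns, rows = preview_rows("sample_file.csv")
# print(columns)
# for row in rows:
#     print(row)
```

An empty `rows` list usually means the selectors in the spider no longer match the page.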

## Important Note
- Yelp updates its website frequently, so make sure the **XPath** expressions in the spider still match the current page layout, and update them when they do not.
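
One way to notice outdated selectors early is to keep all XPath expressions in one place and check which of them match nothing. The sketch below is an illustration only: the field names and XPath expressions are hypothetical placeholders, not the ones used in this repository, and `find_empty_selectors` is an assumed helper name. It works with any object that has an `.xpath(expr)` method returning a list-like result, such as a Scrapy response.

```python
# Hypothetical field names and XPath expressions; replace them with
# the ones that match Yelp's current markup.
XPATHS = {
    "name": "//h3//a/text()",
    "rating": "//div[@role='img']/@aria-label",
}


def find_empty_selectors(selector, xpaths=XPATHS):
    """Return the field names whose XPath matches nothing on the page.

    `selector` is any object with an `.xpath(expr)` method that returns
    a list-like result.
    """
    return [
        field
        for field, expr in xpaths.items()
        if len(selector.xpath(expr)) == 0
    ]
```

Inside a spider's parse callback, calling `find_empty_selectors(response)` and logging the result with `self.logger.warning` would flag the fields whose expressions need updating.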
