https://github.com/vipulbunny/restaurant-insight-analysis
A comprehensive data analysis project exploring restaurant ratings, locations, and customer sentiments. This project includes data preprocessing, descriptive analysis, geospatial mapping, sentiment analysis, and price-rating correlations using Python and visualization tools.
- Host: GitHub
- URL: https://github.com/vipulbunny/restaurant-insight-analysis
- Owner: VIPULbunny
- Created: 2025-03-01T15:51:44.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-03-01T16:32:11.000Z (over 1 year ago)
- Last Synced: 2025-07-15T11:33:22.068Z (11 months ago)
- Topics: data-analysis, data-preprocessing, data-visualization, folium, geospatial, geospatial-analysis, geospatial-visualization, machine-learning, nlp, pandas, python, restaurant-insights, seaborn, sentiment-analysis
- Language: Jupyter Notebook
- Homepage:
- Size: 2.42 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Restaurant Data Analysis
## Project Overview
This project focuses on analyzing a dataset of restaurants, examining their distribution, ratings, and geospatial locations. The analysis is structured into two parts:
### **Part 1: Data Analysis**
- **Task 1**: Data Exploration & Preprocessing
- **Task 2**: Descriptive Analysis
- **Task 3**: Geospatial Analysis
### **Part 2: Advanced Insights**
- **Task 4**: Sentiment Analysis
- **Task 5**: Price vs Rating Correlation
- **Task 6**: Restaurant Rating Prediction
The dataset is stored in:
```
Dataset.csv
```
---
## Dataset Details
The dataset contains the following key columns:
- **Restaurant ID**: Unique identifier for each restaurant
- **Name**: Restaurant name
- **City**: The city where the restaurant is located
- **Country Code**: Country identifier
- **Cuisines**: Types of cuisine served
- **Aggregate Rating**: Overall restaurant rating
- **Latitude & Longitude**: Geospatial coordinates
- **Review Text** (for sentiment analysis)
- **Price Range** (for pricing insights)
---
## Analysis Breakdown
### **Part 1: Data Analysis**
#### Task 1: Data Exploration & Preprocessing
**File Location**: `Task1.ipynb`
- **Loading Data**: Reads `Dataset.csv` using Pandas.
- **Handling Missing Values**:
  - Identifies missing values in the "Cuisines" column.
  - Drops rows where "Cuisines" data is unavailable.
- **Statistical Overview**:
  - Uses `.head()`, `.info()`, `.describe()` for a summary.
  - Plots the distribution of "Aggregate Rating" using Matplotlib.
**Visualization**:
- Histogram of Aggregate Ratings
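
The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the helper names `drop_missing_cuisines` and `plot_rating_histogram` are hypothetical, the column names are taken from the Dataset Details list and may be spelled differently in `Dataset.csv`, and the sample rows are made up so the snippet runs without the real file.

```python
import matplotlib.pyplot as plt
import pandas as pd


def drop_missing_cuisines(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df without rows whose "Cuisines" value is missing."""
    return df.dropna(subset=["Cuisines"]).reset_index(drop=True)


def plot_rating_histogram(df: pd.DataFrame, bins: int = 10):
    """Draw a histogram of "Aggregate Rating" and return the Matplotlib Axes."""
    fig, ax = plt.subplots(figsize=(8, 4))
    ax.hist(df["Aggregate Rating"], bins=bins, edgecolor="black")
    ax.set_xlabel("Aggregate Rating")
    ax.set_ylabel("Number of restaurants")
    ax.set_title("Distribution of Aggregate Ratings")
    return ax


# Made-up sample rows standing in for Dataset.csv.
sample = pd.DataFrame({
    "Name": ["A", "B", "C"],
    "Cuisines": ["Italian", None, "Indian, Chinese"],
    "Aggregate Rating": [4.1, 3.2, 4.6],
})

missing_cuisines = int(sample["Cuisines"].isna().sum())  # rows lacking cuisine data
cleaned = drop_missing_cuisines(sample)
```

In a notebook, `sample` would be replaced by `pd.read_csv("Dataset.csv")`, and `plot_rating_histogram(cleaned)` followed by `plt.show()` would display the chart.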

#### Task 2: Descriptive Analysis
**File Location**: `Task2.ipynb`
- **Loading Preprocessed Data**: Reads cleaned data from `Dataset.csv`.
- **City & Country Analysis**:
  - Finds the most common restaurant locations.
  - Groups data by "City" and "Country Code".
- **Visualizing Trends**:
  - Bar plots of the top 10 countries and cities using Seaborn.
**Visualizations**:
- Top 10 Countries by Restaurant Count

- Top 10 Cities by Restaurant Count

- Top 10 Cuisines
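
One way the grouping and top-10 counts could be computed is sketched below. The helper names `top_counts` and `top_cuisines` are hypothetical, the sample rows are made up, and the snippet assumes "Cuisines" stores comma-separated names (for example `"North Indian, Chinese"`), which the README does not state explicitly.

```python
import pandas as pd


def top_counts(df: pd.DataFrame, column: str, n: int = 10) -> pd.Series:
    """Group by `column` and return the n largest group sizes."""
    return df.groupby(column).size().sort_values(ascending=False).head(n)


def top_cuisines(df: pd.DataFrame, n: int = 10) -> pd.Series:
    """Count individual cuisines, assuming comma-separated values per row."""
    cuisines = df["Cuisines"].dropna().str.split(",").explode().str.strip()
    return cuisines.value_counts().head(n)


# Made-up sample rows standing in for Dataset.csv.
sample = pd.DataFrame({
    "City": ["New Delhi", "New Delhi", "Gurgaon", "New Delhi"],
    "Country Code": [1, 1, 1, 1],
    "Cuisines": ["North Indian, Chinese", "Chinese", "Italian", "North Indian"],
})

top_cities = top_counts(sample, "City")
top_countries = top_counts(sample, "Country Code")
cuisine_counts = top_cuisines(sample)
```

Each resulting Series can be passed to Seaborn for the bar plots, for example `sns.barplot(x=top_cities.values, y=top_cities.index)`.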

#### Task 3: Geospatial Analysis
**File Location**: `Task3.ipynb`
- **Loading Data**: Reads `Dataset.csv`.
- **Mapping Restaurants**:
  - Extracts latitude and longitude data.
  - Creates an interactive map with restaurant locations using Folium.
- **Visualization**:
  - Displays restaurants as clusters on an interactive map.
**Visualizations**:
- Restaurant Location Map

---
### **Part 2: Advanced Insights**
#### Task 4: Sentiment Analysis
**File Location**: `Task4.ipynb`
- **Objective**: Analyze customer reviews to determine restaurant sentiment.
- **Approach**:
  - Cleans text data (removes stopwords, punctuation, etc.).
  - Applies sentiment analysis using NLP libraries (e.g., VADER, TextBlob).
  - Categorizes reviews into Positive, Neutral, and Negative.
- **Visualization**:
  - Pie charts and bar graphs to show sentiment distribution.
**Visualizations**:
- Aggregate Rating Of Restaurants With Table Booking And Those Without.

- Availability of Online Delivery Among Restaurants With Different Price Ranges.
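
The cleaning and categorization steps of the sentiment approach can be illustrated with plain Python. The sketch below is not the notebook's code: `clean_review` and `label_sentiment` are hypothetical helpers, the stopword set is a deliberately tiny stand-in for a full list such as NLTK's, and the ±0.05 cutoff is borrowed from the convention commonly used with VADER compound scores, so treat it as an assumption.

```python
import string

# Hypothetical, deliberately small stopword list for illustration only.
STOPWORDS = {"the", "a", "an", "and", "is", "was", "were", "it", "to", "of"}


def clean_review(text: str) -> str:
    """Lowercase the text, strip punctuation, and drop stopwords."""
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    tokens = [token for token in text.split() if token not in STOPWORDS]
    return " ".join(tokens)


def label_sentiment(score: float, threshold: float = 0.05) -> str:
    """Map a polarity score in [-1, 1] to Positive, Neutral, or Negative."""
    if score >= threshold:
        return "Positive"
    if score <= -threshold:
        return "Negative"
    return "Neutral"


cleaned = clean_review("The food was great, and the service is AMAZING!")
labels = [label_sentiment(score) for score in (0.8, 0.0, -0.4)]
```

The score itself would come from an NLP library, for example `TextBlob(text).sentiment.polarity` or `SentimentIntensityAnalyzer().polarity_scores(text)["compound"]` from NLTK's VADER module; both return values between -1 and 1.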

#### Task 5: Price vs Rating Correlation
**File Location**: `Task5.ipynb`
- **Objective**: Examine how price range relates to restaurant ratings.
- **Approach**:
  - Compares price range with average aggregate rating.
  - Uses scatter plots and correlation heatmaps.
- **Findings**:
  - Identifies whether higher-priced restaurants have better ratings.
**Visualizations**:
- Price vs Rating Scatter Plot

- Correlation Heatmap
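
The comparison of price range against average rating could look something like this. It is a sketch under assumptions: `price_rating_summary` is a hypothetical helper, "Price Range" is assumed to be an ordinal integer column, the sample values are made up, and the Pearson coefficient (the pandas default) is used here, which may differ from the method in the notebook.

```python
import pandas as pd


def price_rating_summary(df: pd.DataFrame):
    """Return (mean rating per price range, Pearson correlation)."""
    mean_by_price = df.groupby("Price Range")["Aggregate Rating"].mean()
    correlation = df["Price Range"].corr(df["Aggregate Rating"])
    return mean_by_price, correlation


# Made-up sample rows standing in for Dataset.csv.
sample = pd.DataFrame({
    "Price Range": [1, 1, 2, 3, 4],
    "Aggregate Rating": [3.0, 3.4, 3.6, 4.0, 4.5],
})

mean_by_price, correlation = price_rating_summary(sample)
```

For the heatmap, the full matrix from `sample.corr(numeric_only=True)` can be passed to `sns.heatmap(..., annot=True)`. A positive coefficient only indicates that the two columns move together in the data; it does not by itself show that price causes higher ratings.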

#### Task 6: Restaurant Rating Prediction
**File Location**: `Task6.ipynb`
- **Objective**: Convert categorical data into numerical format for further analysis.
- **Approach**:
  - Encodes categorical variables using techniques like One-Hot Encoding or Label Encoding.
  - Ensures the dataset is in a structured numerical format.
  - Exports the processed data as CSV or Excel for machine learning models.
- **Visualization**:
  - Displays summary statistics of transformed data.
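
Both encoding styles mentioned above can be demonstrated with pandas alone. This sketch swaps in `pd.get_dummies` and pandas category codes in place of scikit-learn's encoders, so it is not necessarily what the notebook does; `encode_features` is a hypothetical helper, the choice of which column gets which encoding is arbitrary, and the sample rows are made up.

```python
import pandas as pd


def encode_features(df: pd.DataFrame) -> pd.DataFrame:
    """Return a numeric copy of df with label- and one-hot-encoded columns."""
    encoded = df.copy()
    # Label encoding: each distinct city becomes an integer code
    # (codes follow the sorted order of the category values).
    encoded["City"] = encoded["City"].astype("category").cat.codes
    # One-hot encoding: one 0/1 indicator column per country code.
    encoded = pd.get_dummies(encoded, columns=["Country Code"], dtype=int)
    return encoded


# Made-up sample rows standing in for Dataset.csv.
sample = pd.DataFrame({
    "City": ["Delhi", "Austin", "Delhi"],
    "Country Code": [1, 216, 1],
    "Price Range": [2, 3, 1],
})

encoded = encode_features(sample)
summary = encoded.describe()  # summary statistics of the transformed data
```

The result can then be exported with `encoded.to_csv("encoded_dataset.csv", index=False)`; the output file name is an arbitrary example.
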
---
## Setup & Installation
To run this project, install the required libraries:
```bash
pip install numpy pandas matplotlib seaborn folium nltk textblob scikit-learn
```
Run the Jupyter notebooks in sequence:
1. `Task1.ipynb`
2. `Task2.ipynb`
3. `Task3.ipynb`
4. `Task4.ipynb`
5. `Task5.ipynb`
6. `Task6.ipynb`
---
## Contributing
Feel free to fork this repository and open pull requests with improvements!
---
## License
This project is open-source under the MIT License.