https://github.com/csengupta1101/zomato-kolkata-eda

The Zomato dataset (Kolkata) contains 7388 rows and 7 columns. This repository performs exploratory data analysis (EDA) on that dataset.

Host: GitHub
URL: https://github.com/csengupta1101/zomato-kolkata-eda
Owner: Csengupta1101
Created: 2021-11-14T07:42:57.000Z (over 3 years ago)
Default Branch: main
Last Pushed: 2021-11-14T07:49:24.000Z (over 3 years ago)
Last Synced: 2024-12-29T04:50:43.964Z (5 months ago)
Topics: eda, jupyter-notebook, kolkata, python3, restaurant, vscode, zomato
Language: Jupyter Notebook
Homepage:
Size: 890 KB
Stars: 1
Watchers: 1
Forks: 1
Open Issues: 0
Metadata Files:
- Readme: README.md


![zomato](https://github.com/Csengupta1101/Zomato-Kolkata-EDA/blob/main/ZomatoKol.png)
# Kolkata Zomato Dataset Analysis

The dataset contains 7388 rows and 7 columns. The following details are provided for each restaurant:

- `name`: name of the restaurant
- `voteCount`: number of votes received from users
- `rating`: rating given to the restaurant by users
- `address`: address of the restaurant
- `cusine`: category of cuisine offered
- `cost`: average price at the restaurant
- `timing`: opening and closing times of the restaurant

We will perform exploratory data analysis (EDA) on this dataset.

Libraries required:

- Pandas: data exploration
- NumPy: mathematical operations
- Matplotlib: data visualization
- Seaborn: data visualization

## Workflow –

Our workflow consists primarily of two segments: data cleaning and data visualization. Data cleaning will take more time than visualization.

### Data Cleaning –

- The `voteCount`, `rating`, and `cost` columns are stored with the object dtype. We need to convert them to numeric types before we can run calculations on them.
- We need to find the missing values and fill them with the mean, median, or mode, whichever is most appropriate for the column.
- We need to look for duplicate values in the dataset.
- Let's create different data frames based on given conditions:

• high_end_restos: 1896 rows, average rating 3.17, average vote count 288
• cheap_restos: 2193 rows, average rating 2.11, average vote count 32
• midnight_restos: 670 rows, average rating 2.67, average vote count 128
• genral_timed_resto: 6718 rows, average rating 2.49, average vote count 107
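
The cleaning steps and one of the subsets above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the toy rows are invented, the `clean` helper and the `HIGH_END_MIN_COST` cutoff are hypothetical, and the assumption that `cost` may contain thousands separators is ours. The README does not state the thresholds used for `high_end_restos` or `cheap_restos`.

```python
import pandas as pd

# Toy rows standing in for the real data; all values are invented.
raw = pd.DataFrame({
    "name": ["A", "B", "B", "C", "D"],
    "voteCount": ["120", "15", "15", "NEW", "300"],
    "rating": ["4.1", "3.0", "3.0", "-", "4.5"],
    "cost": ["1,200", "300", "300", "500", None],
})


def clean(df):
    """Convert the three object columns to numbers, fill gaps, drop duplicates."""
    out = df.copy()
    for col in ["voteCount", "rating", "cost"]:
        # Remove thousands separators (an assumption about the raw format),
        # then turn anything that is not a number into NaN.
        as_text = out[col].str.replace(",", "", regex=False)
        out[col] = pd.to_numeric(as_text, errors="coerce")
        # The median is used here; mean or mode may suit a given column better.
        out[col] = out[col].fillna(out[col].median())
    return out.drop_duplicates().reset_index(drop=True)


cleaned = clean(raw)

# Hypothetical cutoff, chosen only for this example.
HIGH_END_MIN_COST = 1000
high_end_restos = cleaned[cleaned["cost"] >= HIGH_END_MIN_COST]

print(cleaned.dtypes)
print(high_end_restos[["rating", "voteCount"]].mean())
```

Filling gaps before dropping duplicates means two rows that differ only by a missing value can collapse into one; whether that is desirable depends on the data.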

- Let's find out the most popular cuisines in the market:

• North Indian
• Chinese
• Fast Food
• Café
• Biriyani
• Bengali
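
One way to arrive at a ranking like the one above is to count how often each cuisine appears. The sketch below assumes that each `cusine` cell holds a comma-separated string of cuisines, which the README does not confirm, and it uses invented rows rather than the real dataset.

```python
import pandas as pd

# Invented rows; assumes each 'cusine' cell is a comma-separated string.
df = pd.DataFrame({
    "name": ["A", "B", "C", "D"],
    "cusine": [
        "North Indian, Chinese",
        "Chinese, Fast Food",
        "North Indian",
        "Bengali, North Indian",
    ],
})

cuisine_counts = (
    df["cusine"]
    .str.split(",")   # one list of cuisines per restaurant
    .explode()        # one row per (restaurant, cuisine) pair
    .str.strip()      # drop the space left after each comma
    .value_counts()   # most frequent cuisine first
)

print(cuisine_counts.head(3))
```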

### Data Visualization –
