## UBER-EATS RESTAURANTS BIG DATA MENU PRICE ANALYSIS

![Duckdb](https://img.shields.io/badge/DuckDB-FFF000?logo=duckdb&logoColor=000&style=for-the-badge)
![sql](https://img.shields.io/badge/SQLite-003B57?logo=sqlite&logoColor=fff&style=for-the-badge)
![UberEats](https://img.shields.io/badge/Uber%20Eats-06C167?logo=ubereats&logoColor=fff&style=for-the-badge)
![parquet](https://img.shields.io/badge/Apache%20Parquet-50ABF1?logo=apacheparquet&logoColor=fff&style=for-the-badge)
![Kaggle](https://img.shields.io/badge/Kaggle-20BEFF?logo=kaggle&logoColor=fff&style=for-the-badge)
![Python](https://img.shields.io/badge/Python-3776AB?logo=python&logoColor=fff&style=for-the-badge)

### Project Overview

`Big data analytics` is essential for delivery companies because it uncovers customer trends and patterns. These trends help companies optimize their services, ensure a smooth flow of operations among riders, and forecast future demand for various products and services.

This project analyzes the menus of various restaurants that use `uber-eats` as their delivery courier. Because this is big data, `OLAP databases` come in handy for performing heavy analytics using `SQL`. These databases offer:

* `Scalability`: Manage massive amounts of data efficiently.
* `Low latency`: Optimized for complex analytical queries thanks to their columnar data layout and indexing.
* `Batch processing`: Handle `large-scale` data processing tasks efficiently.

`Uber-Eats` customers can also compare the prices of various menu items to find the cheapest and most expensive restaurants and save on expenditure. The data was sourced from `kaggle` and can be accessed via the following [link](https://www.kaggle.com/datasets/ahmedshahriarsakib/uber-eats-usa-restaurants-menus).

### Objectives

1. Investigate the `number of restaurants` that use `UBER-EATS`.
2. Investigate the `top restaurants with the largest number of items`.
3. Investigate the `price range of items` for the restaurant with the highest number of items.
4. Investigate the `average price of items` across various restaurants.
5. Investigate the `most expensive restaurant`.
6. Analyze the price of `snack items` across various restaurants.

### Findings

![avg-price](images/avg_price.png)
![snacks](images/snacks_prices.png)
![prices](images/Items_prices.png)

### Future steps

1. Investigate the correlation between location, i.e. `latitude` & `longitude`, and `menu prices`.
2. Source data on `customer orders` and analyze `consumer purchasing trends`.

### Getting started

All the requirements to run this project, together with their library versions, are listed in [requirements.txt](requirements.txt).

1. Download the `zip` file from `kaggle`; it is about `896 MB` in size.
2. Make a project directory using the `mkdir` command in the `terminal` and extract the files from the zip archive into it.
3. Follow along with the parquet conversion [file](parquet_convert.py), which uses `apache pyarrow`.
4. Load the parquet file into an `OLAP database`, `DuckDB` in this case, for quick analytical queries.