https://github.com/jendives2000/regressions

A linear regression analysis to determine the strength of the relationship between the number of reviews and sales for a retail company.
data-analysis linear-regression pearson-correlation-coefficient regression

# Project Description:
## Practice Exercise #2:
You are working as a data scientist for a **retail company**. Management wants to:
* understand the relationship between:
  - the number of online customer reviews for a product
  - and the monthly sales figures for that product.

**They believe that more reviews should correlate with higher sales** but want to quantify this
relationship to guide marketing strategies.

### Objectives
1. Determine the strength of the relationship between:
   - the number of customer reviews (independent variable 𝑋)
   - and monthly sales (dependent variable 𝑌).

2. Present your conclusions and recommendations

### Dataset:
The dataset was **randomly generated using pandas**, which ensures we do **not introduce bias** into the set.
It has 120 rows.
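
A minimal sketch of how a random 120-row dataset of this kind could be generated. The seed, the column names (`reviews`, `monthly_sales`), the value ranges, and the choice of independent uniform draws are all illustrative assumptions; the README does not state how the actual dataset was built.

```python
import numpy as np
import pandas as pd

# Assumed settings for illustration only (not the project's actual values).
rng = np.random.default_rng(seed=42)
n_rows = 120  # the dataset is stated to have 120 rows

# Draw each column independently and uniformly at random.
# Hypothetical ranges: 0-499 reviews, 1000-19999 monthly sales.
reviews = rng.integers(low=0, high=500, size=n_rows)
monthly_sales = rng.integers(low=1000, high=20000, size=n_rows)

df = pd.DataFrame({"reviews": reviews, "monthly_sales": monthly_sales})
print(df.head())
```

Note that two columns drawn independently like this are expected to show a correlation close to zero; a dataset meant to contain a relationship would need the sales column to depend on the review count plus some noise.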

## Practice Exercise #1:
You are a data scientist working for an e-commerce company.
The marketing team wants to:
* understand the relationship between:
  - the amount of money spent on online advertising
  - and the revenue generated from those ads.

**They believe that more spending on ads should correlate with higher revenue** but want to quantify this
relationship to guide marketing strategies.

### Objectives
1. Determine the strength of the relationship between:
   - the amount of money spent on online advertising (independent variable 𝑋)
   - and the revenue generated from those ads (dependent variable 𝑌).

2. Present your conclusions and recommendations

### Dataset:
The dataset was uploaded from an external source as part of a specialized course.
It has 10 rows.

# Methodology:
To evaluate the strength of the linear relationship between two variables, I used the **Pearson correlation coefficient**.
For practice and memorization purposes, I added the **mathematical formula** wherever it is needed, written in **LaTeX**.
For calculations, I used the **numpy library**.
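
For reference, the standard definition of the Pearson correlation coefficient, with $\bar{x}$ and $\bar{y}$ the sample means, is:

$$
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}}
$$

A minimal numpy sketch of that formula follows. The function name `pearson_r` and the toy values are illustrative assumptions, not the project's code or data.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient computed directly from its definition."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()  # deviations of X from its mean
    dy = y - y.mean()  # deviations of Y from its mean
    return float(np.sum(dx * dy) / np.sqrt(np.sum(dx**2) * np.sum(dy**2)))

# Toy values for illustration only (not the project's datasets).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)
r_numpy = np.corrcoef(x, y)[0, 1]  # numpy's built-in, as a cross-check
print(f"r = {r:.4f}, np.corrcoef = {r_numpy:.4f}")
```

Values of r near +1 or -1 indicate a strong linear relationship, while values near 0 indicate a weak one.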

For visualizations, I used **matplotlib**, with **scipy** to fit and draw the regression function itself.
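
One way such a plot could be produced is sketched below, assuming `scipy.stats.linregress` is used for the fit (the README does not say which scipy function the project relies on). The toy values and axis labels are placeholders, not the project's data.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line inside a notebook
import matplotlib.pyplot as plt
import numpy as np
from scipy import stats

# Toy values for illustration only (not the project's datasets).
x = np.array([1, 2, 3, 4, 5], dtype=float)
y = np.array([2, 4, 5, 4, 5], dtype=float)

# Least-squares fit of y = intercept + slope * x.
fit = stats.linregress(x, y)

fig, ax = plt.subplots()
ax.scatter(x, y, label="observations")
ax.plot(
    x,
    fit.intercept + fit.slope * x,
    color="red",
    label=f"y = {fit.intercept:.2f} + {fit.slope:.2f}x (r = {fit.rvalue:.2f})",
)
ax.set_xlabel("X (independent variable)")
ax.set_ylabel("Y (dependent variable)")
ax.legend()
```

The fitted result also exposes `rvalue`, which is the Pearson correlation coefficient, so the plot and the strength measure come from the same call.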

# Contact me:
* LinkedIn: https://www.linkedin.com/in/jytran-datascience

My name is Jean-Yves TRAN. I bring 9 years of promotional video creation and project management experience into data. My goal is to
first become an ML Engineer, then leverage that experience to mature into a Data Scientist.