https://github.com/quantum-software-development/integrated_project-business
https://github.com/quantum-software-development/integrated_project-business
Last synced: 3 months ago
JSON representation
- Host: GitHub
- URL: https://github.com/quantum-software-development/integrated_project-business
- Owner: Quantum-Software-Development
- License: mit
- Created: 2025-02-24T13:38:46.000Z (3 months ago)
- Default Branch: main
- Last Pushed: 2025-02-24T14:39:31.000Z (3 months ago)
- Last Synced: 2025-02-24T14:40:34.970Z (3 months ago)
- Size: 352 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
#
Integrated Project for [Business]()
$$\Huge {\textbf{\color{DodgerBlue} GOOD DECISIONS = GOOD RESULTS}}$$
###
[](https://github.com/sponsors/Quantum-Software-Development)
## [Standardization of a Range of Values]()
t describes the process of scaling or normalizing data within a specific range, typically to a standardized scale, for example, from 0 to 1. This is a common technique in data analysis and machine learning.
###
[Mathematical Formula]()
$$X_{normalized} = \frac{X - X_{\min}}{X_{\max} - X_{\min}}$$
###
[Where]():
$$X_{\max} - X_{\min} = \text{Amplitude}$$
####
Is the `amplitude`, a way to represent the range of data values before normalization.
## [Explanation]():
To calculate the standardization of the variables salario, n_filhos, and idade using both the Z-Score and Range methods, and to evaluate the mean, standard deviation, maximum, and minimum before and after standardization, we can follow these steps:
### [Before Standardization]():
Compute the mean, standard deviation, maximum, and minimum for each of the variables (n_filhos, salario, idade).
#### [Z-Score Standardization]():
We standardize the variables using the Z-Score method, which is computed as:
$Z$ = $\frac{X - \mu}{\sigma}$
```latex
Z = \frac{X - \mu}{\sigma}
```Where:
- $\( \mu \)$ is the mean,
- $\( \sigma \)$ is the standard deviation.
#### [Range Standardization (Min-Max Scaling)]():
We scale the data using the Min-Max method, which scales the values to a [0, 1] range using:
$X'$ = $\frac{X - \min(X)}{\max(X) - \min(X)}$
```latex
X' = \frac{X - \min(X)}{\max(X) - \min(X)}
```
Where:
- X is the original value,
- min(X) is the minimum value,
- max(X) is the maximum value.
### [After Standardization]():
Compute the mean, standard deviation, maximum, and minimum of the standardized data for both Z-Score and Range methods.
The output will provide the descriptive statistics before and after each standardization method, allowing you to compare the effects of Z-Score and Range standardization on the dataset.
## Practical Example for Calculating this Normalized Value in [Python]():
#### Use this [dataset](https://github.com/Quantum-Software-Development/Integrated_Project-Business/blob/f2d7abe6ee5853ae29c750170a01e429334f6fe5/HomeWork/1-Z-Score-Range/cadastro_funcionarios.xlsx)
The code demonstrates how to apply Z-Score and Range (Min-Max) standardization to the variables salario, n_filhos, and idade in a dataset. It also evaluates and compares the mean, standard deviation, minimum, and maximum values before and after the standardization methods are applied.
#### Cell 1: [Import necessary libraries]()
```python
# Importing the necessary libraries
import pandas as pd
import numpy as np
from sklearn.preprocessing import MinMaxScaler
```
#### Cell 2: [Load the dataset from the Excel file]()
```python
# Load the data from the Excel file
# df = pd.read_excel('use-your-own-dataset.xlsx') - optional
df = pd.read_excel('cadastro_funcionarios.xlsx')
df.head() # Displaying the first few rows of the dataset to understand its structure
```
#### Cell 3: [Evaluate the statistics before standardization]()
```python
# Step 1: Evaluate the mean, std, max, and min before standardization
before_std_stats = {
'mean_n_filhos': df['n_filhos'].mean(),
'std_n_filhos': df['n_filhos'].std(),
'min_n_filhos': df['n_filhos'].min(),
'max_n_filhos': df['n_filhos'].max(),
'mean_salario': df['salario'].mean(),
'std_salario': df['salario'].std(),
'min_salario': df['salario'].min(),
'max_salario': df['salario'].max(),
'mean_idade': df['idade'].mean(),
'std_idade': df['idade'].std(),
'min_idade': df['idade'].min(),
'max_idade': df['idade'].max(),
}# Display the statistics before standardization
before_std_stats
```
Cell 4: [Apply Z-Score standardization]()
```python
# Step 2: Z-Score Standardization
df_zscore = df[['n_filhos', 'salario', 'idade']].apply(lambda x: (x - x.mean()) / x.std())# Display the standardized data
df_zscore.head()
```
Cell 5: [Evaluate the statistics after Z-Score standardization]()
```python
# Step 3: Evaluate the mean, std, max, and min after Z-Score standardization
after_zscore_stats = {
'mean_n_filhos_zscore': df_zscore['n_filhos'].mean(),
'std_n_filhos_zscore': df_zscore['n_filhos'].std(),
'min_n_filhos_zscore': df_zscore['n_filhos'].min(),
'max_n_filhos_zscore': df_zscore['n_filhos'].max(),
'mean_salario_zscore': df_zscore['salario'].mean(),
'std_salario_zscore': df_zscore['salario'].std(),
'min_salario_zscore': df_zscore['salario'].min(),
'max_salario_zscore': df_zscore['salario'].max(),
'mean_idade_zscore': df_zscore['idade'].mean(),
'std_idade_zscore': df_zscore['idade'].std(),
'min_idade_zscore': df_zscore['idade'].min(),
'max_idade_zscore': df_zscore['idade'].max(),
}# Display the statistics after Z-Score standardization
after_zscore_stats
```
Cell 6: [Apply Range Standardization]() (Min-Max Scaling)
```python
# Step 4: Range Standardization (Min-Max Scaling)
scaler = MinMaxScaler()
df_range = pd.DataFrame(scaler.fit_transform(df[['n_filhos', 'salario', 'idade']]), columns=['n_filhos', 'salario', 'idade'])# Display the scaled data
df_range.head()
```
Cell 7: [Evaluate the statistics after Range standardization]()
```python
# Step 5: Evaluate the mean, std, max, and min after Range standardization
after_range_stats = {
'mean_n_filhos_range': df_range['n_filhos'].mean(),
'std_n_filhos_range': df_range['n_filhos'].std(),
'min_n_filhos_range': df_range['n_filhos'].min(),
'max_n_filhos_range': df_range['n_filhos'].max(),
'mean_salario_range': df_range['salario'].mean(),
'std_salario_range': df_range['salario'].std(),
'min_salario_range': df_range['salario'].min(),
'max_salario_range': df_range['salario'].max(),
'mean_idade_range': df_range['idade'].mean(),
'std_idade_range': df_range['idade'].std(),
'min_idade_range': df_range['idade'].min(),
'max_idade_range': df_range['idade'].max(),
}# Display the statistics after Range standardization
after_range_stats
```
## Pratical Example for Calculating this Normalized Value in [Excel]()
#### Use this [dataset](https://github.com/Quantum-Software-Development/Integrated_Project-Business/blob/f2d7abe6ee5853ae29c750170a01e429334f6fe5/HomeWork/1-Z-Score-Range/cadastro_funcionarios.xlsx)
To standardize the variables (salary, number of children, and age) in Excel using the Z-Score and Range methods, you can follow these steps:
## I. [Z-Score Standardization]()
### Steps for Z-Score in Excel:
#### 1. [Find the Mean (µ)]():
Use the AVERAGE function to calculate the mean of the column. For example, to find the mean of the salary (column E), use:
```excel
=AVERAGE(E2:E351)
```
#### 2. [Find the Standard Deviation (σ)]():
Use the STDEV.P function to calculate the standard deviation of the column. For example, to find the standard deviation of the salary (column E), use:```excel
=STDEV.P(E2:E351)
```
#### 3. [Apply the Z-Score Formula]():
For each value in the column, apply the Z-Score formula. In the first row of the new column, use:
```excel
=(E2 - AVERAGE(E$2:E$351)) / STDEV.P(E$2:E$351)
```
#### 4.[Drag the formula down to calculate the Z-Score for all the rows]():
Example for Salary:
In cell H2 (new column for standardized salary), write
```excel
=(E2 - AVERAGE(E$2:E$351)) / STDEV.P(E$2:E$351)
```Then, drag it down to the rest of the rows.
Repeat the same steps for the variables n_filhos (column D) and idade (column F).
## II. [Range Standardization]()
Steps for Range Standardization in Excel:
#### 1. [Find the Min and Max]():
Use the MIN and MAX functions to find the minimum and maximum values of the column. For example, to find the min and max of salary (column E), use:
```excel
=MIN(E2:E351)
=MAX(E2:E351)
```
#### 2. [Apply the Range Formula]():
For each value in the column, apply the range formula. In the first row of the new column, use:
```excel
=(E2 - MIN(E$2:E$351)) / (MAX(E$2:E$351) - MIN(E$2:E$351))
```
#### 3.[Drag the formula down to calculate the range standardized values for all the rows]():
Example for Salary:
In cell I2 (new column for range standardized salary), write:
```excel
=(E2 - MIN(E$2:E$351)) / (MAX(E$2:E$351) - MIN(E$2:E$351))
```Then, drag it down to the rest of the rows.
Repeat the same steps for the variables n_filhos (column D) and idade (column F).
## Summary of the Process
[Z-Score Standardization]() centers the data around [zero]() and scales it based on the [standard deviation]().
[Range Standardization (Min-Max Scaling)]() rescals the data to a [[0, 1] range]().
Both techniques were applied (given dataset) to the [columns n_filhos](), [salario](), and [idade](), and the statistics (mean, std, min, max) were calculated before and after the standardization methods.
#
######
Copyright 2025 Quantum Software Development. Code released under the [MIT License license.](https://github.com/Quantum-Software-Development/Math/blob/3bf8270ca09d3848f2bf22f9ac89368e52a2fb66/LICENSE)