https://github.com/cano1998/daibetes-multiple-linear-regression
In this project I focused on applying multiple linear regression to analyze and interpret factors influencing diabetes outcomes. The project also evaluates the model's fit using the R-squared (R²) metric.
data-preprocessing jupyter-notebook model-evaluation multiple-linear-regression python standardization
- Host: GitHub
- URL: https://github.com/cano1998/daibetes-multiple-linear-regression
- Owner: Cano1998
- Created: 2024-06-25T15:18:28.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2024-06-25T15:30:05.000Z (about 2 years ago)
- Last Synced: 2025-01-04T21:31:11.935Z (over 1 year ago)
- Topics: data-preprocessing, jupyter-notebook, model-evaluation, multiple-linear-regression, python, standardization
- Language: Jupyter Notebook
- Homepage:
- Size: 874 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Daibetes-multiple-linear-regression
The objective of this project is to build a multiple linear regression model to understand the relationship between various predictors and diabetes outcomes. By interpreting the regression coefficients and assessing the model fit using the R² value, we gain insights into the factors that significantly impact diabetes.
The dataset includes various features related to patients' health and diabetes measurements, such as:
- Age
- BMI (Body Mass Index)
- Blood Pressure
- Serum Insulin
- Blood Glucose Levels
- Diabetes Pedigree Function
- Other relevant health indicators
## Analysis and model
- Data Preprocessing: Cleaning the data and preparing it for model training.
- Exploratory Data Analysis: Understanding the distributions of the variables and the relationships between them.
- Standardization: Standardizing the dataset with the StandardScaler method so that every feature has a mean of 0 and a standard deviation of 1.
- Multiple Linear Regression: Building and fitting the regression model using multiple predictors.
- Model Interpretation: Interpreting the regression coefficients to understand the impact of each predictor.
- Model Evaluation: Using R² to assess the goodness-of-fit of the model.
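The steps above can be sketched as follows. This is a minimal illustration, not the notebook's actual code: the data is synthetic, the feature names (`age`, `bmi`, `blood_pressure`) and the outcome formula are hypothetical, and the use of scikit-learn's `LinearRegression` and `r2_score` is an assumption (the README names only `StandardScaler`).

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data; feature names are hypothetical, not the repository's.
rng = np.random.default_rng(0)
n = 200
feature_names = ["age", "bmi", "blood_pressure"]
X = np.column_stack([
    rng.uniform(20, 80, n),   # age
    rng.normal(28, 5, n),     # bmi
    rng.normal(75, 10, n),    # blood_pressure
])
# Made-up outcome: linear in the predictors plus Gaussian noise.
y = 0.5 * X[:, 0] + 2.0 * X[:, 1] + 0.1 * X[:, 2] + rng.normal(0, 5, n)

# Standardization: each column is rescaled to mean 0 and standard deviation 1.
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Multiple linear regression on the standardized predictors.
model = LinearRegression().fit(X_scaled, y)

# Model interpretation: one coefficient per standardized predictor.
for name, coef in zip(feature_names, model.coef_):
    print(f"{name}: {coef:.3f}")

# Model evaluation: R² on the data the model was fitted to.
r2 = r2_score(y, model.predict(X_scaled))
print(f"R^2 = {r2:.3f}")
```

In practice the scaler would usually be fitted on a training split only and then applied to a held-out test split, so that R² reflects performance on unseen data; the sketch skips the split for brevity.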
## Key findings
- Model Coefficients: Each coefficient in the regression model represents the change in the diabetes outcome for a one-unit change in the predictor, holding the other predictors constant. Because the predictors are standardized, one unit corresponds to one standard deviation of the original feature.
- Standardization: All features were standardized using the StandardScaler method to ensure consistent scaling.
- Significant Predictors: Identification of the predictors that have a notable impact on diabetes outcomes. With standardized features, coefficient magnitudes are on a common scale and can be compared directly.
- R² Value: The R² value indicates the proportion of the variance in the dependent variable that is predictable from the independent variables. A higher R² value suggests a better fit of the model to the observations.
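To make the R² definition concrete, here is a small dependency-free sketch that computes it as 1 minus the ratio of the residual sum of squares to the total sum of squares. The function name `r_squared` and the sample values are illustrative assumptions, not taken from the notebook.

```python
def r_squared(y_true, y_pred):
    """Proportion of the variance in y_true explained by y_pred."""
    mean_y = sum(y_true) / len(y_true)
    # Residual sum of squares: variation the predictions fail to explain.
    ss_res = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    # Total sum of squares: variation around the mean of the observations.
    ss_tot = sum((yt - mean_y) ** 2 for yt in y_true)
    return 1.0 - ss_res / ss_tot


y_true = [3.0, 5.0, 7.0, 9.0]   # illustrative observations (mean is 6.0)

perfect = r_squared(y_true, y_true)                  # predictions match exactly
baseline = r_squared(y_true, [6.0, 6.0, 6.0, 6.0])   # always predict the mean
close = r_squared(y_true, [3.5, 4.5, 7.5, 8.5])      # small errors everywhere

print(perfect, baseline, close)
```

Perfect predictions give an R² of 1.0, and always predicting the mean gives 0.0, which is why R² is often read as the improvement over a mean-only baseline.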