https://github.com/avinash793/regression-analysis-examples
Detailed implementation of various regression analysis models and concepts on real dataset.
- Host: GitHub
- URL: https://github.com/avinash793/regression-analysis-examples
- Owner: Avinash793
- Created: 2024-01-15T14:55:28.000Z (11 months ago)
- Default Branch: master
- Last Pushed: 2024-01-15T20:48:16.000Z (11 months ago)
- Last Synced: 2024-01-17T00:58:40.675Z (11 months ago)
- Topics: autocorrelation, cooks-distance, feature-engineering, gls-regression-model, goodness-of-fit, heteroskedasticity, influential-cases, leverages, multicollinearity, multiple-linear-regression-model, ols-regression-model, outliers, poisson-regression-model, python3, q-q-plot, quantile-regression, regression-analysis, regression-diagnostics, residuals, simple-linear-regression
- Language: Jupyter Notebook
- Homepage:
- Size: 3.55 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Regression Analysis Examples
Detailed implementation of various regression analysis models and concepts on real datasets.
### Regression Models Covered
* Simple Linear Regression Model
* Multiple Linear Regression Model
* Count Based Dataset Regression Models - Poisson, Negative Binomial, Generalised Poisson
* Quantile Regression Model
* Ordinary Least Squares (OLS) Estimate Based Regression Model
* Generalised Least Squares (GLS) Estimate Based Regression Model

### Regression Concepts Covered
You will find implementations of the concepts below, which you can use for reference:
* Log Returns
* Ordinary Least Squares (OLS) Estimate
* Generalised Least Squares (GLS) Estimate
* MultiCollinearity
* Variance Inflation Factor (VIF)
* Standardized Residuals
* Studentized Residuals
* Leverages
* Outliers
* Influential Cases
* Cook's Distance
* Regression Diagnostics
* Residual Diagnostics
* Feature Engineering
* Goodness of Fit - Deviance and Pearson Chi-Squared
* Regression Parameter Significance Tests
* Heteroskedasticity
* White's Heteroskedasticity Consistent (HC) Estimator
* Heteroskedasticity & Autocorrelation Consistent (HAC) Estimator
* Q-Q Plot
* LOESS smoothed estimate
* AutoCorrelation
* Simple Linear Regression Model
* Multiple Linear Regression Model
* Poisson Regression Model
* Binomial Regression Model
* Generalised Poisson Regression Model
* Quantile Regression Model

### General Regression Analysis Steps
(This is general guidance for an overview, not a strict procedure.)
1. Load Dataset
2. Visualise dataset (if possible)
3. Feature Engineering (if required)
4. Define Regression Model
    * Response variable (y)
    * Explanatory variables or features (X)
    * Residual assumption (start with the Gauss-Markov assumptions)
5. Feature Selection
    * Check for MultiCollinearity and take action
    * Apply PCA

    Goal: features should be independent.
6. Fit Regression Model (start with the OLS Estimate)
7. Regression Diagnostics
    * Regression parameter significance using t-tests and the general linear F-test
    * Leverages, Outliers and Influential cases
    * R-Squared
    * For count based datasets, goodness of fit (Deviance and Pearson Chi-Squared)

    Goal: all regression parameters are statistically significant, R-Squared is high, and the model is not affected by influential cases.
8. Residual Diagnostics
    * Plots:
        * Residuals only
        * Residuals vs fitted values
        * Q-Q plot
        * ACF plot
    * Tests:
        * Homoskedasticity
        * Normal Distribution
        * Auto-Correlation

    Goal: residuals should be white noise.
9. If required, delete influential cases and modify the model's explanatory variables and/or residual assumption, then repeat steps 6 to 8.
10. Once the model is satisfactory, do forecasting and compute performance metrics on the test dataset.
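
As a rough illustration of steps 5 to 8, here is a minimal NumPy-only sketch. It is not taken from the notebooks in this repository: the dataset is synthetic, and the helper names `fit_ols` and `vif` are hypothetical, chosen only for this example. It computes the Variance Inflation Factor, an OLS fit with t-statistics and R-Squared, leverages, Cook's distance, and the Durbin-Watson statistic as a simple autocorrelation check.

```python
import numpy as np


def fit_ols(X, y):
    """OLS fit of y = X @ beta + e; X must already contain an intercept column."""
    n, p = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    rss = resid @ resid
    sigma2 = rss / (n - p)                      # unbiased residual variance
    xtx_inv = np.linalg.inv(X.T @ X)
    se = np.sqrt(sigma2 * np.diag(xtx_inv))     # standard errors of beta
    r2 = 1.0 - rss / np.sum((y - y.mean()) ** 2)
    return {"beta": beta, "resid": resid, "sigma2": sigma2,
            "t": beta / se, "r2": r2, "xtx_inv": xtx_inv}


def vif(features):
    """VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing
    feature j on the remaining features plus an intercept."""
    n, k = features.shape
    out = np.empty(k)
    for j in range(k):
        others = np.column_stack([np.ones(n), np.delete(features, j, axis=1)])
        out[j] = 1.0 / (1.0 - fit_ols(others, features[:, j])["r2"])
    return out


# Synthetic data (an assumption for illustration): two correlated features
# and one deliberately planted influential case at index 0.
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + 0.6 * rng.normal(size=n)
x1[0] = 8.0                                     # unusual feature value
y = 1.0 + 2.0 * x1 - 0.5 * x2 + rng.normal(scale=0.5, size=n)
y[0] = -20.0                                    # response far from the trend

features = np.column_stack([x1, x2])
X = np.column_stack([np.ones(n), features])     # add intercept column

# Step 5: multicollinearity check
vifs = vif(features)

# Steps 6-7: fit and regression diagnostics
fit = fit_ols(X, y)
p = X.shape[1]
leverage = np.einsum("ij,jk,ik->i", X, fit["xtx_inv"], X)   # hat matrix diagonal
cooks = (fit["resid"] ** 2 / (p * fit["sigma2"])) * leverage / (1.0 - leverage) ** 2

# Step 8: Durbin-Watson statistic (values near 2 suggest little lag-1 autocorrelation)
dw = np.sum(np.diff(fit["resid"]) ** 2) / np.sum(fit["resid"] ** 2)

print("VIF:", vifs)
print("beta:", fit["beta"], "t:", fit["t"], "R^2:", fit["r2"])
print("most influential case:", int(np.argmax(cooks)))
print("Durbin-Watson:", dw)
```

Dropping the case with the largest Cook's distance and refitting corresponds to step 9. In practice a statistics library would usually replace these hand-written formulas; they are spelled out here only to make each diagnostic explicit.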