https://github.com/dpb24/netflix-global-top-10-performance-predictor-lr
Netflix Global Top 10 Performance Predictor - linear regression (Python)
linear-regression-python multiple-linear-regression sklearn
- Host: GitHub
- URL: https://github.com/dpb24/netflix-global-top-10-performance-predictor-lr
- Owner: dpb24
- Created: 2025-03-16T20:52:16.000Z (3 months ago)
- Default Branch: main
- Last Pushed: 2025-03-28T17:22:22.000Z (3 months ago)
- Last Synced: 2025-03-28T17:46:02.842Z (3 months ago)
- Topics: linear-regression-python, multiple-linear-regression, sklearn
- Language: Jupyter Notebook
- Size: 5.29 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Netflix Global Top 10 Performance Predictor (Python)
### 🎬 Predicting Netflix Title Performance with linear regression 📊
Can we predict how many hours a Netflix title will be viewed based on its first two weeks in the Global Top 10? What trends influence what we watch?
### 💡 Key results:
- **77.0%** of the variation in total hours viewed is explained by the multiple linear regression model
- **All predictors** were statistically significant, meaning each has a measurable association with total hours viewed
- **Category matters:** 📺 TV shows tend to perform better than 🍿 Films

### 📊 Statistical Approach:
- **Data Preprocessing:** applied a log transformation and removed outliers using the interquartile range (IQR) rule for model robustness
- **ANOVA & Tukey’s HSD Test:** identified significant differences in viewing trends across content categories
- **Regression Diagnostics:** assessed homoscedasticity, multicollinearity, and normality (Q-Q plot, histogram)
- **Model Evaluation:** computed R², adjusted R², F-statistic, and Mean Squared Error (MSE)

### 🔗 Project Resources:
📖 Jupyter Notebook: [GitHub](https://github.com/dpb24/netflix-global-top-10-performance-predictor-lr/blob/main/netflix-global-top-10-performance-predictor-lr.ipynb) | [Kaggle](https://www.kaggle.com/code/davidpbriggs/netflix-global-top-10-performance-predictor-lr)
📂 Dataset: [Netflix Global Top 10 weekly data](https://www.kaggle.com/datasets/davidpbriggs/most-popular-netflix-shows)
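
For readers who want a feel for the preprocessing and evaluation steps listed under Statistical Approach before opening the notebook, here is a minimal sketch. It is not the notebook's code: it uses plain NumPy least squares instead of scikit-learn, runs on synthetic stand-in data, and the variable names (`week1_hours`, `week2_hours`, `is_tv`) and the 1.5 × IQR multiplier are assumptions made for illustration only.

```python
import numpy as np


def iqr_mask(values, k=1.5):
    """Return a boolean mask keeping values within [Q1 - k*IQR, Q3 + k*IQR].

    The multiplier k=1.5 is a common convention, assumed here.
    """
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    return (values >= q1 - k * iqr) & (values <= q3 + k * iqr)


def fit_ols(X, y):
    """Fit ordinary least squares with an intercept and report fit metrics."""
    n, p = X.shape
    design = np.column_stack([np.ones(n), X])  # prepend intercept column
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ beta
    ss_res = float(residuals @ residuals)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    r2 = 1.0 - ss_res / ss_tot
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)
    mse = ss_res / n  # mean of squared residuals
    f_stat = (r2 / p) / ((1.0 - r2) / (n - p - 1))
    return {"beta": beta, "r2": r2, "adj_r2": adj_r2, "mse": mse, "f_stat": f_stat}


# Synthetic stand-in data (hypothetical; not the Netflix dataset).
rng = np.random.default_rng(0)
n_titles = 500
week1_hours = rng.lognormal(mean=16.0, sigma=0.6, size=n_titles)
week2_hours = week1_hours * rng.lognormal(mean=-0.3, sigma=0.3, size=n_titles)
is_tv = rng.integers(0, 2, size=n_titles)  # 1 = TV, 0 = Film
total_hours = (week1_hours + week2_hours) * rng.lognormal(
    mean=0.5 + 0.2 * is_tv, sigma=0.3, size=n_titles
)

# Log-transform the skewed hours, then drop IQR outliers on the target.
log_total = np.log(total_hours)
keep = iqr_mask(log_total)
X = np.column_stack([np.log(week1_hours), np.log(week2_hours), is_tv])[keep]
y = log_total[keep]

result = fit_ols(X, y)
print(
    f"R2={result['r2']:.3f}  adj R2={result['adj_r2']:.3f}  "
    f"F={result['f_stat']:.1f}  MSE={result['mse']:.4f}"
)
```

The numbers printed by this sketch describe the synthetic data only and say nothing about the project's reported 77.0%; the notebook remains the reference for the actual features, outlier thresholds, and diagnostics.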