https://github.com/mehrab-kalantari/news-popularity-prediction

News popularity prediction dataset analysis and modeling

data-preprocessing data-understanding feature-engineering feature-extraction feature-selection hypothesis-testing machine-learning regression-models supervised-learning

# News Popularity Prediction
[Dataset on Kaggle](https://www.kaggle.com/datasets/thehapyone/uci-online-news-popularity-data-set)

## Contents
### Data understanding and EDA
* Histogram plot
* Data queries
* Box plot
* Correlation matrix

### Hypothesis tests for a better understanding
* Pearson correlation test
* Spearman correlation test
* Kendall's tau correlation test
* t-test
* z-test
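
As a rough illustration of how the Pearson and Spearman tests listed above differ, here is a minimal NumPy sketch. It is not the repository's code: the helper names (`pearson_r`, `ranks`, `spearman_rho`) and the toy arrays are hypothetical, and the rank helper assumes there are no tied values. It computes only the coefficients, not the p-values that a full test would report.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient of two 1-D arrays."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))

def ranks(values):
    """Ranks 1..n by sorted position (assumption: no tied values)."""
    values = np.asarray(values)
    order = np.argsort(values)
    r = np.empty(len(values), dtype=float)
    r[order] = np.arange(1, len(values) + 1)
    return r

def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson_r(ranks(x), ranks(y))

# Hypothetical toy data: a monotonic but non-linear relationship.
n_images = np.array([1, 2, 3, 4, 5])
shares = n_images ** 3

pearson = pearson_r(n_images, shares)
spearman = spearman_rho(n_images, shares)
print(f"Pearson r = {pearson:.3f}, Spearman rho = {spearman:.3f}")
```

Because the relationship is monotonic but not linear, the Spearman coefficient reaches 1 while the Pearson coefficient stays below it, which is one reason to check both.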

### Data preprocessing and feature selection
* Missing data values
* Categorical to numerical
  * One-hot encoding (OHE)
* Outlier detection
  * K-sigma method
* Feature scaling
  * Standard scaling
  * Min-max normalization
  * Robust scaling
* Feature selection
  * Forward selection
  * Backward selection
* Feature extraction
  * PCA
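
The k-sigma outlier rule and the three scaling methods listed above can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not the repository's implementation: the function names and the toy `shares` array are hypothetical, and `k=2` is used only because a single extreme value in ten samples cannot exceed three standard deviations.

```python
import numpy as np

def k_sigma_mask(x, k=3.0):
    """True for values within k standard deviations of the mean."""
    x = np.asarray(x, dtype=float)
    return np.abs(x - x.mean()) <= k * x.std()

def standard_scale(x):
    """Zero mean, unit (population) standard deviation."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()

def min_max_scale(x):
    """Rescale linearly to the range [0, 1]."""
    x = np.asarray(x, dtype=float)
    return (x - x.min()) / (x.max() - x.min())

def robust_scale(x):
    """Center on the median and divide by the interquartile range."""
    x = np.asarray(x, dtype=float)
    q25, q75 = np.percentile(x, [25, 75])
    return (x - np.median(x)) / (q75 - q25)

# Hypothetical toy column with one extreme value.
shares = np.array([10., 12., 11., 13., 12., 11., 10., 13., 12., 500.])

mask = k_sigma_mask(shares, k=2.0)   # flags only the value 500
cleaned = shares[mask]

z = standard_scale(cleaned)
mm = min_max_scale(cleaned)
rb = robust_scale(cleaned)
print(cleaned, z.round(2), mm.round(2), rb.round(2), sep="\n")
```

Robust scaling relies on the median and quartiles, so it is less affected by any outliers that survive the k-sigma filter than standard or min-max scaling.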

### Modeling (Regression)
* Linear regression
* Polynomial regression
* Ridge regression
* Lasso regression

### Evaluation
* R² score
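
To show how the regression models and the evaluation metric fit together, here is a minimal NumPy sketch of linear and ridge regression (closed form) scored with the coefficient of determination. It is a sketch under stated assumptions rather than the repository's code: `fit_ridge`, `r_squared`, the synthetic data, and the `alpha` values are all hypothetical. Lasso is omitted because it has no closed-form solution, and polynomial regression would apply the same fit to polynomially expanded features.

```python
import numpy as np

def fit_ridge(X, y, alpha=0.0):
    """Closed-form ridge fit; alpha=0 gives ordinary least squares.

    The intercept is left unpenalized by centering X and y first.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    A = Xc.T @ Xc + alpha * np.eye(X.shape[1])
    w = np.linalg.solve(A, Xc.T @ yc)
    b = y_mean - x_mean @ w
    return w, b

def r_squared(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Synthetic data (hypothetical): linear signal plus small noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + 4.0 + rng.normal(scale=0.1, size=200)

w_ols, b_ols = fit_ridge(X, y, alpha=0.0)
w_ridge, b_ridge = fit_ridge(X, y, alpha=10.0)

r2_ols = r_squared(y, X @ w_ols + b_ols)
r2_ridge = r_squared(y, X @ w_ridge + b_ridge)
print(f"OLS R^2 = {r2_ols:.4f}, ridge R^2 = {r2_ridge:.4f}")
```

On the training data, ordinary least squares always scores at least as high as ridge, so a fair comparison between the models should use a held-out test split.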