https://github.com/imswappy/ads-eda-predictor
Interactive Streamlit app for marketing campaign analytics and prediction. Includes dashboards, EDA, econometrics tests (ADF, cointegration, OLS diagnostics), ML pipelines with preprocessing, CV, and persistence. Predict outcomes with Linear/Ridge/Lasso/Random Forest regressors.
adf breusch-pagan cointegration lasso-regression linear-regression matplotlib-pyplot numpy pandas random-forest ridge-regression seaborn sklearn
Last synced: about 1 month ago
- Host: GitHub
- URL: https://github.com/imswappy/ads-eda-predictor
- Owner: Imswappy
- Created: 2025-09-29T20:30:38.000Z (about 2 months ago)
- Default Branch: main
- Last Pushed: 2025-09-29T21:01:37.000Z (about 2 months ago)
- Last Synced: 2025-09-29T22:30:32.075Z (about 2 months ago)
- Topics: adf, breusch-pagan, cointegration, lasso-regression, linear-regression, matplotlib-pyplot, numpy, pandas, random-forest, ridge-regression, seaborn, sklearn
- Language: Jupyter Notebook
- Homepage: https://ads-eda-predictor.streamlit.app/
- Size: 329 KB
- Stars: 1
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# 🤖 Marketing Data Science Studio — EDA, Econometrics & Predictive Pipelines
---
## 📌 Overview
This project extends Jupyter notebook analyses into an **interactive Streamlit dashboard** for **marketing campaign analytics**.
It integrates **fundamentals of data exploration**, **statistical inference**, **econometric tests**, and **machine learning pipelines**.
The focus is on comparing **Facebook Ads** vs **AdWords Ads** campaigns — analyzing clicks, conversions, costs, and predicting ad performance.
🔗 **Live App:** [ads-eda-predictor.streamlit.app](https://ads-eda-predictor.streamlit.app/)
🔗 **Source Code:** [github.com/Imswappy/ads-eda-predictor](https://github.com/Imswappy/ads-eda-predictor)
---
## 📂 Pages in the App
### 1️⃣ Dashboard — KPIs & Time-Series
- **KPIs:**
- Mean:
$$\bar{X} = \frac{1}{n}\sum_{i=1}^n X_i$$
- Variance:
$$s^2 = \frac{1}{n-1}\sum_{i=1}^n (X_i - \bar{X})^2$$
- Standard Error:
$$\text{SEM} = \frac{s}{\sqrt{n}}$$
- **Time-series:** detects seasonality, trends, and structural breaks.
- **Scatter + OLS Regression:**
$$\hat\beta_1 = \frac{\sum (X_i - \bar X)(Y_i - \bar Y)}{\sum (X_i - \bar X)^2},\qquad \hat\beta_0 = \bar Y - \hat\beta_1 \bar X.$$
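
The KPI and regression formulas above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the app's actual code: the helper names `summary_kpis` and `ols_line` and the click/conversion values are hypothetical.

```python
import numpy as np

def summary_kpis(x):
    """Return the sample mean, unbiased variance (n-1 denominator), and SEM."""
    x = np.asarray(x, dtype=float)
    n = x.size
    mean = x.sum() / n
    var = ((x - mean) ** 2).sum() / (n - 1)
    sem = np.sqrt(var) / np.sqrt(n)
    return mean, var, sem

def ols_line(x, y):
    """Closed-form simple OLS; returns (intercept, slope)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()
    slope = (dx * (y - y.mean())).sum() / (dx ** 2).sum()
    intercept = y.mean() - slope * x.mean()
    return intercept, slope

# Hypothetical daily clicks and conversions (illustrative values only).
clicks = [10, 20, 30, 40, 50]
conversions = [3, 5, 7, 9, 11]  # constructed as 1 + 0.2 * clicks

mean, var, sem = summary_kpis(clicks)
b0, b1 = ols_line(clicks, conversions)
print(f"mean={mean:.1f}, var={var:.1f}, sem={sem:.3f}")
print(f"intercept={b0:.3f}, slope={b1:.3f}")
```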
---
### 2️⃣ Exploratory Data Analysis (EDA)
- **Distributions & Moments:**
- Skewness:
$$\text{Skew} = \frac{1}{n}\sum\left(\frac{X_i - \bar X}{s}\right)^3$$
- Kurtosis:
$$\text{Kurt} = \frac{1}{n}\sum\left(\frac{X_i - \bar X}{s}\right)^4$$
- **Histogram & KDE:**
$$\hat f(x) = \frac{1}{nh}\sum_{i=1}^n K\left(\frac{x - X_i}{h}\right)$$
- **Correlation (Pearson):**
$$r_{XY} = \frac{\sum (X_i - \bar X)(Y_i - \bar Y)}{\sqrt{\sum (X_i - \bar X)^2}\sqrt{\sum (Y_i - \bar Y)^2}}.$$
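
As a rough sketch of the EDA formulas above, the snippet below computes skewness, kurtosis, a Gaussian-kernel density estimate, and Pearson's correlation with NumPy. The function names, the choice of a Gaussian kernel, the bandwidth `h=1.0`, and the sample values are assumptions made for illustration; plotting libraries typically pick the kernel and bandwidth for you and may use slightly different moment conventions.

```python
import numpy as np

def moments(x):
    """Skewness and (non-excess) kurtosis as means of standardized powers.

    Standardizes with the sample standard deviation s (n-1 denominator),
    matching the formulas above.
    """
    x = np.asarray(x, dtype=float)
    z = (x - x.mean()) / x.std(ddof=1)
    return (z ** 3).mean(), (z ** 4).mean()

def kde_gaussian(x_grid, data, h):
    """Kernel density estimate on x_grid with a Gaussian kernel and bandwidth h."""
    x_grid = np.asarray(x_grid, dtype=float)
    data = np.asarray(data, dtype=float)
    u = (x_grid[:, None] - data[None, :]) / h
    k = np.exp(-0.5 * u ** 2) / np.sqrt(2 * np.pi)
    return k.sum(axis=1) / (data.size * h)

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length samples."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx, dy = x - x.mean(), y - y.mean()
    return (dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum())

# Hypothetical sample (illustrative values only).
sample = [1.0, 2.0, 3.0, 4.0, 5.0]
skew, kurt = moments(sample)

grid = np.linspace(-10.0, 16.0, 2601)
density = kde_gaussian(grid, sample, h=1.0)
area = density.sum() * (grid[1] - grid[0])  # should be close to 1

r = pearson_r(sample, [2 * v + 1 for v in sample])  # perfectly linear pair
```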
---
### 3️⃣ Statistical Tests & Regression Diagnostics
- **ADF (Stationarity):**
$$\Delta Y_t = \alpha + \beta t + \gamma Y_{t-1} + \sum_{i=1}^p \delta_i \Delta Y_{t-i} + \varepsilon_t.$$
- **Cointegration (Engle-Granger):**
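- Regress one series on the other by OLS, then test the residuals for stationarity (for example with ADF); stationary residuals indicate cointegration.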
- **Breusch–Pagan Test (Heteroscedasticity):**
$$H_0: \text{Var}(\varepsilon \mid X) = \sigma^2 \quad \text{vs} \quad H_1: \text{Var}(\varepsilon \mid X) = f(X)$$
- **Diagnostics:** residual plots, robust SEs, adjusted $$R^2$$, AIC/BIC.
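
To make the mechanics of these tests concrete, here is a NumPy-only sketch of two of them on synthetic data. It computes Koenker's studentized variant of the Breusch–Pagan LM statistic (n times the R² of regressing squared residuals on the regressors) and the two Engle-Granger steps with a lag-free Dickey-Fuller regression on the residuals. All function names and data are hypothetical, and the sketch stops at the test statistics: p-values need the appropriate critical-value tables. In practice, `statsmodels` offers `adfuller`, `coint`, and `het_breuschpagan` for the full tests; whether the app uses them is an assumption.

```python
import numpy as np

def ols(X, y):
    """Least-squares fit; returns (coefficients, residuals)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta, y - X @ beta

def breusch_pagan_lm(X, resid):
    """Koenker's studentized LM statistic: n * R^2 from regressing the
    squared residuals on X (X must include a constant column)."""
    e2 = resid ** 2
    _, aux_resid = ols(X, e2)
    r2 = 1.0 - (aux_resid ** 2).sum() / ((e2 - e2.mean()) ** 2).sum()
    return resid.size * r2

def engle_granger_stat(y, x):
    """Step 1: regress y on [1, x]. Step 2: Dickey-Fuller t-statistic on the
    residuals from delta_e_t = gamma * e_{t-1} + u_t (no lags, no constant)."""
    X = np.column_stack([np.ones_like(x), x])
    _, e = ols(X, y)
    lag, diff = e[:-1], np.diff(e)
    gamma = (lag @ diff) / (lag @ lag)
    u = diff - gamma * lag
    se = np.sqrt((u @ u) / (diff.size - 1) / (lag @ lag))
    return gamma / se

rng = np.random.default_rng(0)
n = 500

# Heteroscedastic synthetic data: the noise scale grows with the regressor z.
z = rng.uniform(1, 10, n)
y_het = 1 + 2 * z + z * rng.normal(size=n)
Xz = np.column_stack([np.ones(n), z])
_, resid = ols(Xz, y_het)
lm_stat = breusch_pagan_lm(Xz, resid)
# With one regressor, compare against chi-square(1); its 5% critical value is about 3.84.

# Cointegrated synthetic pair: a random walk plus stationary noise.
x_walk = np.cumsum(rng.normal(size=n))
y_coint = 2 + 0.5 * x_walk + rng.normal(size=n)
eg_stat = engle_granger_stat(y_coint, x_walk)
# A strongly negative statistic points towards stationary residuals.
```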
---
### 4️⃣ Notebook Reproductions
- Reproduces the notebooks' matplotlib and seaborn plots so the app's output can be **validated** against the original analysis.
- **LOWESS smoothing:**
$$(\hat\beta_0, \hat\beta_1) = \arg\min_{\beta_0,\,\beta_1} \sum_i w_i(x_0)\,(y_i - \beta_0 - \beta_1 x_i)^2,\qquad \hat f(x_0) = \hat\beta_0 + \hat\beta_1 x_0$$
where weights decay with distance from $$x_0$$.
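
The local fit described above can be sketched as a weighted least-squares problem solved at each evaluation point. The version below assumes tricube weights over the nearest `frac` share of the data, which is a common choice but not confirmed by the source, and it omits the robustness iterations of full LOWESS. The name `lowess_point` and the data are hypothetical.

```python
import numpy as np

def lowess_point(x0, x, y, frac=0.5):
    """Local linear fit at x0 with tricube weights on the nearest frac * n points."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    k = max(3, int(np.ceil(frac * x.size)))
    d = np.abs(x - x0)
    h = np.sort(d)[k - 1]                          # distance to the k-th nearest point
    w = np.clip(1 - (d / h) ** 3, 0, None) ** 3    # tricube; zero outside the window
    # Weighted least squares for (beta0, beta1): scale rows by sqrt(w).
    X = np.column_stack([np.ones_like(x), x])
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta[0] + beta[1] * x0

# On exactly linear data the local fit recovers the line itself.
x = np.linspace(0, 10, 21)
y = 3 + 2 * x
fitted = lowess_point(4.3, x, y)

# Smoothing a noisy curve: evaluate the fit at every observed x.
rng = np.random.default_rng(1)
y_noisy = np.sin(x) + rng.normal(scale=0.2, size=x.size)
smooth = np.array([lowess_point(v, x, y_noisy, frac=0.4) for v in x])
```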
---
### 5️⃣ Predictor — Pipelines, CV & Model Persistence
- **Preprocessing (ColumnTransformer):**
- Numeric: Median imputation + scaling.
- StandardScaler: $$X' = \frac{X - \mu}{\sigma}$$
- MinMaxScaler: $$X' = \frac{X - X_{\min}}{X_{\max}-X_{\min}}$$
- Categorical: Rare-category grouping → most-frequent imputation → OneHotEncoding.
- **Models:**
- Linear Regression (OLS)
- Ridge (L2):
$$\min_\beta \sum (y_i - X_i\beta)^2 + \alpha \|\beta\|_2^2$$
- Lasso (L1):
$$\min_\beta \sum (y_i - X_i\beta)^2 + \alpha \|\beta\|_1$$
- Random Forest (Ensemble):
$$\hat f(x) = \frac{1}{B}\sum_{b=1}^B T_b(x)$$
- **Evaluation:**
- RMSE:
$$\text{RMSE} = \sqrt{\frac{1}{n}\sum (y_i - \hat y_i)^2}$$
- $$R^2$$:
$$R^2 = 1 - \frac{\sum (y_i - \hat y_i)^2}{\sum (y_i - \bar y)^2}$$
- **Persistence:** trained pipelines saved with `joblib` for reuse.
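
A pipeline along these lines could be assembled with scikit-learn as sketched below. This is a hedged illustration under stated assumptions, not the app's implementation: the column names, the synthetic data, the choice of `Ridge(alpha=1.0)`, and the 5-fold split are all hypothetical, and the rare-category grouping step is left out for brevity.

```python
import os
import tempfile

import joblib
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical campaign data; column names are illustrative, not the app's schema.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "clicks": rng.integers(10, 200, n).astype(float),
    "cost": rng.uniform(5, 100, n),
    "platform": rng.choice(["facebook", "adwords"], n),
})
df.loc[::25, "cost"] = np.nan                   # a few missing values to impute
y = 0.1 * df["clicks"] + rng.normal(0, 1, n)    # synthetic conversions

numeric = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])
categorical = Pipeline([
    ("impute", SimpleImputer(strategy="most_frequent")),
    ("onehot", OneHotEncoder(handle_unknown="ignore")),
])
preprocess = ColumnTransformer([
    ("num", numeric, ["clicks", "cost"]),
    ("cat", categorical, ["platform"]),
])
model = Pipeline([("prep", preprocess), ("reg", Ridge(alpha=1.0))])

# Cross-validation refits the whole pipeline per fold, so imputation and
# scaling statistics are learned from the training folds only.
cv_r2 = cross_val_score(model, df, y, cv=5, scoring="r2")

# Fit on all data and compute in-sample RMSE and R^2.
model.fit(df, y)
pred = model.predict(df)
rmse = np.sqrt(mean_squared_error(y, pred))
r2 = r2_score(y, pred)

# Persist the fitted pipeline and load it back.
path = os.path.join(tempfile.mkdtemp(), "pipeline.joblib")
joblib.dump(model, path)
reloaded = joblib.load(path)
```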
---
## 🚀 Deployment
- Built with **Streamlit** (multi-page app).
- Deployed on **Streamlit Cloud**: [ads-eda-predictor.streamlit.app](https://ads-eda-predictor.streamlit.app/)
- Repository: [github.com/Imswappy/ads-eda-predictor](https://github.com/Imswappy/ads-eda-predictor)
---
## 🛠️ Skills Involved
- **Python** (pandas, numpy, matplotlib, seaborn)
- **Machine Learning** (Linear/Ridge/Lasso, Random Forest, pipelines, CV)
- **Statistical Inference** (ADF, Cointegration, Breusch–Pagan, OLS diagnostics)
- **Data Visualization** (Plotly, Altair, Seaborn, Matplotlib)
- **Streamlit** (multi-page UI, state management, deployment)
- **Model Deployment** (joblib persistence, end-to-end reproducibility)
---
## ⚡ Getting Started
```bash
# Clone repo
git clone https://github.com/Imswappy/ads-eda-predictor.git
cd ads-eda-predictor
# Install dependencies
pip install -r requirements.txt
# Run Streamlit app locally
streamlit run streamlit_app.py
```