https://github.com/imswappy/ads-eda-predictor

Interactive Streamlit app for marketing campaign analytics and prediction. Includes dashboards, EDA, econometrics tests (ADF, cointegration, OLS diagnostics), ML pipelines with preprocessing, CV, and persistence. Predict outcomes with Linear/Ridge/Lasso/Random Forest regressors.
adf breusch-pagan cointegration lasso-regression linear-regression matplotlib-pyplot numpy pandas random-forest ridge-regression seaborn sklearn

Last synced: about 1 month ago
Host: GitHub
URL: https://github.com/imswappy/ads-eda-predictor
Owner: Imswappy
Created: 2025-09-29T20:30:38.000Z (about 2 months ago)
Default Branch: main
Last Pushed: 2025-09-29T21:01:37.000Z (about 2 months ago)
Last Synced: 2025-09-29T22:30:32.075Z (about 2 months ago)
Topics: adf, breusch-pagan, cointegration, lasso-regression, linear-regression, matplotlib-pyplot, numpy, pandas, random-forest, ridge-regression, seaborn, sklearn
Language: Jupyter Notebook
Homepage: https://ads-eda-predictor.streamlit.app/
Size: 329 KB
Stars: 1
Watchers: 0
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

README

# 🤖 Marketing Data Science Studio — EDA, Econometrics & Predictive Pipelines

[![Streamlit App](https://img.shields.io/badge/Streamlit-Deployed-brightgreen)](https://ads-eda-predictor.streamlit.app/)  

[![GitHub Repo](https://img.shields.io/badge/GitHub-Code-blue)](https://github.com/Imswappy/ads-eda-predictor)

---

## 📌 Overview

This project extends Jupyter notebook analyses into an **interactive Streamlit dashboard** for **marketing campaign analytics**.  

It integrates **fundamentals of data exploration**, **statistical inference**, **econometric tests**, and **machine learning pipelines**.  

The focus is on comparing **Facebook Ads** vs **AdWords Ads** campaigns — analyzing clicks, conversions, costs, and predicting ad performance.

🔗 **Live App:** [ads-eda-predictor.streamlit.app](https://ads-eda-predictor.streamlit.app/)  

🔗 **Source Code:** [github.com/Imswappy/ads-eda-predictor](https://github.com/Imswappy/ads-eda-predictor)

---

## 📂 Pages in the App

### 1️⃣ Dashboard — KPIs & Time-Series

- **KPIs:**

  - Mean:  

    $$\bar{X} = \frac{1}{n}\sum_{i=1}^n X_i$$

  - Variance:  

    $$s^2 = \frac{1}{n-1}\sum_{i=1}^n (X_i - \bar{X})^2$$

  - Standard Error:  

    $$\text{SEM} = \frac{s}{\sqrt{n}}$$

- **Time-series:** detects seasonality, trends, and structural breaks.

- **Scatter + OLS Regression:**

  $$\hat\beta_1 = \frac{\sum (X_i - \bar X)(Y_i - \bar Y)}{\sum (X_i - \bar X)^2},\qquad \hat\beta_0 = \bar Y - \hat\beta_1 \bar X.$$
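
As a rough illustration of the formulas above, the following NumPy sketch computes the three KPIs and the closed-form OLS coefficients. The `clicks` and `conversions` arrays are made-up values chosen for the example, not data from the app.

```python
import numpy as np

# Hypothetical daily values, invented for illustration only.
clicks = np.array([10.0, 12.0, 15.0, 11.0, 17.0])
conversions = np.array([2.0, 3.0, 4.0, 2.0, 5.0])

n = clicks.size
mean = clicks.mean()
variance = clicks.var(ddof=1)   # ddof=1 gives the 1/(n-1) sample variance
sem = np.sqrt(variance / n)     # s / sqrt(n)

# Closed-form simple OLS of conversions (Y) on clicks (X).
dx = clicks - clicks.mean()
dy = conversions - conversions.mean()
beta1 = (dx * dy).sum() / (dx ** 2).sum()
beta0 = conversions.mean() - beta1 * clicks.mean()

print(f"mean={mean:.2f} var={variance:.2f} sem={sem:.3f}")
print(f"beta0={beta0:.3f} beta1={beta1:.3f}")
```

The same slope and intercept should come back from `np.polyfit(clicks, conversions, 1)`, which is a convenient cross-check.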

---

### 2️⃣ Exploratory Data Analysis (EDA)

- **Distributions & Moments:**

  - Skewness:  

    $$\text{Skew} = \frac{1}{n}\sum\left(\frac{X_i - \bar X}{s}\right)^3$$

  - Kurtosis:  

    $$\text{Kurt} = \frac{1}{n}\sum\left(\frac{X_i - \bar X}{s}\right)^4$$

- **Histogram & KDE:**

  $$\hat f(x) = \frac{1}{nh}\sum_{i=1}^n K\left(\frac{x - X_i}{h}\right)$$

- **Correlation (Pearson):**

  $$r_{XY} = \frac{\sum (X_i - \bar X)(Y_i - \bar Y)}{\sqrt{\sum (X_i - \bar X)^2}\sqrt{\sum (Y_i - \bar Y)^2}}.$$
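
The EDA formulas above translate almost line for line into NumPy. The sketch below is one possible reading: it assumes $s$ is the sample standard deviation (`ddof=1`) and a standard normal kernel for $K$, and the helper names are hypothetical rather than taken from the app.

```python
import numpy as np

def standardized_moment(x, k):
    """(1/n) * sum(((x - mean) / s) ** k): k=3 is skewness, k=4 is kurtosis."""
    x = np.asarray(x, dtype=float)
    z = (x - x.mean()) / x.std(ddof=1)
    return np.mean(z ** k)

def gaussian_kde(x_grid, samples, h):
    """Kernel density estimate on x_grid with a standard normal kernel and bandwidth h."""
    x_grid = np.asarray(x_grid, dtype=float)
    samples = np.asarray(samples, dtype=float)
    u = (x_grid[:, None] - samples[None, :]) / h
    kernel = np.exp(-0.5 * u ** 2) / np.sqrt(2 * np.pi)
    return kernel.sum(axis=1) / (samples.size * h)

def pearson_r(x, y):
    """Pearson correlation from centered sums of products."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    dx, dy = x - x.mean(), y - y.mean()
    return (dx * dy).sum() / (np.sqrt((dx ** 2).sum()) * np.sqrt((dy ** 2).sum()))

# Small made-up sample, symmetric around zero.
sample = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
grid = np.arange(-10.0, 10.0, 0.01)
density = gaussian_kde(grid, sample, h=0.5)

print("skewness:", standardized_moment(sample, 3))
print("kurtosis:", standardized_moment(sample, 4))
print("density integrates to about:", density.sum() * 0.01)
print("r:", pearson_r(sample, 2 * sample + 1))
```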

---

### 3️⃣ Statistical Tests & Regression Diagnostics

- **ADF (Stationarity):**

  $$\Delta Y_t = \alpha + \beta t + \gamma Y_{t-1} + \sum_{i=1}^p \delta_i \Delta Y_{t-i} + \varepsilon_t.$$

- **Cointegration (Engle-Granger):**

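  - Regress one series on the other by OLS, then test the residuals for stationarity (for example with ADF); stationary residuals indicate cointegration.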

- **Breusch–Pagan Test (Heteroscedasticity):**

  $$H_0: \text{Var}(\varepsilon) = \sigma^2 \quad \text{vs.} \quad H_1: \text{Var}(\varepsilon) = f(X)$$

- **Diagnostics:** residual plots, robust SEs, adjusted $$R^2$$, AIC/BIC.
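
To make two of these tests concrete, here is a NumPy-only sketch, not the app's implementation (in practice a statistics library would normally be used). `adf_tstat` fits the ADF regression above and returns the t-statistic on $\gamma$; with a constant and trend, the asymptotic 5% critical value is roughly -3.41, so more negative values point towards stationarity. `breusch_pagan` uses the studentized (Koenker) form, $LM = nR^2$ from regressing squared residuals on the regressor, and assumes a single regressor so the p-value comes from a chi-square distribution with 1 degree of freedom. All function names and the simulated series are assumptions made for the example.

```python
import math
import numpy as np

def ols(X, y):
    """Least-squares fit: returns coefficients, residuals, and standard errors."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (X.shape[0] - X.shape[1])
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    return beta, resid, se

def adf_tstat(y, p=1):
    """t-statistic on gamma in the ADF regression with constant, trend, and p lagged differences."""
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)
    n = dy.size - p
    cols = [np.ones(n), np.arange(n, dtype=float), y[p:-1]]   # alpha, beta*t, gamma*Y_{t-1}
    cols += [dy[p - i:dy.size - i] for i in range(1, p + 1)]  # delta_i * dY_{t-i}
    beta, _, se = ols(np.column_stack(cols), dy[p:])
    return beta[2] / se[2]

def breusch_pagan(x, y):
    """Studentized LM statistic n*R^2 and its chi-square(1) p-value for one regressor."""
    X = np.column_stack([np.ones_like(x), x])
    _, resid, _ = ols(X, y)
    e2 = resid ** 2
    _, aux_resid, _ = ols(X, e2)                     # auxiliary regression of e^2 on X
    centered = e2 - e2.mean()
    r2 = 1 - (aux_resid @ aux_resid) / (centered @ centered)
    lm = max(x.size * r2, 0.0)
    return lm, math.erfc(math.sqrt(lm / 2))          # chi-square(1) survival function

# Simulated series, for illustration only.
rng = np.random.default_rng(0)
noise = rng.normal(size=300)              # stationary by construction
walk = np.cumsum(rng.normal(size=300))    # unit root by construction
x = np.linspace(1, 10, 500)
y = 2 + 3 * x + rng.normal(scale=x)       # error spread grows with x

print("ADF t-stat (noise, walk):", adf_tstat(noise), adf_tstat(walk))
print("Breusch-Pagan (LM, p):", breusch_pagan(x, y))
```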

---

### 4️⃣ Notebook Reproductions

- Replicates matplotlib & seaborn plots for **validation**.

- **LOWESS smoothing:**  

  $$\hat f(x_0) = \hat\beta_0 + \hat\beta_1 x_0,\qquad (\hat\beta_0, \hat\beta_1) = \arg\min_{\beta_0,\beta_1} \sum_i w_i(x_0)(y_i - \beta_0 - \beta_1 x_i)^2$$  

  where weights decay with distance from $$x_0$$.
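
The local fit can be sketched in a few lines of NumPy. This is a simplified version under stated assumptions: it uses tricube weights over the nearest `frac` share of points and skips the robustness iterations of full LOWESS; `local_linear_fit` is a hypothetical helper, not a function from the app.

```python
import numpy as np

def local_linear_fit(x, y, x0, frac=0.5):
    """Fitted value at x0 from a weighted straight-line fit to the nearest points."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    k = max(3, int(np.ceil(frac * x.size)))   # number of neighbours in the window
    dist = np.abs(x - x0)
    idx = np.argsort(dist)[:k]
    h = dist[idx].max()                       # distance to the k-th nearest neighbour
    if h == 0:                                # all neighbours sit exactly at x0
        return y[idx].mean()
    w = (1 - (dist[idx] / h) ** 3) ** 3       # tricube weights, decaying to 0 at distance h
    # Weighted least squares: scale rows by sqrt(w), then solve ordinary least squares.
    sw = np.sqrt(w)
    X = np.column_stack([np.ones(idx.size), x[idx]])
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y[idx] * sw, rcond=None)
    return beta[0] + beta[1] * x0

# On exactly linear data the local fit reproduces the line.
x = np.linspace(0, 10, 21)
y = 1 + 2 * x
print(local_linear_fit(x, y, 3.3))
```

Evaluating `local_linear_fit` over a grid of `x0` values traces out the smoothed curve; a smaller `frac` follows the data more closely, a larger one smooths more.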

---

### 5️⃣ Predictor — Pipelines, CV & Model Persistence

- **Preprocessing (ColumnTransformer):**

  - Numeric: Median imputation + scaling.  

    - StandardScaler: $$X' = \frac{X - \mu}{\sigma}$$  

    - MinMaxScaler: $$X' = \frac{X - X_{\min}}{X_{\max}-X_{\min}}$$

  - Categorical: Rare-category grouping → most-frequent imputation → OneHotEncoding.

- **Models:**

  - Linear Regression (OLS)  

  - Ridge (L2):  

    $$\min_\beta \sum (y_i - X_i\beta)^2 + \alpha \|\beta\|_2^2$$

  - Lasso (L1):  

    $$\min_\beta \sum (y_i - X_i\beta)^2 + \alpha \|\beta\|_1$$

  - Random Forest (Ensemble):  

    $$\hat f(x) = \frac{1}{B}\sum_{b=1}^B T_b(x)$$

- **Evaluation:**

  - RMSE:  

    $$\text{RMSE} = \sqrt{\frac{1}{n}\sum (y_i - \hat y_i)^2}$$

  - $$R^2 = 1 - \frac{\sum (y_i - \hat y_i)^2}{\sum (y_i - \bar y)^2}$$

- **Persistence:** trained pipelines saved with `joblib` for reuse.

---

## 🚀 Deployment

- Built with **Streamlit** (multi-page app).  

- Deployed on **Streamlit Cloud**: [ads-eda-predictor.streamlit.app](https://ads-eda-predictor.streamlit.app/)  

- Repository: [github.com/Imswappy/ads-eda-predictor](https://github.com/Imswappy/ads-eda-predictor)

---

## 🛠️ Skills Involved

- **Python** (pandas, numpy, matplotlib, seaborn)  

- **Machine Learning** (Linear/Ridge/Lasso, Random Forest, pipelines, CV)  

- **Statistical Inference** (ADF, Cointegration, Breusch–Pagan, OLS diagnostics)  

- **Data Visualization** (Plotly, Altair, Seaborn, Matplotlib)  

- **Streamlit** (multi-page UI, state management, deployment)  

- **Model Deployment** (joblib persistence, end-to-end reproducibility)

---

## ⚡ Getting Started

```bash
# Clone repo
git clone https://github.com/Imswappy/ads-eda-predictor.git
cd ads-eda-predictor

# Install dependencies
pip install -r requirements.txt

# Run Streamlit app locally
streamlit run streamlit_app.py
```