https://github.com/francescopiocirillo/bayesian-map-vs-logistic-regression
Binary classification project in Python comparing a MAP classifier (with known distributions) vs. logistic regression. Includes analytical posterior derivation, Monte Carlo simulation, and gradient-based training. Focus on error probability, variance impact, and model performance.
bayesian-classifier binary-classification classification-algorithm data-science gradient-descent jupyter-notebook logistic-regression machine-learning map-estimation monte-carlo-simulation probabilistic-models python statistical-modeling supervised-learning
- Host: GitHub
- URL: https://github.com/francescopiocirillo/bayesian-map-vs-logistic-regression
- Owner: francescopiocirillo
- License: mit
- Created: 2025-08-07T16:52:29.000Z (10 months ago)
- Default Branch: main
- Last Pushed: 2025-08-07T18:25:36.000Z (10 months ago)
- Last Synced: 2025-08-07T20:34:13.439Z (10 months ago)
- Topics: bayesian-classifier, binary-classification, classification-algorithm, data-science, gradient-descent, jupyter-notebook, logistic-regression, machine-learning, map-estimation, monte-carlo-simulation, probabilistic-models, python, statistical-modeling, supervised-learning
- Language: Jupyter Notebook
- Homepage:
- Size: 1.86 MB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Bayesian MAP vs Logistic Regression for Binary Classification
This project compares two approaches to binary classification: a **MAP classifier** with known probabilistic distributions, and a **logistic regression model** trained via stochastic gradient descent. It includes analytical derivations, visualizations, Monte Carlo simulations, and performance comparisons.
> This project showcases proficiency in machine learning, statistical modeling, and Python programming. It includes implementations of probabilistic classifiers (MAP, logistic regression), the use of stochastic gradient descent, and performance evaluation via Monte Carlo simulation and error metrics.
This project is part of a series of logistic regression case studies. For a similar project, see [bit-decoding-sgd-logistic-regression](https://github.com/francescopiocirillo/bit-decoding-sgd-logistic-regression).
## Objective
The goal is to analyze and compare two different settings for binary classification:
1. **Case 1: Fully known model**
   - Derive posterior probabilities analytically.
   - Visualize `p(y=+1 | x)` for different variances (easy vs. difficult classification).
   - Estimate classification error using Monte Carlo simulation for both scenarios.
2. **Case 2: Supervised learning**
   - Train a logistic regression classifier using stochastic gradient descent (SGD).
   - Estimate classification error empirically using synthetic datasets.
   - Compare performance with the MAP classifier.
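
As a sketch of the analytical step in Case 1, suppose the labels are $y \in \{-1, +1\}$ with prior $\pi_{+} = P(y=+1)$, and the class-conditionals are Gaussian with means $\pm\mu$ and a common variance $\sigma^2$. These symbols and the symmetric-means setup are assumptions made for illustration; this README does not state the exact model. Under them, Bayes' rule gives a logistic-shaped posterior:

```latex
\begin{aligned}
p(y=+1 \mid x)
  &= \frac{\pi_{+}\,\mathcal{N}(x;\,+\mu,\,\sigma^2)}
          {\pi_{+}\,\mathcal{N}(x;\,+\mu,\,\sigma^2) + (1-\pi_{+})\,\mathcal{N}(x;\,-\mu,\,\sigma^2)} \\[4pt]
  &= \frac{1}{1 + \dfrac{1-\pi_{+}}{\pi_{+}}\,\exp\!\left(-\dfrac{2\mu x}{\sigma^2}\right)}
\end{aligned}
```

The second line follows because the quadratic terms in $x$ cancel in the likelihood ratio, leaving $\exp(-2\mu x/\sigma^2)$. With equal priors the MAP rule reduces to deciding $+1$ when $x > 0$, and a smaller $\sigma^2$ makes the transition around $x = 0$ steeper, which matches the "easy vs. difficult" distinction above.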
## Language Note
All code comments are written in Italian, as this project was originally developed in an academic setting in Italy. Nonetheless, the structure, organization, and methodology follow international best practices in data science and statistical modeling.
The project brief and the report are available in both English and Italian.
## Techniques and Methods
- **Bayesian classification** with known priors and Gaussian class-conditionals.
- **Logistic regression** trained with SGD (with and without Monte Carlo averaging).
- **Monte Carlo simulation** for empirical error evaluation.
- **Performance analysis** based on varying variance and data richness.
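
The Bayesian classification and Monte Carlo steps listed above can be sketched as follows. This is a minimal illustration, not the notebooks' code: it assumes labels in {-1, +1}, class-conditionals `N(y*mu, sigma^2)`, and hypothetical function names (`posterior_plus`, `monte_carlo_error`, `theoretical_error`). It uses only the standard library to stay dependency-free; NumPy would vectorize the same logic.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def posterior_plus(x, mu, sigma, prior_plus=0.5):
    """p(y=+1 | x) under the ASSUMED model x | y ~ N(y*mu, sigma^2)."""
    # Log-odds of class +1 versus class -1 (likelihood ratio plus prior ratio).
    log_odds = 2.0 * mu * x / sigma**2 + math.log(prior_plus / (1.0 - prior_plus))
    return sigmoid(log_odds)


def monte_carlo_error(mu, sigma, n_trials=100_000, prior_plus=0.5, seed=0):
    """Estimate the MAP classifier's error rate by simulation."""
    rng = random.Random(seed)
    errors = 0
    for _ in range(n_trials):
        y = 1 if rng.random() < prior_plus else -1      # draw the true label
        x = rng.gauss(y * mu, sigma)                    # draw the observation
        y_hat = 1 if posterior_plus(x, mu, sigma, prior_plus) >= 0.5 else -1
        errors += y_hat != y
    return errors / n_trials


def theoretical_error(mu, sigma):
    """Closed-form error for equal priors: Q(mu / sigma)."""
    return 0.5 * math.erfc(mu / (sigma * math.sqrt(2.0)))


if __name__ == "__main__":
    # Illustrative values only: a low-variance and a high-variance scenario.
    for sigma in (0.5, 2.0):
        est = monte_carlo_error(mu=1.0, sigma=sigma)
        print(f"sigma={sigma}: simulated={est:.4f}, "
              f"closed form={theoretical_error(1.0, sigma):.4f}")
```

Comparing the simulated rate against the closed form is a quick sanity check on the simulation itself, and it holds only for the equal-prior case assumed here.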
## Results Summary
- Lower variance leads to sharper posteriors and reduced classification error.
- Logistic regression trained with Monte Carlo-averaged β coefficients yields more stable performance.
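- As expected, the MAP classifier achieves slightly lower error, especially when data is limited: it uses the true distributions, so it is the minimum-error rule that a learned model can only approximate.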
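
A rough sketch of the logistic regression side, under stated assumptions: labels in {-1, +1}, synthetic data drawn from `N(y*mu, sigma^2)` with equal priors, and "Monte Carlo averaging" read as averaging the fitted coefficients over independent training runs. That reading, the hyperparameters, and every function name here are illustrative guesses rather than the notebooks' actual implementation.

```python
import random
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def sample_dataset(rng, n, mu, sigma):
    """Draw n (x, y) pairs with y in {-1, +1} and x | y ~ N(y*mu, sigma^2)."""
    data = []
    for _ in range(n):
        y = rng.choice((-1, 1))
        data.append((rng.gauss(y * mu, sigma), y))
    return data


def train_sgd(data, rng, lr=0.05, epochs=5):
    """Fit p(y=+1 | x) = sigmoid(b0 + b1*x) by SGD on the logistic loss."""
    b0, b1 = 0.0, 0.0
    data = list(data)  # copy so shuffling does not touch the caller's list
    for _ in range(epochs):
        rng.shuffle(data)
        for x, y in data:
            # Negative gradient of log(1 + exp(-y*z)) with respect to z.
            g = y * sigmoid(-y * (b0 + b1 * x))
            b0 += lr * g
            b1 += lr * g * x
    return b0, b1


def averaged_beta(n_runs, n_train, mu, sigma, rng):
    """Average the coefficients over independent training sets (assumed scheme)."""
    betas = [train_sgd(sample_dataset(rng, n_train, mu, sigma), rng)
             for _ in range(n_runs)]
    return (sum(b[0] for b in betas) / n_runs,
            sum(b[1] for b in betas) / n_runs)


def error_rate(beta, data):
    """Fraction of points misclassified by thresholding the linear score at 0."""
    b0, b1 = beta
    wrong = sum(1 for x, y in data if (1 if b0 + b1 * x >= 0 else -1) != y)
    return wrong / len(data)


if __name__ == "__main__":
    rng = random.Random(0)
    test_set = sample_dataset(rng, 20_000, 1.0, 1.0)
    single = train_sgd(sample_dataset(rng, 200, 1.0, 1.0), rng)
    averaged = averaged_beta(20, 200, 1.0, 1.0, rng)
    print("single run error:", error_rate(single, test_set))
    print("averaged error:  ", error_rate(averaged, test_set))
```

Averaging mainly reduces the run-to-run scatter of the intercept and slope, which is one plausible reason for the more stable performance noted above.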
## Repository Structure
```
BAYESIAN-MAP-VS-LOGISTIC-REGRESSION/
│
├── instructions/
│   ├── project_brief_ENGLISH.pdf
│   └── project_brief_ITALIAN.pdf
│
├── notebooks/
│   ├── case1_bayesian_map.ipynb
│   └── case2_logistic_regression.ipynb
│
├── report/
│   ├── classification_comparison_ENGLISH.pdf
│   └── classification_comparison_ITALIAN.pdf
│
├── .gitignore
├── LICENSE
└── README.md
```
## Technologies Used
- **Language**: Python
- **Environment**: Jupyter Notebook
- **Libraries**: `NumPy`, `Matplotlib`, `SciPy`
## About the Project
This project was created as part of the *Data Science & Data Analysis* course (2024/2025) for the Master's Degree in Computer Engineering at the **University of Salerno**.
## Team - University of Salerno
* [@francescopiocirillo](https://github.com/francescopiocirillo)
* [@alefaso-lucky](https://github.com/alefaso-lucky)
## Contacts
Got feedback or want to contribute? Feel free to open an Issue or submit a Pull Request!
## SEO Tags
```
Bayesian classification, logistic regression, MAP estimation, supervised learning, binary classification, statistical modeling, Python, Jupyter Notebook, Monte Carlo simulation, machine learning, pattern recognition, probabilistic models, data science project, AI model comparison, classification algorithms
```
## License
This project is licensed under the **MIT License**, a permissive open-source license that allows anyone to use, modify, and distribute the software freely, as long as credit is given and the original license is included.
> In plain terms: **use it, build on it, just don't blame us if something breaks**.
> Like what you see? Consider giving the project a star!