https://github.com/wadaboa/bayesian-net-classifier
Various classifiers using Bayesian networks, for the Knowledge Representation class at UNIBO
bayesian-networks classifiers forest-augmented-naive-bayes naive-bayes tree-augmented-naive-bayes
- Host: GitHub
- URL: https://github.com/wadaboa/bayesian-net-classifier
- Owner: Wadaboa
- License: mit
- Created: 2020-06-05T09:56:54.000Z (over 5 years ago)
- Default Branch: master
- Last Pushed: 2024-06-17T23:27:14.000Z (over 1 year ago)
- Last Synced: 2025-04-09T23:52:52.161Z (10 months ago)
- Topics: bayesian-networks, classifiers, forest-augmented-naive-bayes, naive-bayes, tree-augmented-naive-bayes
- Language: Jupyter Notebook
- Size: 2.47 MB
- Stars: 8
- Watchers: 1
- Forks: 5
- Open Issues: 3
Metadata Files:
- Readme: README.md
- License: LICENSE
# Bayesian network classifiers
In this work, we tested the capabilities of various Bayesian network structures (mainly Naive Bayes and augmented Naive Bayes) in a classification task over the standard Adult dataset, where the goal is to separate people whose income is greater than 50 thousand dollars per year from the rest.
## Installation & Execution
To run the provided Jupyter notebook and test the various classifiers, follow these steps:
* Install `Python 3.8` on your system
* Optionally create a virtual environment in the root directory of the project (`python3 -m venv venv`) and activate it (`source venv/bin/activate`)
* Install the required dependencies (`pip install -r requirements.txt`)
## Implemented models
* **Naive Bayes** (NB): Implementation given by [`pgmpy`](https://github.com/pgmpy/pgmpy)
* **Tree-Augmented Naive Bayes** (TAN): Implementation taken from a pending pull request on the [`pgmpy`](https://github.com/pgmpy/pgmpy/pull/1266) repository
* **BN-Augmented Naive Bayes** (BAN): Custom implementation (slow and buggy)
* **Forest-Augmented Naive Bayes** (FAN): Custom implementation
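
To make the difference between these structures concrete, here is a minimal, standard-library-only sketch of how the edge sets might be built. It is not the code from `estimators.py` or `pgmpy`: the function names, the feature subset and the pairwise scores in `WEIGHT` are hypothetical, and thresholding the scores is just one simple way to obtain a forest. In practice the scores would typically be something like the conditional mutual information between two features given the class.

```python
from itertools import combinations


def naive_bayes_edges(class_var, features):
    """Naive Bayes: the class is the only parent of every feature."""
    return [(class_var, f) for f in features]


def augmenting_edges(features, weight, min_weight=None):
    """Kruskal-style maximum-weight spanning tree over the features.

    `weight` maps frozenset({a, b}) to a pairwise score. If `min_weight`
    is set, weaker pairs are skipped, which may leave a forest.
    """
    parent = {f: f for f in features}  # union-find structure

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    pairs = sorted(
        combinations(features, 2),
        key=lambda p: weight[frozenset(p)],
        reverse=True,
    )
    chosen = []
    for a, b in pairs:
        if min_weight is not None and weight[frozenset((a, b))] < min_weight:
            break  # pairs are sorted, so all remaining ones are weaker
        root_a, root_b = find(a), find(b)
        if root_a != root_b:  # adding the edge does not create a cycle
            parent[root_a] = root_b
            chosen.append((a, b))
    return chosen


def orient(features, undirected):
    """Direct every edge away from the root of its connected component."""
    neighbours = {f: [] for f in features}
    for a, b in undirected:
        neighbours[a].append(b)
        neighbours[b].append(a)
    directed, seen = [], set()
    for root in features:
        if root in seen:
            continue
        seen.add(root)
        stack = [root]
        while stack:
            node = stack.pop()
            for nxt in neighbours[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    directed.append((node, nxt))
                    stack.append(nxt)
    return directed


def augmented_edges(class_var, features, weight, min_weight=None):
    """Naive Bayes edges plus a tree (TAN) or forest (FAN) over the features."""
    skeleton = augmenting_edges(features, weight, min_weight)
    return naive_bayes_edges(class_var, features) + orient(features, skeleton)


# Hypothetical scores for a few Adult columns, for illustration only.
FEATURES = ["age", "education", "occupation", "hours-per-week"]
WEIGHT = {
    frozenset(("age", "education")): 0.05,
    frozenset(("age", "occupation")): 0.02,
    frozenset(("age", "hours-per-week")): 0.10,
    frozenset(("education", "occupation")): 0.30,
    frozenset(("education", "hours-per-week")): 0.04,
    frozenset(("occupation", "hours-per-week")): 0.08,
}

tan = augmented_edges("income", FEATURES, WEIGHT)
fan = augmented_edges("income", FEATURES, WEIGHT, min_weight=0.09)
print(len(tan), len(fan))  # 7 6
```

In the tree-augmented result every feature keeps the class as a parent and all but the root feature gain exactly one extra feature parent; in the forest-augmented result some features may have no extra parent at all.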
## Source files structure
The Adult dataset was downloaded from the [`UCI Machine Learning Repository`](http://archive.ics.uci.edu/ml/datasets/adult) and placed inside the `dataset` folder.
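
As a rough illustration of the raw data format, the sketch below parses Adult-style rows with the standard library and derives the binary target. The column names follow the UCI description of the dataset; the `read_adult` helper, the `high_income` key and the two sample rows are illustrative assumptions, and the notebook may load and label the data differently.

```python
import csv
import io

# Column names as listed in the UCI description; the raw file has no header.
COLUMNS = [
    "age", "workclass", "fnlwgt", "education", "education-num",
    "marital-status", "occupation", "relationship", "race", "sex",
    "capital-gain", "capital-loss", "hours-per-week", "native-country",
    "income",
]


def read_adult(handle):
    """Yield one dict per well-formed row, with a boolean `high_income` key."""
    reader = csv.reader(handle, skipinitialspace=True)
    for row in reader:
        if len(row) != len(COLUMNS):
            continue  # skip blank or malformed lines
        values = [v.strip() for v in row]
        # "?" marks a missing value in this dataset.
        record = {k: (None if v == "?" else v) for k, v in zip(COLUMNS, values)}
        # The test split writes labels with a trailing period (">50K.").
        label = (record["income"] or "").rstrip(".")
        record["high_income"] = label == ">50K"
        yield record


# Illustrative rows written in the file's format (assumption, not real output).
SAMPLE = """\
39, State-gov, 77516, Bachelors, 13, Never-married, Adm-clerical, Not-in-family, White, Male, 2174, 0, 40, United-States, <=50K

45, Private, 120000, Masters, 14, Married-civ-spouse, Exec-managerial, Husband, White, Male, 0, 0, 50, ?, >50K
"""

records = list(read_adult(io.StringIO(SAMPLE)))
print([r["high_income"] for r in records])  # [False, True]
```

To read the real data, the same function could be given an open file handle for the data file inside the `dataset` folder instead of the in-memory sample.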
The whole project was written in the Jupyter notebook [`classify.ipynb`](classify.ipynb), while the custom structural learning algorithms are implemented in the [`estimators.py`](estimators.py) file.
Moreover, a complete overview of the whole data pre-processing, classification and evaluation pipeline can be found in the [`report.pdf`](report/report.pdf) file, inside the `report` folder.