https://github.com/kafkasl/scala_classifiers

Scala repository implementing the basic classifiers Maximum a Posteriori and Naive Bayes, and the more advanced Tree Augmented Naive Bayes method

Host: GitHub
URL: https://github.com/kafkasl/scala_classifiers
Owner: kafkasl
License: mit
Created: 2017-06-28T07:24:07.000Z (about 8 years ago)
Default Branch: master
Last Pushed: 2017-06-28T07:50:06.000Z (about 8 years ago)
Last Synced: 2025-02-08T05:28:21.937Z (5 months ago)
Language: Scala
Size: 6.97 MB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE


# Bayesian Classifiers in action with Scala

This repository contains the Scala implementation of 3 classifiers:

* Maximum a posteriori
* Naive Bayes
* Tree Augmented Naive Bayes

The report folder contains a detailed explanation of all the classes and the implemented methods, as well as some performance tests of the different classifiers. Refer to it for more information.

## Maximum a posteriori

In Bayesian statistics, a maximum a posteriori probability (MAP) estimate is an estimate of an unknown quantity that equals the mode of the posterior distribution. The MAP can be used to obtain a point estimate of an unobserved quantity on the basis of empirical data.
Applied directly to complete attribute combinations, this method is slow and needs too many training instances to be practical, because the number of probabilities to estimate grows exponentially with the number of attributes. For example:

Task of binary classification:
- 10 attributes with four values each, that is, 4^10 = 2^20 possible attribute combinations;
- Probabilities needed:
  * Store 2^20 conditional probabilities;
  * Estimate 2^20 conditional probabilities.
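
As a rough illustration of why this approach needs so many instances, here is a minimal sketch of a MAP classifier that estimates the posterior by counting complete attribute vectors. The names `MapSketch`, `train`, and `classify` are hypothetical and not taken from the repository, whose implementation may differ.

```scala
object MapSketch {
  // One training instance: (attribute values, class label).
  type Instance = (Vector[String], String)

  // Count how often each (attribute vector, class) pair occurs in the training data.
  def train(data: Seq[Instance]): Map[Instance, Int] =
    data.groupBy(identity).map { case (k, v) => k -> v.size }

  // For a fixed attribute vector, the class with the highest joint count is also
  // the class with the highest empirical posterior Pr(C | A1, ..., An).
  // Returns None when the attribute vector never appeared in training.
  def classify(counts: Map[Instance, Int], attrs: Vector[String]): Option[String] = {
    val candidates = counts.collect { case ((a, c), n) if a == attrs => c -> n }
    if (candidates.isEmpty) None else Some(candidates.maxBy(_._2)._1)
  }
}
```

Any attribute combination that was never observed yields `None`, which is the practical problem once there are 2^20 possible combinations to cover.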

## Naive Bayes classifiers

Simple probabilistic classifiers based on applying Bayes' theorem with strong (naive) independence assumptions between the features.

In order to reduce the number of instances and probabilities needed, we assume that the attributes are conditionally independent given the target class.

Pr(A1, ..., An | C) ∗ Pr(C) = Pr(A1 | C) ∗ ... ∗ Pr(An | C) ∗ Pr(C)

It is a strong assumption, but it works quite well even in scenarios where it does not hold.
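
The factorization above can be sketched as a small counting-based classifier. This is a minimal sketch, not the repository's code: the names `NaiveBayesSketch`, `Model`, `train`, `logScore`, and `classify` are hypothetical, and the add-one (Laplace) smoothing and the use of log-probabilities are assumptions made here to avoid zero probabilities and numerical underflow.

```scala
object NaiveBayesSketch {
  // One training instance: (attribute values, class label).
  type Instance = (Vector[String], String)

  final case class Model(
    classCounts: Map[String, Int],                // N(C = c)
    valueCounts: Map[(Int, String, String), Int], // N(Ai = a, C = c), keyed by (i, a, c)
    cardinalities: Vector[Int],                   // distinct values seen for each attribute
    total: Int                                    // number of training instances
  )

  // Assumes a non-empty data set in which every instance has the same number of attributes.
  def train(data: Seq[Instance]): Model = {
    val classCounts = data.groupBy(_._2).map { case (c, xs) => c -> xs.size }
    val triples = for {
      (attrs, c) <- data
      (a, i) <- attrs.zipWithIndex
    } yield (i, a, c)
    val valueCounts = triples.groupBy(identity).map { case (k, xs) => k -> xs.size }
    val numAttrs = data.head._1.size
    val cardinalities = Vector.tabulate(numAttrs)(i => data.map(_._1(i)).distinct.size)
    Model(classCounts, valueCounts, cardinalities, data.size)
  }

  // log Pr(c) + sum over i of log Pr(a_i | c), with add-one smoothing (an assumption).
  def logScore(m: Model, attrs: Vector[String], c: String): Double = {
    val logPrior = math.log(m.classCounts(c).toDouble / m.total)
    val logLikelihood = attrs.zipWithIndex.map { case (a, i) =>
      val n = m.valueCounts.getOrElse((i, a, c), 0)
      math.log((n + 1.0) / (m.classCounts(c) + m.cardinalities(i)))
    }.sum
    logPrior + logLikelihood
  }

  // Pick the class that maximizes the (log) product Pr(A1 | C) * ... * Pr(An | C) * Pr(C).
  def classify(m: Model, attrs: Vector[String]): String =
    m.classCounts.keys.maxBy(c => logScore(m, attrs, c))
}
```

Unlike a classifier that counts complete attribute vectors, this sketch can still score an attribute combination that never appeared in training, because it only needs per-attribute counts.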

## Tree Augmented Naive Bayes

The Tree Augmented Naive Bayes (TANaiveBayes or TANB) method softens the independence assumption of Naive Bayes by adding a tree-like dependency structure between the attributes: besides the class, each attribute may depend on at most one other attribute.
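
In the usual TAN construction, the tree is a maximum-weight spanning tree over the attributes, where the weight of an edge is the conditional mutual information I(Ai; Aj | C) between two attributes given the class. Whether the repository follows exactly this construction is not stated here, so the following is only a sketch of that edge weight, estimated from counts; `CmiSketch` and `conditionalMutualInformation` are hypothetical names.

```scala
object CmiSketch {
  // Each row holds the values (a_i, a_j, c) of two attributes and the class
  // for one training instance.
  def conditionalMutualInformation(rows: Seq[(String, String, String)]): Double = {
    val n = rows.size.toDouble
    val nXYC = rows.groupBy(identity).map { case (k, v) => k -> v.size.toDouble }
    val nXC = rows.groupBy(r => (r._1, r._3)).map { case (k, v) => k -> v.size.toDouble }
    val nYC = rows.groupBy(r => (r._2, r._3)).map { case (k, v) => k -> v.size.toDouble }
    val nC = rows.groupBy(_._3).map { case (k, v) => k -> v.size.toDouble }

    // Sum over observed (x, y, c) of Pr(x, y, c) * log( Pr(x, y | c) / (Pr(x | c) * Pr(y | c)) ).
    // With empirical counts the ratio simplifies to N(x,y,c) * N(c) / (N(x,c) * N(y,c)).
    nXYC.map { case ((x, y, c), nxyc) =>
      (nxyc / n) * math.log(nxyc * nC(c) / (nXC((x, c)) * nYC((y, c))))
    }.sum
  }
}
```

Pairs of attributes that are independent given the class get a weight of zero, so the spanning tree favors edges between attributes that carry information about each other beyond what the class already explains.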

For more information, check the papers under the ./papers folder or read the report.

If you would like to contribute, have similar or interesting projects, find an error, or have a question, feel free to write.
