
# Classifier Reborn


## Read the Docs

## Getting Started

Classifier Reborn is a general classifier module that supports Bayesian and other types of classification.
It is a fork of cardmagic/classifier under more active development.
It currently implements a Bayesian Classifier and a Latent Semantic Indexer (LSI).

Here is a quick illustration of the Bayesian classifier.

    $ gem install classifier-reborn
    $ irb
    irb(main):001:0> require 'classifier-reborn'
    irb(main):002:0> classifier = ClassifierReborn::Bayes.new 'Ham', 'Spam'
    irb(main):003:0> classifier.train "Ham", "Sunday is a holiday. Say no to work on Sunday!"
    irb(main):004:0> classifier.train "Spam", "You are the lucky winner! Claim your holiday prize."
    irb(main):005:0> classifier.classify "What's the plan for Sunday?"
    #=> "Ham"
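
To make the idea behind this session more concrete, here is a minimal pure-Ruby sketch of naive Bayes with add-one smoothing. It is not the gem's implementation: `TinyBayes` and its `scores` method are hypothetical names used only for illustration, and the sketch skips stemming and stopword handling entirely. It assumes Ruby 2.6 or newer.

```ruby
# Hypothetical teaching class; not part of classifier-reborn.
class TinyBayes
  def initialize(*categories)
    # Word counts per category, e.g. { "Ham" => { "sunday" => 2 } }
    @word_counts = categories.to_h { |category| [category, Hash.new(0)] }
    @doc_counts = Hash.new(0)
  end

  def train(category, text)
    @doc_counts[category] += 1
    tokenize(text).each { |word| @word_counts[category][word] += 1 }
  end

  # Returns the category with the highest log-probability score.
  def classify(text)
    scores(text).max_by { |_category, score| score }.first
  end

  # Log prior plus summed log likelihoods, with add-one smoothing.
  def scores(text)
    words = tokenize(text)
    vocab_size = @word_counts.values.flat_map(&:keys).uniq.size
    total_docs = @doc_counts.values.sum.to_f
    @word_counts.to_h do |category, counts|
      total_words = counts.values.sum
      prior = Math.log((@doc_counts[category] + 1) / (total_docs + @word_counts.size))
      likelihood = words.sum do |word|
        Math.log((counts[word] + 1.0) / (total_words + vocab_size))
      end
      [category, prior + likelihood]
    end
  end

  private

  # Lowercase the text and split it into simple word tokens.
  def tokenize(text)
    text.downcase.scan(/[a-z']+/)
  end
end

classifier = TinyBayes.new('Ham', 'Spam')
classifier.train 'Ham', 'Sunday is a holiday. Say no to work on Sunday!'
classifier.train 'Spam', 'You are the lucky winner! Claim your holiday prize.'
puts classifier.classify("What's the plan for Sunday?") # prints "Ham"
```

The word "Sunday" appears twice in the Ham training text and never in the Spam text, which is what tips the score toward Ham in this toy data set.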

Now, let's build an LSI, classify some text, and find a cluster of related documents.

    irb(main):006:0> lsi = ClassifierReborn::LSI.new
    irb(main):007:0> lsi.add_item "This text deals with dogs. Dogs.", :dog
    irb(main):008:0> lsi.add_item "This text involves dogs too. Dogs!", :dog
    irb(main):009:0> lsi.add_item "This text revolves around cats. Cats.", :cat
    irb(main):010:0> lsi.add_item "This text also involves cats. Cats!", :cat
    irb(main):011:0> lsi.add_item "This text involves birds. Birds.", :bird
    irb(main):012:0> lsi.classify "This text is about dogs!"
    #=> :dog
    irb(main):013:0> lsi.find_related("This text is around cats!", 2)
    #=> ["This text revolves around cats. Cats.", "This text also involves cats. Cats!"]
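
For intuition about what "related documents" means here, the following pure-Ruby sketch ranks stored texts by cosine similarity of raw term-frequency vectors. This is a deliberately simpler technique than LSI (there is no SVD or dimensionality reduction), and `TinyIndex` is a hypothetical class that only mirrors the method names used in the session above. It assumes Ruby 2.7 or newer for `Enumerable#tally`.

```ruby
# Hypothetical teaching class; not part of classifier-reborn and not real LSI
# (no SVD step, only raw term-frequency vectors).
class TinyIndex
  def initialize
    @items = {} # text => category
  end

  def add_item(text, category)
    @items[text] = category
  end

  # Category of the single most similar stored item.
  def classify(text)
    @items[find_related(text, 1).first]
  end

  # The `max` stored texts most similar to `text`, best match first.
  def find_related(text, max = 3)
    query = term_vector(text)
    @items.keys
          .sort_by { |item| -cosine(query, term_vector(item)) }
          .first(max)
  end

  private

  # Map each lowercase word to the number of times it occurs.
  def term_vector(text)
    text.downcase.scan(/[a-z]+/).tally
  end

  # Cosine similarity between two word-count hashes.
  def cosine(a, b)
    dot = a.sum { |word, count| count * b.fetch(word, 0) }
    norm = ->(vector) { Math.sqrt(vector.values.sum { |c| c * c }) }
    denom = norm.(a) * norm.(b)
    denom.zero? ? 0.0 : dot / denom
  end
end

index = TinyIndex.new
index.add_item 'This text deals with dogs. Dogs.', :dog
index.add_item 'This text involves dogs too. Dogs!', :dog
index.add_item 'This text revolves around cats. Cats.', :cat
index.add_item 'This text also involves cats. Cats!', :cat
index.add_item 'This text involves birds. Birds.', :bird
p index.classify('This text is about dogs!')
p index.find_related('This text is around cats!', 2)
```

On this small data set the sketch happens to agree with the session above, but plain word overlap cannot relate documents that share no terms, which is the gap LSI is designed to close.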

Bayes and LSI can do much more than these quick examples show.
For more information, read the following documentation topics:

* Installation and Dependencies
* Bayesian Classifier
* Latent Semantic Indexer (LSI)
* Classifier Validation
* Development and Contributions (*Optional Docker instructions included*)

### Notes on JRuby support

Add the JRuby variant of the gem to your Gemfile:

    gem 'classifier-reborn-jruby', platforms: :java

While experimental, this gem should work on JRuby without any additional changes. Unfortunately, you will **not** be able to use C bindings to GNU/GSL or similar performance-enhancing native code. Additionally, we do not use `fast_stemmer`, but rather a separate implementation of the Porter Stemming algorithm. Stemming will therefore differ between MRI and JRuby. However, you may choose to disable stemming and do your own manual preprocessing (or use some other popular Java library).
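
As a rough illustration of what manual preprocessing could look like once stemming is disabled, the sketch below normalizes text in plain Ruby so that MRI and JRuby see identical tokens. The `normalize` helper, the tiny `STOPWORDS` list, and the crude suffix stripping are all assumptions made for this example; none of them come from the gem, and the suffix rule is far simpler than Porter stemming.

```ruby
# Hypothetical helper for manual preprocessing; the name and the tiny
# stopword list are illustrative only.
STOPWORDS = %w[a an and is of on the to].freeze

def normalize(text)
  text.downcase
      .gsub(/[^a-z\s]/, ' ')                       # drop punctuation and digits
      .split
      .reject { |word| STOPWORDS.include?(word) }  # remove common filler words
      .map { |word| word.sub(/(ing|ed|s)\z/, '') } # crude suffix stripping, not Porter
      .join(' ')
end

puts normalize('Dogs are barking on Sundays!') # prints "dog are bark sunday"
```

Text prepared this way can then be passed to `train` and `classify` in place of the raw strings, so both runtimes work from the same input.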

If you encounter a problem, please submit your issue with `[JRuby]` in the title.

## Code of Conduct

To foster a more open and welcoming community, `Classifier Reborn` adheres to the `Jekyll`
code of conduct, adapted from the `Ruby on Rails` code of conduct.

Please adhere to this code of conduct in any interactions you have in the `Classifier` community.
If you encounter someone violating these terms, please let Chase Gilliam know and we will address it as soon as possible.

## Authors and Contributors

* Lucas Carlson
* David Fayram II
* Cameron McBride
* Ivan Acosta-Rubio
* Parker Moore
* Chase Gilliam
* and many more

The Classifier Reborn library is released under the terms of the GNU LGPL-2.1.