Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/Wozacosta/classificator
naive bayes classifier
- Host: GitHub
- URL: https://github.com/Wozacosta/classificator
- Owner: Wozacosta
- License: mit
- Created: 2017-08-04T13:52:46.000Z (over 7 years ago)
- Default Branch: master
- Last Pushed: 2021-07-07T03:58:47.000Z (over 3 years ago)
- Last Synced: 2024-08-08T15:27:19.990Z (4 months ago)
- Language: JavaScript
- Size: 61.5 KB
- Stars: 24
- Watchers: 3
- Forks: 10
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# classificator
[![NPM Licence shield](https://img.shields.io/github/license/Wozacosta/classificator.svg)](https://github.com/Wozacosta/classificator/blob/master/LICENSE)
[![NPM release version shield](https://img.shields.io/npm/v/classificator.svg)](https://www.npmjs.com/package/classificator)

Naive Bayes classifier for node.js
`bayes` takes a document (piece of text), and tells you what category that document belongs to.
## What can I use this for?
You can use this for categorizing any text content into any arbitrary set of **categories**. For example:
- Is an email **spam** or **not spam**?
- Is a news article about **technology**, **politics**, or **sports**?
- Is a piece of text expressing **positive** or **negative** emotions?

More here: https://en.wikipedia.org/wiki/Naive_Bayes_classifier
## Installing
Recommended: Node v6.0.0 +
```
npm install --save classificator
```

## Usage
```
const bayes = require('classificator')
const classifier = bayes()
```

### Teach your classifier
```
classifier.learn('amazing, awesome movie! Had a good time', 'positive')
classifier.learn('Buy my free viagra pill and get rich!', 'spam')
classifier.learn('I really hate dust and annoying cats', 'negative')
classifier.learn('LOL this sucks so hard', 'troll')
```

### Make your classifier unlearn
```
classifier.learn('i hate mornings', 'positive');
// uh oh, that was a mistake. Time to unlearn
classifier.unlearn('i hate mornings', 'positive');
```

### Remove a category
```
classifier.removeCategory('troll');
```

### Categorization
```
classifier.categorize("I've always hated Martians");
// => {
//   likelihoods: [
//     {
//       category: 'negative',
//       logLikelihood: -17.241944258040537,
//       logProba: -0.6196197927020783,
//       proba: 0.538149006882628
//     }, {
//       category: 'positive',
//       logLikelihood: -17.93509143860048,
//       logProba: -1.312766973262022,
//       proba: 0.26907450344131445
//     }, {
//       category: 'spam',
//       logLikelihood: -18.26854831109384,
//       logProba: -1.646223845755383,
//       proba: 0.19277648967605832
//     }
//   ],
//   predictedCategory: 'negative'
// }
```

### Serialize the classifier's state as a JSON string
`let stateJson = classifier.toJson()`
### Load the classifier back from its JSON representation
`let revivedClassifier = bayes.fromJson(stateJson)`
Note: `stateJson` can be either a JSON string (obtained from `classifier.toJson()`) or an object.
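Looking back at the sample `categorize` output above, the `proba` and `logProba` values are consistent with normalising the `logLikelihood` scores so that they sum to 1. The sketch below reproduces that relationship with plain JavaScript. It is an illustration based only on the sample numbers, not the library's own code, and `toProbabilities` is a hypothetical helper name.

```javascript
// Log-likelihoods copied from the sample `categorize` output above.
const logLikelihoods = {
  negative: -17.241944258040537,
  positive: -17.93509143860048,
  spam: -18.26854831109384,
};

// Hypothetical helper: normalise log scores into probabilities.
// The scores are shifted by their maximum before exponentiating so that
// very negative values do not underflow to zero (log-sum-exp trick).
function toProbabilities(scores) {
  const values = Object.values(scores);
  const max = Math.max(...values);
  const logTotal =
    max + Math.log(values.reduce((sum, v) => sum + Math.exp(v - max), 0));

  const result = {};
  for (const [category, value] of Object.entries(scores)) {
    const logProba = value - logTotal;
    result[category] = { logProba, proba: Math.exp(logProba) };
  }
  return result;
}

const probabilities = toProbabilities(logLikelihoods);
console.log(probabilities.negative.proba); // approximately 0.538, as in the sample
console.log(probabilities.positive.logProba); // approximately -1.313, as in the sample
```

Because only the differences between log-likelihoods matter after normalisation, the category with the highest `logLikelihood` is also the one with the highest `proba`.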
--------
## API
### `let classifier = bayes([options])`
Returns an instance of a Naive-Bayes Classifier.
Pass in an optional `options` object to configure the instance.
If you specify a `tokenizer` function in `options`, it will be used as the instance's tokenizer. It receives a single string argument, `text`, which is the value you pass to `.learn()` or `.categorize()`, and it must return an array of tokens. The default tokenizer removes punctuation and splits on spaces.
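To make the tokenizer contract concrete (a string goes in, an array of tokens comes out), here is a minimal standalone sketch. `simpleTokenizer` is a hypothetical name, and its behaviour (lowercasing, ASCII-only character class) is an assumption chosen for illustration. It is not the library's default tokenizer.

```javascript
// Hypothetical tokenizer: lowercase the text, replace every run of characters
// that is not an ASCII letter, digit, or apostrophe with a space, then split
// on whitespace and drop empty strings.
function simpleTokenizer(text) {
  return text
    .toLowerCase()
    .replace(/[^a-z0-9']+/g, ' ')
    .split(/\s+/)
    .filter(Boolean);
}

console.log(simpleTokenizer("I've always hated Martians!"));
// [ "i've", 'always', 'hated', 'martians' ]
```

A function with this shape could be passed as the `tokenizer` option. Note that the ASCII-only pattern would discard accented or non-Latin characters, so text in other languages would need a different pattern.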
For example:
```
let classifier = bayes({
  tokenizer: function (text) { return text.split(' ') }
})
```

You can specify the `alpha` parameter of the [additive smoothing operation](https://en.wikipedia.org/wiki/Additive_smoothing).
This is an integer.
The default value is 1.

You can also specify the `fitPrior` parameter.
It defines how the [prior probability](https://en.wikipedia.org/wiki/Prior_probability) is calculated.
If set to `false`, the classifier will use a uniform prior rather than a learnt one.
The default value is `true`.

### `classifier.learn(text, category)`
Teach your classifier which `category` the string `text` belongs to. The text is split into tokens by the instance's tokenizer.
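As a rough picture of what learning accumulates and where the `alpha` option enters, the sketch below counts tokens per category and applies the standard additive smoothing formula, `(count + alpha) / (total + alpha * vocabularySize)`. It is a toy model under the usual multinomial Naive Bayes assumptions, not the library's implementation. The training samples and the names `counts`, `totals`, and `tokenProbability` are made up for illustration.

```javascript
// Toy training data in the same (text, category) shape that `learn` takes.
const samples = [
  ['good fun movie', 'positive'],
  ['good time', 'positive'],
  ['boring bad movie', 'negative'],
];

const counts = {};            // category -> Map(token -> occurrences)
const totals = {};            // category -> total number of tokens seen
const vocabulary = new Set(); // every distinct token across all categories

// Accumulate the counts, splitting on spaces for simplicity.
for (const [text, category] of samples) {
  counts[category] = counts[category] || new Map();
  for (const token of text.split(' ')) {
    counts[category].set(token, (counts[category].get(token) || 0) + 1);
    totals[category] = (totals[category] || 0) + 1;
    vocabulary.add(token);
  }
}

// Additive smoothing: unseen tokens get a small non-zero probability
// instead of zeroing out the whole product.
function tokenProbability(token, category, alpha = 1) {
  const count = counts[category].get(token) || 0;
  return (count + alpha) / (totals[category] + alpha * vocabulary.size);
}

console.log(tokenProbability('good', 'positive'));   // (2 + 1) / (5 + 6)
console.log(tokenProbability('boring', 'positive')); // (0 + 1) / (5 + 6)
```

With `alpha` set to 0 the second call would return 0, which is exactly the situation smoothing is meant to avoid: a single unseen word would rule out a category entirely.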
### `classifier.unlearn(text, category)`
The classifier will unlearn the `text` that was associated with `category`.
### `classifier.removeCategory(category)`
The category is removed and the classifier data are updated accordingly.
### `classifier.categorize(text)`
*Parameters*
`text {String}`
*Returns*
`{Object}` An object with the `predictedCategory` and a `likelihoods` array of the categories, ordered by likelihood (most likely first).

```
{
  likelihoods: [
    ...
    {
      category: 'positive',
      logLikelihood: -17.93509143860048,
      logProba: -1.312766973262022,
      proba: 0.26907450344131445
    },
    ...
  ],
  // the main category bayes thinks the text belongs to, as a string
  predictedCategory: 'negative'
}
```

### `classifier.toJson()`
Returns the JSON representation of a classifier.
### `let classifier = bayes.fromJson(jsonStr)`
Returns a classifier instance from the JSON representation. Use this with the JSON representation obtained from `classifier.toJson()`.
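The usage notes say the state passed to `fromJson` may be either a JSON string or an object. One common way to support both input shapes is sketched below. `parseState` is a hypothetical helper and the state shape used here is invented for the example. Neither is taken from the library.

```javascript
// Hypothetical helper: accept a JSON string or an already-parsed object and
// always hand back a plain object.
function parseState(state) {
  if (typeof state === 'string') {
    return JSON.parse(state); // throws SyntaxError on malformed JSON
  }
  if (state !== null && typeof state === 'object') {
    return state;
  }
  throw new TypeError('state must be a JSON string or an object');
}

// Invented state shape, for illustration only.
const asString = '{"categories":{"positive":true}}';
const asObject = { categories: { positive: true } };

console.log(parseState(asString).categories.positive); // true
console.log(parseState(asObject) === asObject);        // true
```

Accepting both forms is convenient when the state has already been parsed, for instance after reading it as part of a larger JSON document.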