https://github.com/haydenhigg/bengal

Easy-to-use Go implementation of the multinomial Naive Bayes classifier for multilabel text classification.

go golang machine-learning multilabel-classification naive-bayes naive-bayes-classifier text-classification text-classifier

Host: GitHub
URL: https://github.com/haydenhigg/bengal
Owner: haydenhigg
License: mit
Created: 2020-11-18T22:52:52.000Z (over 5 years ago)
Default Branch: main
Last Pushed: 2022-10-31T18:55:36.000Z (over 3 years ago)
Last Synced: 2025-12-17T10:42:12.041Z (6 months ago)
Topics: go, golang, machine-learning, multilabel-classification, naive-bayes, naive-bayes-classifier, text-classification, text-classifier
Language: Go
Homepage:
Size: 67.4 KB
Stars: 4
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

README

# bengal

Optimized Go implementation of Naive Bayes classifiers for multilabel text classification.

## install

In your project:

`$ go get github.com/haydenhigg/bengal`

Then, import it as:

```go
import "github.com/haydenhigg/bengal"
```

## use

### modeling

- `TrainMultinomial(xs, ys [][]string, smoothing float64) NaiveBayesModel`: Creates and trains a multinomial model from tokenized inputs `xs` and their corresponding label sets `ys`.

- `(model *NaiveBayesModel) PredictMultinomial(x []string) []string`: Predicts the labels for a tokenized input using token presence only.

- `NewBernoulli(xs, ys [][]string, smoothing float64) NaiveBayesModel`: Creates and trains a Bernoulli model from tokenized inputs `xs` and their corresponding label sets `ys`.

- `(model *NaiveBayesModel) PredictBernoulli(x []string) []string`: Predicts the labels for a tokenized input using both token presence and token absence.
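
The difference between scoring by presence only and scoring by presence and absence can be sketched as follows. This is an illustration of the general idea, not bengal's actual implementation: `presenceScore`, `presenceAbsenceScore`, and the `probs` table are hypothetical names, and the probabilities are made up.

```go
package main

import (
	"fmt"
	"math"
)

// presenceScore sums log P(token | label) over the tokens in the input.
// probs is a hypothetical table of per-token probabilities for one label.
// Tokens missing from the table are skipped; repeated tokens count each time.
func presenceScore(probs map[string]float64, tokens []string) float64 {
	score := 0.0
	for _, t := range tokens {
		if p, ok := probs[t]; ok {
			score += math.Log(p)
		}
	}
	return score
}

// presenceAbsenceScore walks the whole known vocabulary: present tokens add
// log P(token | label), absent tokens add log(1 - P(token | label)).
func presenceAbsenceScore(probs map[string]float64, tokens []string) float64 {
	present := make(map[string]bool, len(tokens))
	for _, t := range tokens {
		present[t] = true
	}
	score := 0.0
	for t, p := range probs {
		if present[t] {
			score += math.Log(p)
		} else {
			score += math.Log(1 - p)
		}
	}
	return score
}

func main() {
	// made-up probabilities for a single label
	probs := map[string]float64{"cat": 0.5, "dog": 0.5}
	input := []string{"cat"}

	fmt.Println(presenceScore(probs, input))
	fmt.Println(presenceAbsenceScore(probs, input))
}
```

In this sketch the presence-only score loops over the input tokens, while the presence-and-absence score loops over every known token, which is one plausible reason the former would be cheaper on short inputs.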

### example

```go
package main

import (
	"fmt"

	"github.com/haydenhigg/bengal"
)

func main() {
	// each input is a tokenized document
	inputs := [][]string{
		[]string{"the", "cat", "was", "crying"},
		[]string{"dogs", "like", "to", "smile"},
		// ...
	}

	// each output holds the labels for the input at the same index
	outputs := [][]string{
		[]string{"cat", "sad"},
		[]string{"dog", "happy"},
		// ...
	}

	smoothing := 1.0 // fix the zero-probability problem, 1.0 is common

	model := bengal.NewBernoulli(inputs, outputs, smoothing)

	fmt.Println(model.PredictBernoulli([]string{ /* ... */ }))
}
```
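
The training and prediction functions take already-tokenized input. A minimal sketch of turning raw text into a `[]string`, using only the standard library, might look like this. `tokenize` is a hypothetical helper and not part of bengal, and lowercasing and splitting on non-alphanumeric characters is just one possible choice.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// tokenize lowercases the text and splits it on every rune that is
// neither a letter nor a digit. Hypothetical helper, not part of bengal.
func tokenize(text string) []string {
	return strings.FieldsFunc(strings.ToLower(text), func(r rune) bool {
		return !unicode.IsLetter(r) && !unicode.IsDigit(r)
	})
}

func main() {
	fmt.Println(tokenize("The cat was crying!")) // prints [the cat was crying]
}
```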

## notes

- It is recommended to stem all input examples before training or predicting, for example with [dchest/stemmer](https://github.com/dchest/stemmer).

- The models use log probabilities and smoothing for robustness: log probabilities avoid floating-point underflow, and smoothing avoids zero probabilities.

- The training and prediction functions can be mixed. For example, you can train with `NewBernoulli` and predict with `PredictMultinomial`, which is faster and similarly accurate on short documents.
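
To illustrate why log probabilities and smoothing help, here is a small sketch. It assumes standard additive (Laplace) smoothing. The README does not show bengal's exact formula, so `smoothedProb` and its parameters are hypothetical.

```go
package main

import (
	"fmt"
	"math"
)

// smoothedProb applies additive smoothing to a raw token count:
// (count + smoothing) / (total + smoothing * vocabSize).
// Hypothetical helper; bengal's internal formula may differ.
func smoothedProb(count, total, vocabSize int, smoothing float64) float64 {
	return (float64(count) + smoothing) / (float64(total) + smoothing*float64(vocabSize))
}

func main() {
	// A token never seen with this label has probability 0 without smoothing,
	// which would wipe out the whole product of probabilities.
	fmt.Println(smoothedProb(0, 8, 4, 0))   // prints 0
	fmt.Println(smoothedProb(0, 8, 4, 1.0)) // small but non-zero

	// Multiplying many small probabilities underflows to zero,
	// while summing their logarithms stays finite.
	p := smoothedProb(0, 8, 4, 1.0)
	product, logSum := 1.0, 0.0
	for i := 0; i < 500; i++ {
		product *= p
		logSum += math.Log(p)
	}
	fmt.Println(product) // prints 0
	fmt.Println(logSum)  // a finite negative number
}
```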
