https://github.com/kampsy/gwizo
Simple Go implementation of the Porter Stemmer algorithm with powerful features.
- Host: GitHub
- URL: https://github.com/kampsy/gwizo
- Owner: kampsy
- License: other
- Created: 2016-02-19T22:30:42.000Z (over 10 years ago)
- Default Branch: master
- Last Pushed: 2021-06-03T14:25:58.000Z (about 5 years ago)
- Last Synced: 2025-04-05T14:34:45.153Z (about 1 year ago)
- Topics: consonants, nlp, nlp-stemming, porter-stemmer-algorithm, stemmer, vowel
- Language: Go
- Size: 10.1 MB
- Stars: 27
- Watchers: 4
- Forks: 2
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Authors: AUTHORS
# gwizo

Package gwizo implements the Porter Stemmer algorithm, described in Porter, M. "An algorithm for suffix stripping."
Program 14.3 (1980): 130-137.
Martin Porter, the algorithm's inventor, maintains a web page about the
algorithm at http://www.tartarus.org/~martin/PorterStemmer/
## Installation
To install, simply run in a terminal:

    go get github.com/kampsy/gwizo

## Stem
`Stem` returns the stem of the given word.
```go
package main

import (
	"fmt"

	"github.com/kampsy/gwizo"
)

func main() {
	stem := gwizo.Stem("abilities")
	fmt.Printf("Stem: %s\n", stem)
}
```
```shell
$ go run main.go
Stem: able
```
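
To give a feel for what suffix stripping involves, here is a minimal sketch of step 1a of Porter's algorithm (the plural rules SSES → SS, IES → I, SS → SS, S → nothing). It is an illustration written against the paper's rules, not gwizo's own code, and the function name `step1a` is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// step1a is a hypothetical helper that applies the plural rules from
// step 1a of Porter's paper. Suffixes are tried longest first and only
// the first matching rule is applied.
func step1a(word string) string {
	switch {
	case strings.HasSuffix(word, "sses"):
		return strings.TrimSuffix(word, "sses") + "ss"
	case strings.HasSuffix(word, "ies"):
		return strings.TrimSuffix(word, "ies") + "i"
	case strings.HasSuffix(word, "ss"):
		return word // "ss" is left unchanged
	case strings.HasSuffix(word, "s"):
		return strings.TrimSuffix(word, "s")
	}
	return word
}

func main() {
	for _, w := range []string{"caresses", "ponies", "caress", "cats"} {
		fmt.Printf("%s -> %s\n", w, step1a(w))
	}
}
```

The full algorithm runs several more steps after this one, most of them conditioned on the measure of the remaining stem.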
## Vowels, Consonants and Measure
`gwizo.Parse` returns a `Token` with two fields: `VowCon`, the word's vowel-consonant pattern,
and `Measure`, the value m in Porter's form [C](VC){m}[V].
```go
package main

import (
	"fmt"
	"strings"

	"github.com/kampsy/gwizo"
)

func main() {
	word := "abilities"
	token := gwizo.Parse(word)

	// VowCon: one "v" or "c" per letter
	fmt.Printf("%s has Pattern %s \n", word, token.VowCon)

	// Measure value m in [C](VC){m}[V]
	fmt.Printf("%s has Measure value %d \n", word, token.Measure)

	// Number of Vowels
	v := strings.Count(token.VowCon, "v")
	fmt.Printf("%s Has %d Vowels \n", word, v)

	// Number of Consonants
	c := strings.Count(token.VowCon, "c")
	fmt.Printf("%s Has %d Consonants\n", word, c)
}
```
```bash
$ go run main.go
abilities has Pattern vcvcvcvvc
abilities has Measure value 4
abilities Has 5 Vowels
abilities Has 4 Consonants
```
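
The pattern and measure above can be reproduced by hand. The sketch below is a standalone illustration that follows the definitions in Porter's paper; it assumes lowercase ASCII input and does not show how gwizo computes these values internally. The helper names `isConsonant`, `pattern` and `measure` are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// isConsonant reports whether word[i] is a consonant under Porter's
// definition: any letter other than a, e, i, o, u, and other than a y
// that is preceded by a consonant.
func isConsonant(word string, i int) bool {
	switch word[i] {
	case 'a', 'e', 'i', 'o', 'u':
		return false
	case 'y':
		// A leading y is a consonant; elsewhere y is a consonant only
		// when the previous letter is a vowel.
		if i == 0 {
			return true
		}
		return !isConsonant(word, i-1)
	}
	return true
}

// pattern returns one 'v' or 'c' per letter of a lowercase ASCII word.
func pattern(word string) string {
	var b strings.Builder
	for i := 0; i < len(word); i++ {
		if isConsonant(word, i) {
			b.WriteByte('c')
		} else {
			b.WriteByte('v')
		}
	}
	return b.String()
}

// measure counts the places where a vowel is directly followed by a
// consonant, which equals m in [C](VC){m}[V].
func measure(p string) int {
	return strings.Count(p, "vc")
}

func main() {
	p := pattern("abilities")
	fmt.Println(p, measure(p))
}
```

Counting "vc" boundaries works because every run of vowels followed by a run of consonants contributes exactly one such boundary.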
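## File Stem Performance
The program below stems each line of `input.txt`, writes the stems to `stem.txt`, and prints the elapsed time.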
```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"os"
	"time"

	"github.com/kampsy/gwizo"
)

func main() {
	start := time.Now()
	writeOut()
	elapsed := time.Since(start)
	fmt.Println("============================")
	fmt.Println("Done After:", elapsed)
	fmt.Println("============================")
}

// writeOut stems every line of input.txt and writes one stem per line
// to stem.txt.
func writeOut() {
	data, err := os.ReadFile("input.txt")
	if err != nil {
		fmt.Println(err)
		return
	}

	out, err := os.Create("stem.txt")
	if err != nil {
		fmt.Println(err)
		return
	}
	defer out.Close()

	scanner := bufio.NewScanner(bytes.NewReader(data))
	for scanner.Scan() {
		txt := scanner.Text()
		stem := gwizo.Stem(txt)
		if _, err := out.WriteString(stem + "\n"); err != nil {
			fmt.Println(err)
			return
		}
		fmt.Println(txt, "--->", stem)
	}
	if err := scanner.Err(); err != nil {
		fmt.Println(err)
	}
}
```
```shell
$ go run main.go
```