Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/BlackRabbitt/mspm
Multi-String Pattern Matching Algorithm Using TrieNode
aho-corasick golang multi-search search trie
Last synced: about 2 months ago
- Host: GitHub
- URL: https://github.com/BlackRabbitt/mspm
- Owner: BlackRabbitt
- License: bsd-3-clause
- Created: 2018-05-17T18:59:44.000Z (over 6 years ago)
- Default Branch: master
- Last Pushed: 2018-05-19T06:36:38.000Z (over 6 years ago)
- Last Synced: 2024-07-31T20:48:39.098Z (4 months ago)
- Topics: aho-corasick, golang, multi-search, search, trie
- Language: Go
- Homepage:
- Size: 12.7 KB
- Stars: 25
- Watchers: 3
- Forks: 4
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-go - mspm - Multi-String Pattern Matching Algorithm for information retrieval. (Data Structures and Algorithms / Text Analysis)
- awesome-go - mspm - Multi-String Pattern Matching Algorithm Using TrieHashNode - ★ 3 (Data Structures)
- awesome-go-extra - mspm - String Pattern Matching Algorithm Using TrieHashNode|17|4|0|2018-05-17T18:59:44Z|2018-05-19T06:36:38Z| (Generators / Text Analysis)
README
# Multi-String Pattern Matching Algorithm
[![Go Report Card](https://goreportcard.com/badge/github.com/BlackRabbitt/mspm)](https://goreportcard.com/report/github.com/BlackRabbitt/mspm)
[![GoDoc](https://godoc.org/github.com/BlackRabbitt/mspm?status.svg)](https://godoc.org/github.com/BlackRabbitt/mspm)

This implementation is inspired by the [Aho-Corasick algorithm](https://en.wikipedia.org/wiki/Aho%E2%80%93Corasick_algorithm).
## Getting Started
```
// words is a newline-separated list of words/keywords to search for.
modelA := mspm.NewModel("mspm_model_A")
patternsToSearch := strings.NewReader(words)

// Build the mspm model with the patterns.
modelA.Build(patternsToSearch)

inputString := "input document where the above pattern is searched for."
document := strings.NewReader(inputString)
output, err := modelA.MultiTermMatch(document)
// output ~= [{matched_word: n_count}, ..]
```

## Test Coverage
* [trie package](https://gocover.io/github.com/blackrabbitt/mspm/ds/trie)
* [mspm package](https://gocover.io/github.com/blackrabbitt/mspm/search)

## TrieNode vs TrieHashNode benchmark
```
$ cd github.com/BlackRabbitt/mspm/ds/trie
$ go test -bench=.
goos: darwin
goarch: amd64
pkg: github.com/BlackRabbitt/mspm/ds/trie
BenchmarkTrieNodeInsert-4 500000 2582 ns/op
BenchmarkTrieNodeSearch-4 10000000 205 ns/op
BenchmarkTrieHashNodeInsert-4 1000000 1491 ns/op
BenchmarkTrieHashNodeSearch-4 10000000 206 ns/op
PASS
ok github.com/BlackRabbitt/mspm/ds/trie 8.795s
```

## Resources
1. [Trie](https://en.wikipedia.org/wiki/Trie)
2. [mspm](http://www.ijsrp.org/research_paper_jul2012/ijsrp-july-2012-101.pdf) - Multi-String Pattern Matching algorithm. Generally used for Information Retrieval.
3. [Aho-Corasick algorithm](https://en.wikipedia.org/wiki/Aho%E2%80%93Corasick_algorithm)