https://github.com/gnames/aho_corasick
Implementation of Aho-Corasick algorithm
- Host: GitHub
- URL: https://github.com/gnames/aho_corasick
- Owner: gnames
- License: mit
- Created: 2021-09-20T23:39:32.000Z (almost 5 years ago)
- Default Branch: master
- Last Pushed: 2024-11-11T17:57:26.000Z (over 1 year ago)
- Last Synced: 2024-11-11T18:38:18.834Z (over 1 year ago)
- Language: Go
- Size: 2.48 MB
- Stars: 0
- Watchers: 5
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# aho_corasick
A Go implementation of the Aho-Corasick algorithm for efficient
multiple-pattern matching within a string.
## Introduction
This project is a Go implementation of the [Aho-Corasick
algorithm](https://dl.acm.org/doi/10.1145/360825.360855), a string searching
algorithm invented by Alfred V. Aho and Margaret J. Corasick. The algorithm is
useful because it finds all occurrences of a list of keywords within a text
string in a single pass over the text.
This implementation searches at the letter level instead of the word level.
Both [failure links](https://www.youtube.com/watch?v=O7_w001f58c) and
[dictionary links](https://www.youtube.com/watch?v=OFKxWFew_L0) are
implemented.
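To illustrate how failure links and dictionary links work together, here is a minimal, self-contained sketch of a letter-level Aho-Corasick automaton. It is not the code of this package: the names `node`, `match`, `build`, and `search` are hypothetical and chosen only for this example, and the real implementation may be organized differently.

```go
package main

import "fmt"

// node is one trie state. All names in this sketch are illustrative and
// are not taken from the gnames/aho_corasick source.
type node struct {
	next    map[rune]*node // child transitions, one per letter
	fail    *node          // failure link: longest proper suffix that is also a trie prefix
	dict    *node          // dictionary link: nearest pattern-ending node on the failure chain
	pattern string         // non-empty if a pattern ends at this node
}

func newNode() *node { return &node{next: map[rune]*node{}} }

// match records a pattern and the rune offset at which it starts.
type match struct {
	Pattern string
	Start   int
}

// build inserts the patterns into a trie, then sets failure and
// dictionary links in breadth-first order.
func build(patterns []string) *node {
	root := newNode()
	for _, p := range patterns {
		n := root
		for _, r := range p {
			if n.next[r] == nil {
				n.next[r] = newNode()
			}
			n = n.next[r]
		}
		n.pattern = p
	}
	root.fail = root
	queue := []*node{root}
	for len(queue) > 0 {
		cur := queue[0]
		queue = queue[1:]
		for r, child := range cur.next {
			// Walk the parent's failure chain until a state has an r-transition.
			f := cur.fail
			for f != root && f.next[r] == nil {
				f = f.fail
			}
			if cur != root && f.next[r] != nil {
				child.fail = f.next[r]
			} else {
				child.fail = root // depth-1 nodes and dead ends fall back to root
			}
			// The dictionary link skips failure states that end no pattern.
			if child.fail.pattern != "" {
				child.dict = child.fail
			} else {
				child.dict = child.fail.dict
			}
			queue = append(queue, child)
		}
	}
	return root
}

// search scans the haystack once and reports every pattern occurrence.
func search(root *node, haystack string) []match {
	var out []match
	cur := root
	pos := 0 // rune index of the current letter
	for _, r := range haystack {
		for cur != root && cur.next[r] == nil {
			cur = cur.fail
		}
		if n := cur.next[r]; n != nil {
			cur = n
		}
		// Report the pattern ending here, then any shorter ones via dictionary links.
		for n := cur; n != nil; n = n.dict {
			if n.pattern != "" {
				out = append(out, match{n.pattern, pos - len([]rune(n.pattern)) + 1})
			}
		}
		pos++
	}
	return out
}

func main() {
	root := build([]string{"aba", "cla", "ac", "gee", "lan"})
	for _, m := range search(root, "abacgeeaba") {
		fmt.Printf("%s at %d\n", m.Pattern, m.Start)
	}
}
```

For the patterns and haystack used elsewhere in this README, the sketch reports `aba` at 0, `ac` at 2, `gee` at 4, and `aba` at 7. Dictionary links matter when one pattern is a suffix of another: with patterns `he` and `she`, reaching the end of `she` also reports `he` without rescanning.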
## Installation
The Go module is installable by running:
```bash
go get github.com/gnames/aho_corasick
```
## Usage
Create a new `aho_corasick` instance with `aho_corasick.New()` and set up the
automaton with the search patterns using `ac.Setup(patterns)`. Run a search with
`ac.Search(haystack)`, which returns an array of matches.
```go
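// Create the automaton and load the search patterns.
ac := aho_corasick.New()
patterns := []string{"aba", "cla", "ac", "gee", "lan"}
ac.Setup(patterns)

// Search the haystack for all occurrences of the patterns.
haystack := "abacgeeaba"
matches := ac.Search(haystack)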
```
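The README does not show the shape of the values returned by `Search`, so the following sketch does not use this package at all. It is a naive reference search built only on the standard library, which can be handy for checking which occurrences to expect from the example above. The function name `naiveFind` is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// naiveFind returns, for each pattern, every byte offset at which it
// occurs in haystack, including overlapping occurrences. It rescans the
// haystack once per pattern, which is what Aho-Corasick avoids.
func naiveFind(haystack string, patterns []string) map[string][]int {
	found := make(map[string][]int)
	for _, p := range patterns {
		if p == "" {
			continue // an empty pattern would match everywhere
		}
		for from := 0; from <= len(haystack)-len(p); {
			i := strings.Index(haystack[from:], p)
			if i < 0 {
				break
			}
			found[p] = append(found[p], from+i)
			from += i + 1 // step one byte so overlapping matches are kept
		}
	}
	return found
}

func main() {
	patterns := []string{"aba", "cla", "ac", "gee", "lan"}
	found := naiveFind("abacgeeaba", patterns)
	for _, p := range patterns {
		fmt.Println(p, found[p])
	}
}
```

For the example haystack this finds `aba` at offsets 0 and 7, `ac` at 2, and `gee` at 4, while `cla` and `lan` do not occur.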
## Development
If you find a bug, please open an
[issue](https://github.com/gnames/aho_corasick/issues) ticket. Pull requests
are welcome.
Tests can be run with `go test`, which also prints a text visualization of the trie:
```text
******* Trie *******
haystack: abacgeeaba
root->root ┬─ a->root ┬─ b->root ── a*->a
           │          └─ c*->c
           ├─ c->root ── l->l ── a*->a
           ├─ g->root ── e->root ── e*->root
           └─ l->root ── a->a ── n*->root
********************
PASS
```
In the trie output, `root` refers to the root node, `*` marks word nodes (nodes
where a pattern ends), `->` indicates failure links, and `|` indicates
dictionary links.
Trie output can also be produced with the debugger, which can be run with:
```go
ac := aho_corasick.New()
haystack := "geeabaclaba"
patterns := []string{"aba", "cla", "ac", "gee", "lan"}
ac.Setup(patterns)

// Produce the trie output for this haystack.
ac.Debug(haystack)
```
## License
Released under the [MIT License](https://opensource.org/licenses/MIT).
## Authors
* [Dmitry Mozzherin]
* [Geoff Ower]
[Dmitry Mozzherin]: https://github.com/dimus
[Geoff Ower]: https://github.com/gdower