https://github.com/wikimedia/ahocorasick
A PHP implementation of the Aho-Corasick string search algorithm. Mirror from https://gerrit.wikimedia.org/g/AhoCorasick - our actual code is hosted with Gerrit (please see https://www.mediawiki.org/wiki/Developer_access for contributing)
- Host: GitHub
- URL: https://github.com/wikimedia/ahocorasick
- Owner: wikimedia
- License: apache-2.0
- Created: 2015-06-07T21:17:36.000Z (over 9 years ago)
- Default Branch: master
- Last Pushed: 2025-01-31T00:03:08.000Z (5 days ago)
- Last Synced: 2025-02-02T09:57:00.816Z (2 days ago)
- Topics: aho-corasick, ahocorasick, algorithm
- Language: PHP
- Homepage:
- Size: 98.6 KB
- Stars: 50
- Watchers: 17
- Forks: 10
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: HISTORY.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
README
[![Packagist.org](https://img.shields.io/packagist/v/wikimedia/aho-corasick.svg?style=flat)](https://packagist.org/packages/wikimedia/aho-corasick)
AhoCorasick
===========

AhoCorasick is a PHP implementation of the [Aho-Corasick][1] string search
algorithm, which is an efficient way of searching a body of text for multiple
search keywords.

Here is how you use it:

```php
use AhoCorasick\MultiStringMatcher;

$keywords = new MultiStringMatcher( array( 'ore', 'hell' ) );

$keywords->searchIn( 'She sells sea shells by the sea shore.' );
// Result: array( array( 15, 'hell' ), array( 34, 'ore' ) )

$keywords->searchIn( 'Say hello to more text. MultiStringMatcher objects are reusable!' );
// Result: array( array( 4, 'hell' ), array( 14, 'ore' ) )
```

Features
--------

The algorithm works by constructing a finite-state machine out of the set of
search keywords. The time it takes to construct the finite-state machine is
proportional to the sum of the lengths of the search keywords. Once
constructed, the machine can locate all occurrences of all search keywords in
any body of text in a single pass, making exactly one state transition per
input character.

The algorithm originates from ["Efficient string matching: an aid to bibliographic search"][paper] (CACM, Volume 18, Issue 6, June 1975) by Alfred V. Aho and Margaret J. Corasick.
See also the definition and reference implementation on [nist.gov][dads].
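
To make the construction and the single-pass search more concrete, here is a minimal, self-contained sketch of the textbook algorithm. It is not this library's code: the function names `buildAutomaton` and `searchText` are hypothetical, keywords are assumed to be non-empty byte strings, and PHP 7.1 or later is assumed. The sketch follows failure links at search time, so an individual character may trigger more than one step, although the total work stays linear in the length of the text.

```php
<?php
// Illustrative sketch only; the names below are hypothetical and are not
// part of the AhoCorasick\MultiStringMatcher API.

/**
 * Build a trie of the keywords, then add failure links and outputs.
 * Returns [ $goto, $fail, $out ], each indexed by state number (0 = root).
 */
function buildAutomaton( array $keywords ): array {
	$goto = [ [] ]; // $goto[state][char] => next state
	$fail = [ 0 ];  // $fail[state] => fallback state on a mismatch
	$out = [ [] ];  // $out[state] => keywords ending at this state

	// Phase 1: insert every keyword into the trie.
	foreach ( $keywords as $keyword ) {
		$state = 0;
		$length = strlen( $keyword );
		for ( $i = 0; $i < $length; $i++ ) {
			$char = $keyword[$i];
			if ( !isset( $goto[$state][$char] ) ) {
				$goto[] = [];
				$fail[] = 0;
				$out[] = [];
				$goto[$state][$char] = count( $goto ) - 1;
			}
			$state = $goto[$state][$char];
		}
		$out[$state][] = $keyword;
	}

	// Phase 2: breadth-first walk; depth-1 states keep the root as fallback.
	$queue = array_values( $goto[0] );
	for ( $head = 0; $head < count( $queue ); $head++ ) {
		$state = $queue[$head];
		foreach ( $goto[$state] as $char => $next ) {
			$queue[] = $next;
			// Find the longest proper suffix that is also a trie path.
			$fallback = $fail[$state];
			while ( $fallback !== 0 && !isset( $goto[$fallback][$char] ) ) {
				$fallback = $fail[$fallback];
			}
			$fail[$next] = $goto[$fallback][$char] ?? 0;
			// Inherit keywords that end at the fallback state.
			$out[$next] = array_merge( $out[$next], $out[$fail[$next]] );
		}
	}

	return [ $goto, $fail, $out ];
}

/**
 * Scan $text once and return a list of [ byte offset, keyword ] pairs,
 * ordered by the position at which each match ends.
 */
function searchText( array $automaton, string $text ): array {
	[ $goto, $fail, $out ] = $automaton;
	$matches = [];
	$state = 0;
	$length = strlen( $text );
	for ( $i = 0; $i < $length; $i++ ) {
		$char = $text[$i];
		// Fall back until some state accepts $char, or the root is reached.
		while ( $state !== 0 && !isset( $goto[$state][$char] ) ) {
			$state = $fail[$state];
		}
		$state = $goto[$state][$char] ?? 0;
		foreach ( $out[$state] as $keyword ) {
			$matches[] = [ $i - strlen( $keyword ) + 1, $keyword ];
		}
	}
	return $matches;
}

$automaton = buildAutomaton( [ 'ore', 'hell' ] );
var_export( searchText( $automaton, 'She sells sea shells by the sea shore.' ) );
// Matches: [ 15, 'hell' ] and [ 34, 'ore' ]
```

The automaton is built once and can be reused for any number of texts, which mirrors the way a `MultiStringMatcher` object is reused in the example near the top of this README.
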

[paper]: https://doi.org/10.1145/360825.360855
[dads]: http://xlinux.nist.gov/dads/HTML/ahoCorasick.html

Contribute
----------

- Issue tracker: https://phabricator.wikimedia.org/tag/ahocorasick/
- Source code: https://gerrit.wikimedia.org/g/AhoCorasick

Support
-------

If you are having issues, [please let us know][2].

License
-------

The project is licensed under the Apache license.

[1]: https://en.wikipedia.org/wiki/Aho%E2%80%93Corasick_string_matching_algorithm
[2]: https://phabricator.wikimedia.org/maniphest/task/create/?projects=PHID-PROJ-hs5ausnvlfs4e3n5gmzg