https://github.com/rootslab/qap
Qap is a quick parser/matcher for string or buffer patterns ( optimized for pattern lengths <= 255 bytes ).
https://github.com/rootslab/qap
pattern-matching quicksearch search-algorithm
Last synced: 3 months ago
JSON representation
Qap is a quick parser/matcher for string or buffer patterns ( optimized for pattern lengths <= 255 bytes ).
- Host: GitHub
- URL: https://github.com/rootslab/qap
- Owner: rootslab
- License: mit
- Created: 2012-10-28T14:34:14.000Z (over 12 years ago)
- Default Branch: master
- Last Pushed: 2018-02-10T14:41:14.000Z (over 7 years ago)
- Last Synced: 2025-02-27T03:25:55.148Z (4 months ago)
- Topics: pattern-matching, quicksearch, search-algorithm
- Language: JavaScript
- Homepage:
- Size: 90.8 KB
- Stars: 5
- Watchers: 2
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: Readme.md
- License: LICENSE
Awesome Lists containing this project
README
### Qap
[](https://www.npmjs.org/package/qap)
[](https://www.codacy.com/public/44gatti/qap)
[](https://codeclimate.com/github/rootslab/qap)
[](https://github.com/rootslab/qap#mit-license)
[](http://travis-ci.org/rootslab/qap)
[](https://david-dm.org/rootslab/qap)
[](https://david-dm.org/rootslab/qap#info=devDependencies)[](http://npm-stat.com/charts.html?package=qap)
[](https://nodei.co/npm/qap/)
* __Qap__ is a quick parser for string or buffer patterns.
* It is optimized for using with pattern strings <= 255 bytes.
* Better results are achieved with long and sparse patterns.
* It is an implementation of QuickSearch algorithm.### Main features
> Given a m-length pattern and n-length data and σ-length alphabet ( σ = 256 ):
- simplification of the Boyer-Moore algorithm ( *see [Bop](https://github.com/rootslab/bop)* ).
- uses only a bad-character shift table.
- preprocessing phase in __O(m+σ)__ time and __O(σ)__ space complexity.
- searching phase in __O(m*n)__ time complexity.
- very fast in practice for short patterns and large alphabets.> See __[Lecroq](http://www-igm.univ-mlv.fr/~lecroq/string/node19.html)__ for reference and also __[Bop](https://github.com/rootslab/bop)__, a Boyer-Moore parser.
### Install
```bash
$ npm install qap [-g]
```> __require__:
```javascript
var Qap = require( 'qap' );
```### Run Tests
```javascript
$cd qap/
$npm test
```### Run Benchmarks
```bash
$ cd qap/
$ npm run-script bench
```### Constructor
> Create an instance with a Buffer or String pattern.
```javascript
Qap( Buffer || String pattern )
// or
neq Qap( Buffer || String pattern )
```### Methods
> List all pattern occurrences into a String or Buffer data.
> It returns a new array of indexes, or populates an array passed as the last argument to parse method.```javascript
// slower with String
Qap#parse( String data [, Number startFromIndex [, Number limitResultsTo [, Array array ] ] ] ) : Array// faster with Buffer
Qap#parse( Buffer data [, Number startFromIndex [, Number limitResultsTo [, Array array ] ] ] ) : Array
```> Change the pattern :
```javascript
Qap#set( Buffer || String pattern ) : Buffer
```### Usage Example
```javascript
var log = console.log
, assert = require( 'assert' )
, Qap = require( 'qap' )
, pattern = 'hellofolks\r\n\r\n'
, text = 'hehe' + pattern +'loremipsumhellofolks\r\n' + pattern
, bresult = null
;// create an instance and parse the pattern
var qap = Qap( pattern )
// parse data from beginning
, results = qap.parse( text )
;// set a new Buffer pattern
qap.set( new Buffer( pattern ) );// parse data Buffer instead of a String
bresults = qap.parse( new Buffer( text ) );// parser results ( starting indexes ) [ 4, 40 ]
log( results, bresults );// results are the same
assert.deepEqual( results, bresults );```
#### Benchmark for a small pattern ( length <= 255 bytes )
> Parser uses a Buffer 256-bytes long to build the shifting table, then:
> - Pattern parsing / table creation space and time complexity is O(σ).
> - Very low memory footprint.
> - Ultra fast to preprocess pattern ( = table creation ).```bash
$ node bench/small-pattern-data-rate
```for default it:
> - uses a pattern string of 57 bytes/chars
> - builds a data buffer of 700 MB in memory
> - uses a redundancy/distance factor for pattern strings equal to 2. The bigger the value,
the lesser are occurrences of pattern string into the text buffer.**Custom Usage**:
```bash
# with [testBufferSizeInMB] [distanceFactor] [aStringPattern]
$ node bench/small-pattern-data-rate.js 700 4 "that'sallfolks"
```#### Benchmark for a big pattern ( length > 255 bytes )
> Parser uses one Array to build the shifting table for a big pattern, then:
> - table has a size of 256 elements, every element is an integer value that
> could be between 0 and the pattern length.
> - Fast to preprocess pattern ( = table creation ).
> - Low memory footprint```bash
$ node bench/big-pattern-data-rate
```> - it uses a pattern size of 20MB
> - builds a data buffer of 300MB copying pattern 12 timesSee __[bench](./bench)__ dir.
### MIT License
> Copyright (c) 2013-present < Guglielmo Ferri : [email protected] >
> Permission is hereby granted, free of charge, to any person obtaining
> a copy of this software and associated documentation files (the
> 'Software'), to deal in the Software without restriction, including
> without limitation the rights to use, copy, modify, merge, publish,
> distribute, sublicense, and/or sell copies of the Software, and to
> permit persons to whom the Software is furnished to do so, subject to
> the following conditions:> __The above copyright notice and this permission notice shall be
> included in all copies or substantial portions of the Software.__> THE SOFTWARE IS PROVIDED 'AS IS', WITHOUT WARRANTY OF ANY KIND,
> EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
> IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
> CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
> TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
> SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.