https://github.com/jirikuncar/esmre
Automatically exported from code.google.com/p/esmre
- Host: GitHub
- URL: https://github.com/jirikuncar/esmre
- Owner: jirikuncar
- License: lgpl-2.1
- Created: 2015-06-25T12:57:00.000Z (over 9 years ago)
- Default Branch: master
- Last Pushed: 2015-07-12T18:32:49.000Z (over 9 years ago)
- Last Synced: 2024-10-31T10:02:09.814Z (2 months ago)
- Language: C
- Size: 242 KB
- Stars: 1
- Watchers: 3
- Forks: 1
- Open Issues: 6
Metadata Files:
- Readme: README
- License: COPYING
esmre - Efficient String Matching Regular Expressions
=====================================================

esmre is a Python module that can be used to speed up the execution of a large
collection of regular expressions. It works by building an index of compulsory
substrings from a collection of regular expressions, which it uses to quickly
exclude those expressions that trivially cannot match a given input.

Here is some example code that uses esmre:

>>> import esmre
>>> index = esmre.Index()
>>> index.enter(r"Major-General\W*$", "savoy opera")
>>> index.enter(r"\bway\W+haye?\b", "sea shanty")
>>> index.query("I am the very model of a modern Major-General.")
['savoy opera']
>>> index.query("Way, hay up she rises,")
['sea shanty']
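
To illustrate the idea of excluding expressions by their compulsory substrings, here is a minimal pure-Python sketch. It is not how esmre is implemented: the class name `PrefilterIndex` and its `hint` argument are hypothetical, the compulsory substring is supplied by hand instead of being derived from the pattern, and case-insensitive matching is an assumption made so that the sketch reproduces the results shown above.

```python
import re


class PrefilterIndex:
    """Toy prefilter: each regex is stored with a literal substring
    (the "hint") that every match of that regex must contain."""

    def __init__(self):
        self._entries = []  # list of (hint, compiled regex, object)

    def enter(self, pattern, hint, obj):
        # Assumption: the caller supplies the hint by hand. esmre works
        # out its compulsory substrings from the pattern itself.
        compiled = re.compile(pattern, re.IGNORECASE)
        self._entries.append((hint.lower(), compiled, obj))

    def query(self, text):
        lowered = text.lower()
        results = []
        for hint, regex, obj in self._entries:
            # Cheap substring test first; the regex only runs when the
            # hint is present, so most non-matching patterns are skipped.
            if hint in lowered and regex.search(text):
                results.append(obj)
        return results


index = PrefilterIndex()
index.enter(r"Major-General\W*$", "major-general", "savoy opera")
index.enter(r"\bway\W+haye?\b", "way", "sea shanty")

print(index.query("I am the very model of a modern Major-General."))
print(index.query("Way, hay up she rises,"))
```

This sketch tests each hint with a separate substring search, so its cost still grows with the number of patterns. Looking up all hints in a single pass over the input is the job of a string matching index such as the one provided by esm.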

The esmre module builds on the simpler string matching facilities of the esm
module, which wraps a C implementation of some of the algorithms described in
Aho and Corasick's paper on efficient string matching [Aho, A. V., and
Corasick, M. J. Efficient String Matching: An Aid to Bibliographic Search.
Comm. ACM 18:6 (June 1975), 333-340]. Some minor modifications have been made
to the algorithms in the paper and one algorithm is missing (for now), but
there is enough to implement a quick string matching index.

Here is some example code that uses esm directly:

>>> import esm
>>> index = esm.Index()
>>> index.enter("he")
>>> index.enter("she")
>>> index.enter("his")
>>> index.enter("hers")
>>> index.fix()
>>> index.query("this here is history")
[((1, 4), 'his'), ((5, 7), 'he'), ((13, 16), 'his')]
>>> index.query("Those are his sheep!")
[((10, 13), 'his'), ((14, 17), 'she'), ((15, 17), 'he')]
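
For readers who want to see what such an index does internally, here is a small pure-Python sketch of the textbook Aho-Corasick construction (goto, failure and output functions). It is not the esm C code, which according to this README modifies the paper's algorithms. The class name `TinyAhoCorasick` is hypothetical, and the `((start, end), keyword)` result shape is simply copied from the examples above.

```python
from collections import deque


class TinyAhoCorasick:
    """Illustrative Aho-Corasick matcher, not the esm implementation."""

    def __init__(self):
        self._goto = [{}]   # state -> {character: next state}
        self._out = [[]]    # state -> keywords recognised in this state
        self._fail = [0]    # state -> fallback state on a mismatch
        self._fixed = False

    def enter(self, keyword):
        if self._fixed:
            raise RuntimeError("cannot enter keywords after fix()")
        state = 0
        for ch in keyword:
            nxt = self._goto[state].get(ch)
            if nxt is None:
                # Grow the keyword trie by one state.
                nxt = len(self._goto)
                self._goto.append({})
                self._out.append([])
                self._fail.append(0)
                self._goto[state][ch] = nxt
            state = nxt
        self._out[state].append(keyword)

    def fix(self):
        # Breadth-first walk so that shallower states are finished
        # before the deeper states that fall back to them.
        queue = deque(self._goto[0].values())
        while queue:
            state = queue.popleft()
            for ch, nxt in self._goto[state].items():
                queue.append(nxt)
                f = self._fail[state]
                while f and ch not in self._goto[f]:
                    f = self._fail[f]
                self._fail[nxt] = self._goto[f].get(ch, 0)
                # Inherit keywords that end at the fallback state.
                self._out[nxt] = self._out[nxt] + self._out[self._fail[nxt]]
        self._fixed = True

    def query(self, text):
        if not self._fixed:
            raise RuntimeError("call fix() before query()")
        state = 0
        matches = []
        for i, ch in enumerate(text):
            while state and ch not in self._goto[state]:
                state = self._fail[state]
            state = self._goto[state].get(ch, 0)
            for keyword in self._out[state]:
                matches.append(((i + 1 - len(keyword), i + 1), keyword))
        return matches


index = TinyAhoCorasick()
for word in ("he", "she", "his", "hers"):
    index.enter(word)
index.fix()
print(index.query("this here is history"))
print(index.query("Those are his sheep!"))
```

The sketch scans the input once, whatever the number of keywords, and for these two inputs it returns the same lists as the esm session above.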

You can see more usage examples in the tests.