Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/yanivmo/ruler
Humane formal grammar library for Python
- Host: GitHub
- URL: https://github.com/yanivmo/ruler
- Owner: yanivmo
- License: mit
- Created: 2016-09-25T04:54:53.000Z (over 8 years ago)
- Default Branch: master
- Last Pushed: 2022-12-26T20:30:45.000Z (about 2 years ago)
- Last Synced: 2024-10-30T16:44:31.947Z (3 months ago)
- Topics: grammar, regex, regular-expression
- Language: Python
- Homepage:
- Size: 74.2 KB
- Stars: 2
- Watchers: 2
- Forks: 0
- Open Issues: 12
Metadata Files:
- Readme: README.rst
- License: LICENSE
README
*****
Ruler
*****

.. image:: https://travis-ci.org/yanivmo/ruler.svg?branch=master
    :target: https://travis-ci.org/yanivmo/ruler
    :alt: Build status

.. image:: https://landscape.io/github/yanivmo/ruler/master/landscape.svg?style=flat
    :target: https://landscape.io/github/yanivmo/ruler/master
    :alt: Code Health

.. image:: https://coveralls.io/repos/github/yanivmo/ruler/badge.svg?branch=master
    :target: https://coveralls.io/github/yanivmo/ruler?branch=master

.. image:: https://img.shields.io/pypi/v/ruler.svg
    :target: https://pypi.python.org/pypi/ruler

Ruler is a lightweight regular expression wrapper that aims to make regex definitions more
modular, intuitive and readable, and mismatch reporting more informative.

Installation
============
::

    pip install ruler

Quick start
===========

Let's implement the following grammar, given in EBNF_::

    grammar = who, ' likes to drink ', what;
    who = 'John' | 'Peter' | 'Ann' | 'Paul' | 'Rachel';
    what = tea | juice;
    juice = 'juice';
    tea = 'tea', [milk];
    milk = ' with milk';

Using ruler it looks almost identical to EBNF_:

>>> class Morning(Grammar):
...     who = OneOf('John', 'Peter', 'Ann', 'Paul', 'Rachel')
...     juice = Rule('juice')
...     milk = Optional(' with milk')
...     tea = Rule('tea', milk)
...     what = OneOf(juice, tea)
...     grammar = Rule(who, ' likes to drink ', what, r'\.')
...
>>> morning = Morning.create()

A member named ``grammar`` must always be present; it acts as the start rule.

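
For comparison, here is a rough hand-written equivalent of the same grammar using only the standard
``re`` module. It is a sketch for illustration: the group names simply mirror the rule names above,
and it is not meant to represent the pattern that ruler builds internally.

```python
import re

# Hand-written counterpart of the Morning grammar (illustration only,
# not the pattern ruler generates).
MORNING = re.compile(
    r"(?P<who>John|Peter|Ann|Paul|Rachel)"
    r" likes to drink "
    r"(?P<what>(?P<juice>juice)|(?P<tea>tea(?P<milk> with milk)?))"
    r"\."
)

m = MORNING.match("John likes to drink tea.")
print(m.group("who"))   # John
print(m.group("what"))  # tea

# With plain re, a failed match is just None: there is no record of
# where the text stopped matching or which alternatives were tried.
print(MORNING.match("John likes to drink coffee"))  # None
```

The whole grammar ends up in one flat pattern with a flat set of group names, which is the kind of
definition the class-based form is meant to keep modular.
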
Let's begin with a mismatch:

>>> morning.match('John likes to drink coffee')
False

``match()`` returns ``True`` if the match was successful and ``False`` otherwise.
One of the major advantages of ``ruler``, as opposed to working directly with regular expressions,
is the ability to know exactly what went wrong:

>>> print(morning.error.long_description)
Mismatch at 20:
John likes to drink coffee
                    ^
"coffee" does not match "juice"
"coffee" does not match "tea"

Let's fix our text:

>>> morning.match('John likes to drink tea.')
True

Any rule that is declared as a member variable of your grammar class acts as a named capture group
arranged hierarchically. Use the ``matched`` attribute to retrieve the text matched by a specific rule:

>>> morning.matched
'John likes to drink tea.'
>>> morning.who.matched
'John'
>>> morning.what.matched
'tea'

Branches of ``OneOf`` rules that didn't match and optional rules that didn't match have ``None`` as
their values, making it easy to ask whether they matched:

>>> morning.what.juice.matched is None
True
>>> morning.what.tea.matched is None
False
>>> morning.what.tea.milk.matched is None
True

Rules can be reused multiple times. If the same rule appears multiple times under the same parent,
these rules are collected into a list:

>>> class Morning(Grammar):
...     person = OneOf('John', 'Peter', 'Ann', 'Paul', 'Rachel')
...     who = Rule(person, Optional(', ', person), Optional(' and ', person))
...     juice = Rule('juice')
...     milk = Optional(' with milk')
...     tea = Rule('tea', milk)
...     what = OneOf(juice, tea)
...     grammar = Rule(who, ' like', Optional('s'), ' to drink ', what, r'\.')
...
>>> morning = Morning.create()
>>> morning.match('Peter, Rachel and Ann like to drink juice.')
True
>>> morning.who.matched
'Peter, Rachel and Ann'
>>> morning.who.person[0].matched
'Peter'
>>> morning.who.person[1].matched
'Rachel'
>>> morning.who.person[2].matched
'Ann'

Notice that, in the grammar above, two of the ``person`` rules are nested inside ``Optional`` rules
rather than being direct children of ``who``, yet they are still accessed as such. That is because
when a rule hierarchy is built, a rule is placed under its closest named ancestor.

Rules' string arguments may actually be any valid regular expression. So we could rewrite our
grammar like this:

>>> class Morning(Grammar):
...     who = OneOf(r'\w+')
...     juice = Rule('juice')
...     milk = Optional(' with milk')
...     tea = Rule('tea', milk)
...     what = OneOf(juice, tea)
...     grammar = Rule(who, ' likes to drink ', what, r'\.')
...
>>> morning = Morning()
>>> morning.match('R2D2 likes to drink juice. And nothing else matters.')
True
>>> morning.matched
'R2D2 likes to drink juice.'
>>> morning.who.matched
'R2D2'

Performance
===========
The library is optimized for fast matching. Nevertheless, it is important to remember
that this is a Python wrapper around the regular expression library and as such can never
outperform matching directly with that library. Currently ruler measures approximately ten times
slower than ``re``.

Development
===========

* To run the tests::

    pytest tests

* To compare the performance to the ``re`` library::

    python performance/re_compare.py

* To run performance profiling of a specific method, ``Rule.match`` for example::

    python performance/profile.py Rule.match

  More than one method can be specified in the same command.

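
For a rough feel of what a comparison against ``re`` measures, the sketch below times a bare compiled
pattern against a thin pure-Python layer on top of it. It is not the content of
``performance/re_compare.py``; ``ThinWrapper`` and ``compare`` are hypothetical names introduced here
for illustration, and the wrapper does far less work than ruler does.

```python
import re
import timeit

PATTERN = re.compile(r"(\w+) likes to drink (juice|tea(?: with milk)?)\.")
TEXT = "John likes to drink tea with milk."


class ThinWrapper:
    """Hypothetical stand-in for a pure-Python layer on top of ``re``."""

    def __init__(self, pattern):
        self._pattern = pattern
        self.matched = None

    def match(self, text):
        # Delegate to re, then do a little bookkeeping in Python.
        m = self._pattern.match(text)
        self.matched = m.group(0) if m else None
        return m is not None


def compare(number=100_000):
    """Return (seconds for bare re, seconds for the wrapper) over `number` runs."""
    wrapper = ThinWrapper(PATTERN)
    bare = timeit.timeit(lambda: PATTERN.match(TEXT), number=number)
    wrapped = timeit.timeit(lambda: wrapper.match(TEXT), number=number)
    return bare, wrapped


if __name__ == "__main__":
    bare, wrapped = compare()
    print(f"re: {bare:.3f}s  wrapper: {wrapped:.3f}s  ratio: {wrapped / bare:.1f}x")
```

The ratio printed depends on the machine and the Python version, so it is only meaningful when both
timings come from the same run.
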
Tox
---
Tox takes care of everything without installing anything manually. There are two groups of tox
environments: ``py*-test`` and ``py*-profile``. The test environments run the unit tests while the
profile environments run the performance profiling scripts. If tox is not enough then a development
environment can be generated by creating a new virtualenv and then running
``pip install -r requirements_develop.txt``.

Dependency management
---------------------
For development needs, there are three requirements files in the project's root directory:

- ``requirements_test.txt`` contains all the dependencies needed to run the unit tests,
- ``requirements_profile.txt`` contains all the dependencies needed to run the performance profiling,
- ``requirements_develop.txt`` contains the testing dependencies, the profiling dependencies and some additional
  dependencies used in development.

The requirements files mentioned above are not intended for manual editing. Instead they are managed
using `pip-tools`_. The process of updating the requirements is as follows:

#. Add, remove or update a dependency in one of the ``reqs_*.dep`` files:

   - Update ``reqs_install.dep`` if the dependency is needed for the regular installation by the end user,
   - Update ``reqs_test.dep`` if the dependency is needed to run the unit tests but is not necessary for the
     regular installation,
   - Update ``reqs_profile.dep`` if the dependency is needed to run the performance profiling but is not necessary
     for the regular installation,
   - Update ``reqs_develop.dep`` if the dependency is not in one of the previous categories.

#. Generate the requirements file by running ``pip-compile``. The exact command is documented at the beginning of each
   requirements file.
#. Consider running ``pip-sync requirements_develop.txt``.

Notice that there is no need to edit ``setup.py``: it pulls the dependencies by itself from ``reqs_install.dep``.

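
One common way for a ``setup.py`` to pull its dependencies from a file is sketched below. This is an
assumption about the general technique, not a copy of this project's ``setup.py``: the helper name
``read_requirements`` is hypothetical, and the sketch assumes the ``.dep`` file lists one requirement
per line with optional ``#`` comments.

```python
from pathlib import Path


def read_requirements(path):
    """Return requirement specifiers from a file, skipping blanks and comments."""
    requirements = []
    for line in Path(path).read_text().splitlines():
        # Drop full-line and trailing comments, then surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if line:
            requirements.append(line)
    return requirements


# In a setup.py the result could be handed to setuptools, for example:
#     setup(name="ruler", install_requires=read_requirements("reqs_install.dep"))
```

Splitting on ``#`` is deliberately simple and would mangle URL requirements that carry a fragment,
so a real file with such entries would need a more careful parser.
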
.. _EBNF: https://en.wikipedia.org/wiki/Extended_Backus%E2%80%93Naur_form
.. _pip-tools: https://github.com/jazzband/pip-tools