https://github.com/almondtools/regexparser
A Parser for regular expressions
- Host: GitHub
- URL: https://github.com/almondtools/regexparser
- Owner: almondtools
- License: lgpl-3.0
- Created: 2017-01-22T11:23:10.000Z (over 9 years ago)
- Default Branch: master
- Last Pushed: 2024-04-20T09:15:28.000Z (about 2 years ago)
- Last Synced: 2025-07-26T07:04:52.254Z (11 months ago)
- Topics: java, regular-expression
- Language: Java
- Size: 66.4 KB
- Stars: 5
- Watchers: 2
- Forks: 1
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
RegexParser
===========
RegexParser is a handwritten parser for (deterministic) regular expressions. Deterministic means that the regular expression language can be compiled to a deterministic finite automaton. Note that the default Java regular expressions are more powerful, but suffer from unpredictable runtime.
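To illustrate what "handwritten parser" means here, the following is a minimal recursive-descent sketch for a small subset of the syntax (literals, sequences, `|`, grouping, `?`, `*`, `+`). It is not RegexParser's code or API: the class `TinyRegexParser`, its `parse` method, and the s-expression output are all hypothetical and exist only for this illustration.

```java
// Hypothetical illustration only; not part of RegexParser.
final class TinyRegexParser {

    private final String pattern;
    private int pos = 0;

    private TinyRegexParser(String pattern) {
        this.pattern = pattern;
    }

    // Parses the pattern and returns the syntax tree as an s-expression string.
    static String parse(String pattern) {
        TinyRegexParser parser = new TinyRegexParser(pattern);
        String tree = parser.alternatives();
        if (parser.pos < pattern.length()) {
            throw new IllegalArgumentException("unexpected '" + pattern.charAt(parser.pos) + "' at " + parser.pos);
        }
        return tree;
    }

    // alternatives := sequence ('|' sequence)*
    private String alternatives() {
        String left = sequence();
        while (peek() == '|') {
            pos++;
            left = "(alt " + left + " " + sequence() + ")";
        }
        return left;
    }

    // sequence := repetition*   (stops at '|', ')' or end of input)
    private String sequence() {
        String left = null;
        while (pos < pattern.length() && peek() != '|' && peek() != ')') {
            String next = repetition();
            left = (left == null) ? next : "(seq " + left + " " + next + ")";
        }
        return (left == null) ? "empty" : left;
    }

    // repetition := atom ('?' | '*' | '+')*
    private String repetition() {
        String node = atom();
        while (true) {
            switch (peek()) {
                case '?': node = "(opt " + node + ")"; break;
                case '*': node = "(star " + node + ")"; break;
                case '+': node = "(plus " + node + ")"; break;
                default: return node;
            }
            pos++; // consume the repetition operator
        }
    }

    // atom := '(' alternatives ')' | any single character that is not an operator
    private String atom() {
        char c = pattern.charAt(pos++);
        if (c == '(') {
            String inner = alternatives();
            if (peek() != ')') {
                throw new IllegalArgumentException("missing ')' at " + pos);
            }
            pos++;
            return inner;
        }
        if (c == '?' || c == '*' || c == '+') {
            throw new IllegalArgumentException("nothing to repeat at " + (pos - 1));
        }
        return String.valueOf(c);
    }

    // Returns the current character, or '\0' at end of input.
    private char peek() {
        return pos < pattern.length() ? pattern.charAt(pos) : '\0';
    }

    public static void main(String[] args) {
        System.out.println(parse("ab|c*"));   // prints (alt (seq a b) (star c))
        System.out.println(parse("(a|b)+c")); // prints (seq (plus (alt a b)) c)
    }
}
```

Each grammar rule maps to one method, which is what makes a parser of this kind easy to write and debug by hand. A tree like this contains only sequence, alternative and repetition nodes, and those are the constructs that translate directly into a finite automaton.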
The following features are not supported:
* backreferences
* lookaheads, lookbehinds
* variations of the Kleene star (greedy, reluctant, possessive)
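As a short illustration of why a feature such as backreferences falls outside the deterministic subset, the sketch below uses the JDK's `java.util.regex` (not RegexParser) to match a pattern with a backreference. The class name `BackreferenceDemo` is made up for this example. The pattern `(a+)b\1` only accepts inputs where the run of `a`s after the `b` has the same length as the run before it, and a finite automaton cannot count like that.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; uses java.util.regex, not RegexParser.
final class BackreferenceDemo {

    // \1 must repeat exactly what group 1 captured, so the whole
    // pattern accepts a^n b a^n for n >= 1.
    static final Pattern SAME_RUN_TWICE = Pattern.compile("(a+)b\\1");

    static boolean matches(String input) {
        return SAME_RUN_TWICE.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches("aabaa")); // prints true: both runs have length 2
        System.out.println(matches("aaba"));  // prints false: the runs differ in length
    }
}
```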
Syntax
======
The syntax of the recognized regular expressions is summarized in the following table:
| Syntax | Matches |
| ----------------------- |----------------------------------------------------------------------|
| Single Characters | |
| x | The character x, unless there exist special rules for this character |
| \x | The character x, if there exist special rules for this character |
| . | any character (newlines only in DOTALL-mode) |
| \\ | backslash character |
| \n | newline character |
| \t | tab character |
| \r | carriage return character |
| \f | form feed character |
| \a | alert/bell character |
| \e | escape character |
| *\uhhhh* | *unicode character, not yet supported* |
| Character classes | |
| [...] | any of the contained characters |
| [^...] | none of the contained characters |
| [a-z] | char range (all chars from a to z) |
| [a-zA-Z] | char range, union of multiple ranges |
| \s | white space |
| \S | non white space |
| \w | word characters |
| \W | non word characters |
| \d | digits |
| \D | non digits |
| *\p{name}* | *posix character class, not yet supported* |
| Sequences, alternatives | |
| xy | match x followed by y |
| x\|y | match x or y |
| (x) | match inner expression x (grouping is not supported) |
| Repetitions | |
| x? | match x or nothing |
| x* | match a sequence of x's or nothing |
| x+ | match a sequence of x's (minimum one) |
| x{2} | match a sequence of 2 x's |
| x{2,4} | match a sequence of 2 to 4 x's |
| x{,4} | match a sequence of up to 4 x's |
| x{2,} | match a sequence of at least 2 x's |
| | |
| *Advanced Groups* | *not supported* |
| *Lookaheads* | *not supported* |
| *Lookbehinds* | *not supported* |
| *References* | *not supported* |
| *Anchors* | *not supported* |
| *Flags* | *not supported* |
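A few rows of the table can be tried out with the JDK's own engine. The sketch below uses `java.util.regex` as a stand-in, on the assumption that the constructs shown mean the same there as in the table. It does not call RegexParser, and the class `SyntaxTableDemo` and its `fullMatch` helper are made up for this example. The `x{,4}` form is written as `x{0,4}` here, since the open lower bound is specific to the table above.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; uses java.util.regex, not RegexParser.
final class SyntaxTableDemo {

    // True if the whole input is matched by the regular expression.
    static boolean fullMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // character class followed by a repetition of word characters
        System.out.println(fullMatch("[a-zA-Z]\\w*", "x1_y"));  // prints true
        // alternative of two sequences
        System.out.println(fullMatch("ab|cd", "cd"));           // prints true
        // repetition of a grouped inner expression
        System.out.println(fullMatch("(ab)+", "ababab"));       // prints true
        // bounded repetition: five digits exceed the maximum of four
        System.out.println(fullMatch("\\d{2,4}", "12345"));     // prints false
        // '.' does not match a newline unless DOTALL is enabled
        System.out.println(fullMatch(".", "\n"));               // prints false
        System.out.println(Pattern.compile(".", Pattern.DOTALL).matcher("\n").matches()); // prints true
    }
}
```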
Maven Dependency
================
```xml
<dependency>
    <groupId>net.amygdalum</groupId>
    <artifactId>regexparser</artifactId>
    <version>0.1.0</version>
</dependency>
```