https://github.com/fy0/tinyre
A light fork of python's regex engine (but slow, ~3k lines).
- Host: GitHub
- URL: https://github.com/fy0/tinyre
- Owner: fy0
- License: zlib
- Created: 2012-04-10T11:47:52.000Z (over 12 years ago)
- Default Branch: master
- Last Pushed: 2017-01-05T04:07:23.000Z (almost 8 years ago)
- Last Synced: 2024-07-16T12:00:44.574Z (4 months ago)
- Language: C
- Homepage:
- Size: 159 KB
- Stars: 78
- Watchers: 4
- Forks: 9
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# tinyre ver 0.9.2
[![Travis](https://travis-ci.org/fy0/tinyre.svg?branch=master)](https://travis-ci.org/fy0/tinyre)
[![Code Climate](https://codeclimate.com/github/fy0/tinyre/badges/gpa.svg)](https://codeclimate.com/github/fy0/tinyre)

A tiny regex engine.

Planned to be compatible with "Secret Labs' Regular Expression Engine" (SRE, as used by Python).

**Warning: the project already works fine, but it is slow.**
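
The configurable backtracking limit listed under Features can be pictured with a toy matcher. The sketch below is Python, not tinyre's C code: `fullmatch_with_limit` and its `limit` parameter are hypothetical names chosen for illustration, and the matcher only understands literal characters and greedy `x?`.

```python
def fullmatch_with_limit(pattern, text, limit):
    """Toy backtracking matcher (hypothetical, not tinyre's API).

    Supports literal characters and greedy `x?` only. Returns True when the
    whole text matches, and False when it does not match or when more than
    `limit` backtracking steps would be needed.
    """
    # Parse the pattern into (char, is_optional) tokens.
    tokens = []
    i = 0
    while i < len(pattern):
        optional = i + 1 < len(pattern) and pattern[i + 1] == "?"
        tokens.append((pattern[i], optional))
        i += 2 if optional else 1

    class LimitExceeded(Exception):
        pass

    backtracks = 0

    def go(ti, si):
        nonlocal backtracks
        if ti == len(tokens):
            return si == len(text)
        ch, optional = tokens[ti]
        if si < len(text) and text[si] == ch:
            # Greedy: try to consume the character first.
            if go(ti + 1, si + 1):
                return True
            if not optional:
                return False
            # Consuming failed, so undo it and try skipping the optional item.
            backtracks += 1
            if backtracks > limit:
                raise LimitExceeded
            return go(ti + 1, si)
        # The character does not match: only an optional item can be skipped.
        return optional and go(ti + 1, si)

    try:
        return go(0, 0)
    except LimitExceeded:
        return False


n = 25
evil = "a?" * n + "a" * n
# Without a cap this search would take on the order of 2**n steps;
# with a cap of 10,000 backtracks it gives up quickly and reports a failure.
capped = fullmatch_with_limit(evil, "a" * n, limit=10_000)
```

The trade-off is that a capped match can report failure for an input that would eventually match, so the limit has to be chosen with the expected patterns in mind.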
**Features**:
* **utf-8 support**
Cheers for unicode!
* **no octal number**
\\1 means group 1, \\1-100 means group n, \\01 matches \\1, \\07 matches \\7, \\08 matches ['\\0', '8'], \\377 matches 0o377, but \\400 matches neither 0o400 nor [chr(0o40), '\\0']!
What the hell ... I give up! Go away, octal numbers!
* **custom maximum number of backtracking**
An evil regex: **'a?'\*n+'a'\*n** against **'a'\*n**
For example: **'a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?aaaaaaaaaaaaaaaaaaaaaaaaa'** matches **'aaaaaaaaaaaaaaaaaaaaaaaaa'**
This takes a long time because of the huge number of backtracking steps. Perl/Python/PCRE require over **10^15 years** to match a 29-character string.
You can set a limit on the number of backtracking steps to avoid this situation; once the limit is reached, the match fails.
* **more than 100 groups ...**
but who cares?

**Supported**:
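
tinyre's own C API is not shown in this section, so the sketch below uses Python's standard `re` module (the SRE engine that tinyre plans to be compatible with) purely as a reference for how the constructs listed here are expected to behave. Whether tinyre agrees with `re` in every case is an assumption to verify, not something this README confirms.

```python
import re

# Greedy vs. non-greedy repetition.
greedy = re.match(r"a.*b", "aXbXb").group(0)
lazy = re.match(r"a.*?b", "aXbXb").group(0)

# Bounded repetition {m,n} and its non-greedy form.
bounded = re.match(r"a{2,3}", "aaaa").group(0)
bounded_lazy = re.match(r"a{2,3}?", "aaaa").group(0)

# Numbered and named backreferences.
doubled = re.search(r"(\w)\1", "hello").group(0)
named = re.search(r"(?P<ch>\w)(?P=ch)", "hello").group("ch")

# Lookahead and fixed-length lookbehind test context without consuming it.
ahead = re.search(r"foo(?=bar)", "foobar").group(0)
behind = re.search(r"(?<=foo)bar", "foobar").group(0)

# "$" also matches just before a newline at the end of the string.
before_newline = re.search(r"b$", "ab\n") is not None

# Inline flag: (?i) turns on case-insensitive matching.
case_insensitive = re.match(r"(?i)abc", "ABC") is not None
```

Running the same patterns and inputs through tinyre and comparing the results is one way to check compatibility construct by construct.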
* "." Matches any character except a newline.
* "^" Matches the start of the string.
* "$" Matches the end of the string or just before the newline at the end of the string.
* "*" Matches 0 or more (greedy) repetitions of the preceding RE. Greedy means that it will match as many repetitions as possible.
* "+" Matches 1 or more (greedy) repetitions of the preceding RE.
* "?" Matches 0 or 1 (greedy) of the preceding RE.
* *?,+?,?? Non-greedy versions of the previous three special characters.
* {m} Matches m copies of the previous RE.
* {m,n} Matches from m to n repetitions of the preceding RE.
* {m,n}? Non-greedy version of the above.
* "\\" Either escapes special characters or signals a special sequence.
* "\\1-N" Matches the text matched earlier by the group index.
* [] Indicates a set of characters.
* [^] A "^" as the first character indicates a complementing set.
* "|" A|B, creates an RE that will match either A or B.
* (...) Matches the RE inside the parentheses. The contents can be retrieved or matched later in the string.
* (?ims) Set the I, M or S flag for the RE (see below).
* (?:...) Non-grouping version of regular parentheses.
* (?P\<name\>...) The substring matched by the group is accessible by name.
* (?P=name) Matches the text matched earlier by the group named name.
* (?#...) A comment; ignored.
* (?=...) Matches if ... matches next, but doesn't consume the string.
* (?!...) Matches if ... doesn't match next.
* (?<=...) Matches if preceded by ... (must be fixed length).
* (?