https://github.com/jbee/lex
Linear expressions - simple and fast pattern matching
https://github.com/jbee/lex
interpreter java language pattern-matching
Last synced: 2 months ago
JSON representation
Linear expressions - simple and fast pattern matching
- Host: GitHub
- URL: https://github.com/jbee/lex
- Owner: jbee
- Created: 2017-11-14T14:22:42.000Z (over 8 years ago)
- Default Branch: master
- Last Pushed: 2018-07-04T06:57:34.000Z (almost 8 years ago)
- Last Synced: 2025-01-01T20:26:36.241Z (over 1 year ago)
- Topics: interpreter, java, language, pattern-matching
- Language: Java
- Homepage: http://jbee.github.io/lex/
- Size: 44.9 KB
- Stars: 3
- Watchers: 3
- Forks: 0
- Open Issues: 1
-
Metadata Files:
- Readme: README
Awesome Lists containing this project
README
▓ ▓▓▓ ▓ ▓
▓ ▓ ▓ ▓
▓ ▓▓▓ ▓
▓ ▓ ▓ ▓
▓▓▓ ▓▓▓ ▓ ▓ Linear expressions
First match pattern matching with a tiny bytecode instructed matching machine.
Spec http://jbee.github.io/lex/
Feedback http://github.com/jbee/lex/issues
Copyright (c) 2017 Jan Bernitt
________________________________________________________________________________
IMPLEMENTAIONS
Java: basic ~100 LOC, optimized ~150 LOC
________________________________________________________________________________
SETS {...}
{abc} a set of bytes "a","b" and "c"
{^abc} a set of any byte but "a","b" and "c"
{a-c} a set of "a", "b" and "c" given as a range
{?} a set of *all* non ASCII bytes
SPECIAL SETS
# = {0-9} any ASCII digit
@ = {a-zA-Z} any ASCII letter
$ any ASCII new line (\n or \r)
_ any ASCII whitespace character
^ any byte that is not an ASCII whitespace character
? any single byte
REPETITION x+
+ try previous set, group or literal again
GROUPS (...), [...], `...`
(abc) a group with sequence "abc" that *must* occur
[abc] a group with sequence "abc" that *can* occur
` exit group, unless first in group (used to embed)
SCANNING ~x
~ skip until following set, group or literal matches
ESCAPING
\ escape following byte to a literal (also in set)
Sets can also be used to match most of the instruction symbols literally.
For example {~} is similar to \~ or {\~}.
Any other byte (not {}()[]#@^_$+~?`\) is matched literally.
Escaping can be applied to any byte even if it is not needed.
________________________________________________________________________________
EXAMPLES
####/##/## a date of format yyyy/mm/dd
##:##[:##] a time of format hh:mm or hh:mm:ss
#+[.#+] a simple floating point number with optional decimals
"{^"}+" a quoted string using a set to find the end
"~" a quoted string using a scan to find the end
"~({^\\}") a quoted string using a scan with escaping support
\$@}a-zA-Z0-9_{+ a php style identifier
[\+]#+[{ -}#+]+ international phone numbers
~(Foo) searching for "Foo" (e.g. in a file)
~() searching for "" to ""
~() searching for "
" to ""
~(Foo~(Bar)) searching for "Foo"s followed by "Bar"s