https://github.com/loloicci/nimly
Lexer Generator and Parser Generator as a Library in Nim.
bnf compile-time ebnf lexer-generator lexer-parser macro macros nim parser-generator
- Host: GitHub
- URL: https://github.com/loloicci/nimly
- Owner: loloicci
- License: mit
- Created: 2017-04-24T15:45:37.000Z (about 8 years ago)
- Default Branch: master
- Last Pushed: 2022-06-10T09:42:54.000Z (almost 3 years ago)
- Last Synced: 2025-03-23T18:37:30.832Z (about 1 month ago)
- Topics: bnf, compile-time, ebnf, lexer-generator, lexer-parser, macro, macros, nim, parser-generator
- Language: Nim
- Homepage:
- Size: 210 KB
- Stars: 147
- Watchers: 6
- Forks: 4
- Open Issues: 17
Metadata Files:
- Readme: README.rst
- Changelog: changelog.rst
- License: LICENSE
README
#######
nimly
#######
|github_workflow| |nimble|

Lexer Generator and Parser Generator as a Macro Library in Nim.
With nimly, you can make a lexer/parser by writing its definition
in a lex/yacc-like format.
``nimly`` generates the lexer and parser using macros at compile time,
so you can use ``nimly`` as a library rather than as an external tool for your program.

niml
====
``niml`` is a macro to generate a lexer.

macro niml
----------
The macro ``niml`` makes a lexer.
Almost all of the work of constructing a lexer is done at compile time.
An example follows.

.. code-block:: nim

  ## This makes a LexData object named myLexer.
  ## This lexer returns a value of type ``Token`` when a token is found.
  niml myLexer[Token]:
    r"if":
      ## this part is converted into the proc body.
      ## the arg is (token: LToken).
      return TokenIf()
    r"else":
      return TokenElse()
    r"true":
      return TokenTrue()
    r"false":
      return TokenFalse()
    ## you can use ``..`` instead of ``-`` in ``[]``.
    r"[a..zA..Z\-_][a..zA..Z0..9\-_]*":
      return TokenIdentifier(token)
    ## you can define ``setUp`` and ``tearDown`` functions.
    ## ``setUp`` is called from ``open``, ``newWithString`` and
    ## ``initWithString``.
    ## ``tearDown`` is called from ``close``.
    ## an example is ``test/lexer_global_var.nim``.
    setUp:
      doSomething()
    tearDown:
      doSomething()

Meta characters are as follows:
- ``\``: escape character
- ``.``: match any character
- ``[``: start of a character class
- ``|``: means or
- ``(``: start of a subpattern
- ``)``: end of a subpattern
- ``?``: 0 or 1 times quantifier
- ``*``: 0 or more times quantifier
- ``+``: 1 or more times quantifier
- ``{``: ``{n,m}`` is the n or more and m or less times quantifier

In ``[]``, meta characters are as follows:
- ``\``: escape character
- ``^``: negate characters (only in the first position)
- ``]``: end of this class
- ``-``: specifies a character range (``..`` can be used instead)

Each of the following is recognized as a character set:
- ``\d``: ``[0..9]``
- ``\D``: ``[^0..9]``
- ``\s``: ``[ \t\n\r\f\v]``
- ``\S``: ``[^ \t\n\r\f\v]``
- ``\w``: ``[a..zA..Z0..9_]``
- ``\W``: ``[^a..zA..Z0..9_]``

nimy
====
``nimy`` is a macro to generate a LALR(1) parser.

macro nimy
----------
The macro ``nimy`` makes a parser.
Almost all of the work of constructing a parser is done at compile time.
An example follows.

.. code-block:: nim

  ## This makes a parser named myParser.
  ## The first clause is the top-level of the BNF.
  ## This parser receives tokens of type ``Token``, and the token must have a
  ## field ``kind`` of the enum type ``[TokenTypeName]Kind``.
  ## This is naturally satisfied when you use ``patty`` to define the token.
  nimy myParser[Token]:
    ## the starting non-terminal
    ## the return type of the parser is ``Expr``
    top[Expr]:
      ## a pattern.
      expr:
        ## proc body that is used when the pattern (a single ``expr``) is parsed.
        ## ``$1`` means the first symbol of the pattern (``expr``)
        return $1

    ## non-terminal named ``expr``
    ## with returning type ``Expr``
    expr[Expr]:
      ## first pattern of expr.
      ## ``LPAR`` and ``RPAR`` are TokenKinds.
      LPAR expr RPAR:
        return $2

      ## second pattern of expr.
      ## ``PLUS`` is a TokenKind.
      expr PLUS expr:
        return $2

You can use the following EBNF functions:
- ``XXX[]``: Option (0 or 1 ``XXX``).
  The type is ``seq[xxx]`` where ``xxx`` is the type of ``XXX``.
- ``XXX{}``: Repeat (0 or more ``XXX``).
  The type is ``seq[xxx]`` where ``xxx`` is the type of ``XXX``.

An example of these is in the next section.
Example
=======
``tests/test_readme_example.nim`` is an easy example.

.. code-block:: nim

  import unittest
  import patty
  import strutils

  import nimly

  ## variant is defined in patty
  variant MyToken:
    PLUS
    MULTI
    NUM(val: int)
    DOT
    LPAREN
    RPAREN
    IGNORE

  niml testLex[MyToken]:
    r"\(":
      return LPAREN()
    r"\)":
      return RPAREN()
    r"\+":
      return PLUS()
    r"\*":
      return MULTI()
    r"\d":
      return NUM(parseInt(token.token))
    r"\.":
      return DOT()
    r"\s":
      return IGNORE()

  nimy testPar[MyToken]:
    top[string]:
      plus:
        return $1

    plus[string]:
      mult PLUS plus:
        return $1 & " + " & $3

      mult:
        return $1

    mult[string]:
      num MULTI mult:
        return "[" & $1 & " * " & $3 & "]"

      num:
        return $1

    num[string]:
      LPAREN plus RPAREN:
        return "(" & $2 & ")"

      ## float (integer part is 0-9) or integer
      NUM DOT[] NUM{}:
        result = ""
        # type of `($1).val` is `int`
        result &= $(($1).val)
        if ($2).len > 0:
          result &= "."
        # type of `$3` is `seq[MyToken]` and each element is NUM
        for tkn in $3:
          # type of `tkn.val` is `int`
          result &= $(tkn.val)

  test "test Lexer":
    var testLexer = testLex.newWithString("1 + 42 * 101010")
    testLexer.ignoreIf = proc(r: MyToken): bool = r.kind == MyTokenKind.IGNORE

    var
      ret: seq[MyTokenKind] = @[]

    for token in testLexer.lexIter:
      ret.add(token.kind)

    check ret == @[MyTokenKind.NUM, MyTokenKind.PLUS, MyTokenKind.NUM,
                   MyTokenKind.NUM, MyTokenKind.MULTI,
                   MyTokenKind.NUM, MyTokenKind.NUM, MyTokenKind.NUM,
                   MyTokenKind.NUM, MyTokenKind.NUM, MyTokenKind.NUM]

  test "test Parser 1":
    var testLexer = testLex.newWithString("1 + 42 * 101010")
    testLexer.ignoreIf = proc(r: MyToken): bool = r.kind == MyTokenKind.IGNORE

    var parser = testPar.newParser()
    check parser.parse(testLexer) == "1 + [42 * 101010]"

    testLexer.initWithString("1 + 42 * 1010")
    parser.init()
    check parser.parse(testLexer) == "1 + [42 * 1010]"

  test "test Parser 2":
    var testLexer = testLex.newWithString("1 + 42 * 1.01010")
    testLexer.ignoreIf = proc(r: MyToken): bool = r.kind == MyTokenKind.IGNORE

    var parser = testPar.newParser()
    check parser.parse(testLexer) == "1 + [42 * 1.01010]"

    testLexer.initWithString("1. + 4.2 * 101010")
    parser.init()
    check parser.parse(testLexer) == "1. + [4.2 * 101010]"

  test "test Parser 3":
    var testLexer = testLex.newWithString("(1 + 42) * 1.01010")
    testLexer.ignoreIf = proc(r: MyToken): bool = r.kind == MyTokenKind.IGNORE

    var parser = testPar.newParser()
    check parser.parse(testLexer) == "[(1 + 42) * 1.01010]"

Install
=======
1. ``nimble install nimly``

Now you can use nimly with ``import nimly``.
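As a quick start, the following is a minimal sketch of a program using nimly, built only from the ``niml``/``nimy`` APIs shown above. The names ``CalcToken``, ``calcLex``, and ``calcPar`` are hypothetical names for this illustration, and ``patty`` must be installed separately (``nimble install patty``).

.. code-block:: nim

  import patty, strutils
  import nimly

  ## token type defined with patty's variant
  variant CalcToken:
    NUM(val: int)
    IGNORE

  ## lexer: integers and whitespace
  niml calcLex[CalcToken]:
    r"\d+":
      return NUM(parseInt(token.token))
    r"\s":
      return IGNORE()

  ## parser: the whole input is a single number
  nimy calcPar[CalcToken]:
    top[int]:
      NUM:
        return ($1).val

  var lexer = calcLex.newWithString("42")
  lexer.ignoreIf = proc(r: CalcToken): bool = r.kind == CalcTokenKind.IGNORE
  var parser = calcPar.newParser()
  echo parser.parse(lexer)

Save it as, say, ``calc.nim`` and run it with ``nim c -r calc.nim``.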
vmdef.MaxLoopIterations Problem
-------------------------------
While compiling a lexer/parser, you may encounter errors such as
``interpretation requires too many iterations``.
You can avoid this error by using the compiler option ``--maxLoopIterationsVM:N``,
which is available since Nim v1.0.6.

See https://github.com/loloicci/nimly/issues/11 for details.
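For example, raising the VM iteration limit when compiling the test from this README might look like this (``100000000`` is only an illustrative value for ``N``; pick one large enough for your grammar):

.. code-block:: console

  nim c --maxLoopIterationsVM:100000000 -r tests/test_readme_example.nim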
Contribute
==========
1. Fork this repository
2. Create a new branch
3. Commit your changes
4. Push them to the branch
5. Create a new pull request

Changelog
=========
See changelog.rst_.

Developing
==========
You can use ``nimldebug`` and ``nimydebug`` as conditional symbols
to print debug info.

Example: ``nim c -d:nimldebug -d:nimydebug -r tests/test_readme_example.nim``
.. |github_workflow| image:: https://github.com/loloicci/nimly/workflows/test/badge.svg
:target: https://github.com/loloicci/nimly/actions?query=workflow%3Atest
.. |nimble| image:: https://raw.githubusercontent.com/yglukhov/nimble-tag/master/nimble.png
:target: https://github.com/yglukhov/nimble-tag
.. _changelog.rst: ./changelog.rst