https://github.com/alexdremov/sxtree
Generate AST syntax parser from grammar file
grammar grammar-parser grammar-parser-generator lexer lexer-framework lexer-parser parser parser-framework
- Host: GitHub
- URL: https://github.com/alexdremov/sxtree
- Owner: alexdremov
- License: mit
- Created: 2021-05-09T17:25:08.000Z (almost 5 years ago)
- Default Branch: main
- Last Pushed: 2021-05-20T21:17:11.000Z (over 4 years ago)
- Last Synced: 2025-01-17T08:29:40.585Z (about 1 year ago)
- Topics: grammar, grammar-parser, grammar-parser-generator, lexer, lexer-framework, lexer-parser, parser, parser-framework
- Language: C++
- Homepage: https://AlexRoar.github.io/SxTree/
- Size: 3.77 MB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
# SxTree
Generates an AST parser and a lexer from grammar files. Implements:
- Lexer generator
- Parser generator
## Lexer generator
Generates a lexer based on a syntax file. Example:
```yaml
space = (skip("\s"))
lineBreak = ["[\n]+"]
integer = ("^[-+]?[0-9]\d*")
float = ("^[+-]?([0-9]*[.])?[0-9]+")
def = ("def")
```
Lexer syntax grammar:
```bash
G := [Rule]*
ID := regex(^[a-zA-Z_$][a-zA-Z_$0-9]*)
Rule := ID '=' Exp '\n'
Exp := '('P [, P]*')' | '['P [, P]*']' | '?['P [, P]*']'
P(part) := skip+Exp | regex | Exp
```
The following rule forms are used:
- `'('P [, P]*')'` – a sequence of parts; all parts must match, in order.
- `'['P [, P]*']'` – any one of the listed parts.
- `'?['P [, P]*']'` – any one of the listed parts, or none.
- `skip+Exp` – match the expression and skip it.
Rules are tried from top to bottom, so rules listed earlier have higher priority.
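The rule forms above can be illustrated with a small matcher. This is only a sketch under stated assumptions, not SxTree's implementation: the `Node` type and the `rx`, `sequence`, `anyOf`, `optionalOf`, and `matchAt` helpers are hypothetical names, alternatives are assumed to be tried in listed order with the first match winning, and `skip` is left out because it only affects whether the matched text is kept.

```cpp
#include <cstddef>
#include <optional>
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model of a rule expression; not SxTree's real types.
struct Node {
    enum Kind { Regex, Sequence, Any, Optional } kind;
    std::string pattern;         // used only when kind == Regex
    std::vector<Node> children;  // used by Sequence / Any / Optional
};

Node rx(std::string p) { return Node{Node::Regex, std::move(p), {}}; }
Node sequence(std::vector<Node> c) { return Node{Node::Sequence, "", std::move(c)}; }
Node anyOf(std::vector<Node> c) { return Node{Node::Any, "", std::move(c)}; }
Node optionalOf(std::vector<Node> c) { return Node{Node::Optional, "", std::move(c)}; }

// Returns how many characters `n` matches starting at `pos`,
// or std::nullopt if it does not match there.
std::optional<std::size_t> matchAt(const Node& n, const std::string& s, std::size_t pos) {
    switch (n.kind) {
    case Node::Regex: {
        // Compiled on every call for brevity; a real lexer would cache this.
        std::regex re(n.pattern);
        std::smatch m;
        auto first = s.cbegin() + static_cast<std::ptrdiff_t>(pos);
        if (std::regex_search(first, s.cend(), m, re,
                              std::regex_constants::match_continuous))
            return static_cast<std::size_t>(m.length(0));
        return std::nullopt;
    }
    case Node::Sequence: {
        // '(' P, P ')' : every part must match, one after another.
        std::size_t cur = pos;
        for (const Node& c : n.children) {
            auto len = matchAt(c, s, cur);
            if (!len) return std::nullopt;
            cur += *len;
        }
        return cur - pos;
    }
    case Node::Any:
    case Node::Optional: {
        // '[' P, P ']' : first listed part that matches wins (assumption).
        for (const Node& c : n.children)
            if (auto len = matchAt(c, s, pos)) return len;
        // '?[' ... ']' : matching nothing is also acceptable.
        if (n.kind == Node::Optional) return std::size_t{0};
        return std::nullopt;
    }
    }
    return std::nullopt;
}
```

For instance, a rule written as `("o", ?["k"])` would be modelled as `sequence({rx("o"), optionalOf({rx("k")})})` and matches both `ok` and `o`. Note that this sketch does not backtrack: once an alternative inside a sequence has matched, later failures do not cause other alternatives to be retried.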
### Lexer generator usage
The generator takes a lexer syntax file and produces a source file that the SxTree parser later uses.
```bash
> sxlgen -h
Generate lexer file from lexer syntax
Usage:
SxTree Lexer Generator [OPTION...]
-i, --input arg Input file [required]
-o, --output arg Output file (default: lexer.cpp)
-p, --outputHeader arg Output header file (default: lexer.h)
-q, --quiet Quiet mode (do not show errors)
-h, --help Print usage
```
### Example:
```bash
space = (skip("\s"))
var = ("var")
funcDecl = ["def", "define"]
ok = ("o", ?["k"])
```
Generated lexer structure:
```cpp
enum LexemeType {
    lex_NONE = 0,
    lex_space = 1,
    lex_var = 2,
    lex_funcDecl = 3,
    lex_ok = 4,
};
Lexer coreLexer({
{1, {{{ R"()", Value::VAL_SKIP,{{{ R"(\s)", Value::VAL_REGEXP,{{}, Expression::EXP_ONE}},}, Expression::EXP_ONE}},}, Expression::EXP_ONE}},
{2, {{{ R"(var)", Value::VAL_REGEXP,{{}, Expression::EXP_ONE}},}, Expression::EXP_ONE}},
{3, {{{ R"(def)", Value::VAL_REGEXP,{{}, Expression::EXP_ONE}},{ R"(define)", Value::VAL_REGEXP,{{}, Expression::EXP_ONE}},}, Expression::EXP_ANY}},
{4, {{{ R"(o)", Value::VAL_REGEXP,{{}, Expression::EXP_ONE}},{ R"()", Value::VAL_EXPRESSION,{{{ R"(k)", Value::VAL_REGEXP,{{}, Expression::EXP_ONE}},}, Expression::EXP_OPTIONAL}},}, Expression::EXP_ONE}},
});
```
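To show how a rule table of this kind might be consumed, here is a minimal first-match-wins tokenizer. It is a sketch, not the `Lexer` class that SxTree generates: `FlatRule`, `Lexeme`, `tokenize`, and `exampleRules` are hypothetical names, and each rule from the example is flattened by hand into a single regular expression for brevity. Only the `LexemeType` values are taken from the generated code.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Same values as the generated enum.
enum LexemeType { lex_NONE = 0, lex_space = 1, lex_var = 2, lex_funcDecl = 3, lex_ok = 4 };

// Hypothetical flattened rule: one regex per lexeme type.
struct FlatRule {
    LexemeType type;
    std::regex pattern;
    bool skip;  // true for rules wrapped in skip(...)
};

struct Lexeme {
    LexemeType type;
    std::string text;
};

// Tries the rules in order at each position; the first rule that matches wins.
std::vector<Lexeme> tokenize(const std::string& input, const std::vector<FlatRule>& rules) {
    std::vector<Lexeme> out;
    auto it = input.cbegin();
    while (it != input.cend()) {
        bool matched = false;
        for (const FlatRule& r : rules) {
            std::smatch m;
            if (std::regex_search(it, input.cend(), m, r.pattern,
                                  std::regex_constants::match_continuous) &&
                m.length(0) > 0) {
                if (!r.skip) out.push_back({r.type, m.str(0)});  // skipped text is dropped
                it += m.length(0);
                matched = true;
                break;
            }
        }
        if (!matched) {
            // No rule applies: emit the character as lex_NONE and move on.
            out.push_back({lex_NONE, std::string(1, *it)});
            ++it;
        }
    }
    return out;
}

// The four example rules, flattened by hand (an assumption of this sketch).
std::vector<FlatRule> exampleRules() {
    return {
        {lex_space, std::regex(R"(\s)"), true},
        {lex_var, std::regex("var"), false},
        {lex_funcDecl, std::regex("def|define"), false},
        {lex_ok, std::regex("ok?"), false},
    };
}
```

Calling `tokenize("var def ok", exampleRules())` yields three lexemes of types `lex_var`, `lex_funcDecl`, and `lex_ok`; the whitespace is consumed by the `space` rule and not emitted.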
Here's an example that can be found in `/examples/simple`: the input is shown first, followed by the lexemes produced for it. The lexer is easy to adjust to your needs.
```python
def someFunction() {
"Hello world"
}
```
```bash
Lexeme('def')
Lexeme('someFunction')
Lexeme('(')
Lexeme(')')
Lexeme('{')
Lexeme('\n')
Lexeme(' ')
Lexeme('"Hello world"')
Lexeme('\n')
Lexeme('}')
```
## Parser
In progress