https://github.com/karbonitekream/syn

A syntax parser based on the LLLR method

Host: GitHub
URL: https://github.com/karbonitekream/syn
Owner: KarboniteKream
License: mit
Created: 2019-03-03T10:30:50.000Z (almost 7 years ago)
Default Branch: master
Last Pushed: 2022-06-07T14:45:55.000Z (over 3 years ago)
Last Synced: 2025-03-24T00:54:26.591Z (10 months ago)
Topics: grammar, ll-parser, lr-parser, paring, rust, syntax-analysis
Language: Rust
Homepage:
Size: 233 KB
Stars: 1
Watchers: 2
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

# syn

A syntax parser based on the [LLLR] method.

## Requirements

- Rust 1.56.0 or later

## Usage

```bash
syn  -g GRAMMAR [-p lllr] [-o OUTPUT]
```

The optional argument `-o` specifies the output file for a graph in the [DOT] language. This option is only available with the LR parser.

## Grammar

Grammar files are defined using the [TOML] format.

### Header

The header contains the following entries:

- `name`: Name of the grammar.
- `description`: An optional description of the grammar.
  Defaults to the canonical path to the grammar file.
- `start_symbol`: Start symbol of the grammar. Defaults to the first rule in `[rules]`.

Example:

```toml
name = "grammar"
description = "Example grammar for README"
start_symbol = "S"
```

### Rules

The production rules are described in the `[rules]` table. A production is either a single string or an array of strings, where each string is one possible rule for that grammar symbol. When the grammar file is parsed, a single string is converted to an array with one element. To represent an `ϵ` production, use an empty string. The symbols and rules can be in any order.

Example:

```toml
[rules]
# S → A B 'c' | 'a' A B 'b'
S = [
    "A B c",
    "a A B b",
]
# A → 'a' | ϵ
A = [
    "a",
    "",
]
# B → 'b'
B = "b"
```
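The normalization described above can be sketched in Rust. This is an illustrative sketch using only the standard library, not the project's actual code: the names `RuleEntry` and `normalize` are hypothetical, and splitting a rule into symbols on whitespace is an assumption based on the example grammar.

```rust
/// A `[rules]` entry as written in the grammar file: one string or a list.
/// (Hypothetical type, introduced only for this sketch.)
#[derive(Debug, Clone, PartialEq)]
pub enum RuleEntry {
    Single(String),
    Many(Vec<String>),
}

/// Normalizes an entry into a list of productions, each a list of symbols.
/// An empty string yields an empty symbol list, i.e. an ϵ production.
/// Assumption: symbols within a rule are separated by whitespace.
pub fn normalize(entry: RuleEntry) -> Vec<Vec<String>> {
    let alternatives = match entry {
        // A single string is treated as an array with one element.
        RuleEntry::Single(body) => vec![body],
        RuleEntry::Many(bodies) => bodies,
    };
    alternatives
        .iter()
        .map(|body| body.split_whitespace().map(String::from).collect())
        .collect()
}
```

With this sketch, the entry `B = "b"` becomes a single production containing the symbol `b`, and the empty string in `A` becomes a production with no symbols.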

### Tokens

Regular expressions to match tokens during lexical analysis are described in the `[tokens]` table. The patterns need to be properly escaped and written in a way that allows partial matching for the incremental lexical analysis. Alternatively, you can specify a list of strings that are matched as plain text. Matching precedence is defined by the order of the regular expressions.

Example:

```toml
[tokens]
a = [
    "true",
    "false",
]
b = "'[A-Z\\x61-\\x7A_]*('|$)"
c = "[0-9]+"
```
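To illustrate how declaration order can determine precedence, here is a minimal Rust sketch. It is not the project's lexer: to stay within the standard library it handles only plain-text string lists (no regular expressions), and the function name `first_match` is hypothetical.

```rust
/// Returns the name and matched length of the first token definition, in
/// declaration order, that has a literal matching the start of `input`.
/// Assumption: earlier definitions win, and only literal lists are modeled.
pub fn first_match<'a>(
    tokens: &[(&'a str, Vec<&'a str>)],
    input: &str,
) -> Option<(&'a str, usize)> {
    for (name, literals) in tokens {
        for literal in literals {
            // Skip empty literals so a match always consumes input.
            if !literal.is_empty() && input.starts_with(*literal) {
                return Some((*name, literal.len()));
            }
        }
    }
    None
}
```

In this sketch, if two definitions could both match the same input, the one listed first is chosen, so reordering the definitions changes which token is produced.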

### Ignored tokens

Regular expressions in the `[ignore]` table define tokens that are ignored during syntax analysis. The patterns need to follow the rules for the `[tokens]` table.

Example:

```toml
[ignore]
whitespace = "[ \t\r\n]*"
comment = "#.*(\n|$)"
```

### Actions

The `[actions]` table specifies which action to prefer when a Shift/Reduce conflict occurs. This avoids issues like the *dangling else*. Allowed values are `shift` and `reduce`.

Example:

```toml
[actions]
a = "shift"
```
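A conflict-resolution lookup of this kind might be sketched in Rust as follows. This is an assumption-laden sketch rather than the project's implementation: the README does not state what the table keys refer to, so the sketch assumes each key names the symbol on which the conflict occurs, and `Preference`, `parse_preference`, and `resolve` are hypothetical names.

```rust
use std::collections::HashMap;

/// Preferred action for a Shift/Reduce conflict (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Preference {
    Shift,
    Reduce,
}

/// Parses a value from the `[actions]` table.
/// Only `shift` and `reduce` are allowed; anything else is rejected.
pub fn parse_preference(value: &str) -> Option<Preference> {
    match value {
        "shift" => Some(Preference::Shift),
        "reduce" => Some(Preference::Reduce),
        _ => None,
    }
}

/// Looks up the preferred action for a conflict on `symbol`.
/// Assumption: table keys are the symbols on which conflicts occur.
/// Returns `None` when the table gives no preference for that symbol.
pub fn resolve(actions: &HashMap<String, Preference>, symbol: &str) -> Option<Preference> {
    actions.get(symbol).copied()
}
```

For the dangling else, preferring `shift` on the `else` token attaches each `else` to the nearest unmatched `if`, which is the conventional resolution.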

[LLLR]: https://www.semanticscholar.org/paper/LLLR-Parsing%3A-a-Combination-of-LL-and-LR-Parsing-Slivnik/fac55d573ec8441673022e36f441ca278fc4a717

[DOT]: https://www.graphviz.org/doc/info/lang.html

[TOML]: https://github.com/toml-lang/toml
