https://github.com/calico32/ebnf-language-support
Extended Backus-Naur Form (EBNF) support for VSCode
https://github.com/calico32/ebnf-language-support
ebnf extended-bnf extension vscode vscode-extension
Last synced: about 2 months ago
JSON representation
Extended Backus-Naur Form (EBNF) support for VSCode
- Host: GitHub
- URL: https://github.com/calico32/ebnf-language-support
- Owner: calico32
- License: mit
- Created: 2023-02-13T15:47:27.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2025-09-15T15:41:28.000Z (9 months ago)
- Last Synced: 2025-09-15T16:30:36.336Z (9 months ago)
- Topics: ebnf, extended-bnf, extension, vscode, vscode-extension
- Language: TypeScript
- Homepage:
- Size: 1.58 MB
- Stars: 4
- Watchers: 3
- Forks: 0
- Open Issues: 2
-
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
Awesome Lists containing this project
README
# EBNF Language Support
This extension adds support for an EBNF-like syntax ([Extended Backus-Naur Form](https://en.wikipedia.org/wiki/Extended_Backus%E2%80%93Naur_form)) to Visual Studio Code.

## Table of Contents
- [EBNF Language Support](#ebnf-language-support)
- [Table of Contents](#table-of-contents)
- [Features](#features)
- [Roadmap](#roadmap)
- [EBNF Syntax](#ebnf-syntax)
- [Comments](#comments)
- [Rules](#rules)
- [Rule Names](#rule-names)
- [Expressions](#expressions)
- [Literals](#literals)
- [Special Cases](#special-cases)
- [Groups](#groups)
- [Ranges](#ranges)
- [Concatenation](#concatenation)
- [Alternation](#alternation)
- [Exclusion](#exclusion)
- ["One or more"](#one-or-more)
## Features
- Syntax highlighting + semantic highlighting
- Basic error checking
- Syntax errors
- Undefined symbols
- Duplicate symbols
- Go to definition
- Find all references
- Document symbols (go to symbol, outline)
- Basic code completion
- Rule names
- Hover information
- Rule name and definition
- Code folding
## Roadmap
- Railroad diagram generation
## EBNF Syntax
This extension implements a simple and strict-ish version of EBNF. The syntax is defined in itself in [ebnf.ebnf](./ebnf.ebnf).
The dialect implemented mostly follows the [ISO/IEC 14977](https://www.iso.org/standard/26153.html) standard, with some extensions for clarity and convenience.
## Comments
Comments are defined using the `(*` and `*)` delimiters.
## Rules
Rules are defined using the assignment operator `=`. The left-hand side is the rule name, and the right-hand side is an _expression_. Rules must end with a semicolon `;`.
### Rule Names
Rule names can start with any letter, number, or an underscore. They can also contain a hyphen, but not at the beginning. Rule names are case-sensitive.
## Expressions
Expressions are made up of _terms_ and _operators_. Terms are either literals, references to other rules (by name), special cases, groups, or ranges. Operators are used to combine terms into more complex expressions.
### Literals
Literals are enclosed in single quotes or double quotes. They can contain any character except for the quote character used to enclose them. No escaping is considered, so you can't use a single quote inside a single-quoted literal, or a double quote inside a double-quoted literal. How to interpret sequences like `\n` is up to the reader. Both literals and special cases can be multiline.
### Special Cases
Special cases are used to describe content that cannot be easily expressed using the other terms. They are enclosed in question marks `?`, and can have multiple lines.
```ebnf
? any character ?
? valid UTF-8 ?
```
### Groups
There are three different types of groups:
- Parentheses (_group_) are only used to group terms together.
- Brackets (_optional_) indicate that the content inside is optional, i.e. it can appear zero or one times.
- Braces (_repetition_) indicate that the content inside can appear zero or more times.
### Ranges
Ranges are used to define a set a contiguous characters. They are composed of two strings joined by two dots `..`.
Ranges have no specific definition of what a range "is". It should be obvious what the range should represent. For example, a range of `"A".."Z"` is probably a set of uppercase letters, while a range of `"0".."9"` is probably a set of digits.
### Concatenation
Concatenation can be defined using the comma `,` operator between terms or by juxtaposition of terms.
It does not define what whitespace is allowed between terms; it is assumed that the reader knows what is and isn't allowed.
```ebnf
"A", "B", "C" (* probably "ABC" *)
"fn" name "()" (* probably "fn foo()" *)
```
### Alternation
The alternation operator is the pipe `|`. It is used to define a set of possible choices for a term.
```ebnf
"A" | "B" | "C" (* "A", "B", or "C" *)
"A", ( "B" | "C" ) (* "AB" or "AC" *)
```
### Exclusion
The exclusion operator is the caret `-`. It is used to define a set of possible choices for a term, but excludes one or more of them.
```ebnf
letter = "A".."Z" ;
not_z = letter - "Z" ; (* "A".."Y" *)
```
### "One or more"
The postfix operators `+` and `-` modify the preceding term to indicate that it occurs "one or more" times. The following forms are equivalent:
```ebnf
many-as = { "a" }+ ; (* "a", "aa", "aaa", ... but not "" *)
many-as = { "a" }- ;
many-as = { "a" } - '' ;
```
> [!NOTE]
>
> The `-` operator is also valid as an infix oerator (see
> [Exclusion](#exclusion)). Thus, when another term follows a unary `-`, it will
> be interpreted as an exclusion instead of a concatenation. Adding a comma
> directly after a unary `-` can be used to disambiguate this case, but can
> be confusing and error-prone:
>
> ```ebnf
> ooof = { "o" }-, "f" ;
> ```
>
> Usage of `-` as a postfix operator is therefore discouraged. Using `+`, although
> not part of ISO/IEC 14977, is recommended instead.