https://github.com/lapets/imparse
Parser generator for quickly and succinctly defining a parser and deploying automatically generated implementations of it in multiple languages and on multiple platforms.
ll-parser parse parser parser-generator parser-library parsers parsing-library recursive-descent-parser
Last synced: 5 months ago
- Host: GitHub
- URL: https://github.com/lapets/imparse
- Owner: lapets
- License: mit
- Created: 2013-03-19T00:28:30.000Z (about 12 years ago)
- Default Branch: master
- Last Pushed: 2020-05-08T04:44:29.000Z (almost 5 years ago)
- Last Synced: 2024-11-09T03:46:27.452Z (6 months ago)
- Topics: ll-parser, parse, parser, parser-generator, parser-library, parsers, parsing-library, recursive-descent-parser
- Language: JavaScript
- Homepage: https://imparse.org
- Size: 5.01 MB
- Stars: 5
- Watchers: 6
- Forks: 2
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# imparse
Lightweight infinite-lookahead parser generator that supports basic grammars defined in a JSON format. More information and interactive examples are available at [imparse.org](http://imparse.org).
This library makes it possible to rapidly assemble and deploy a parser for a simple language. It is intended primarily for languages that have an [LL grammar](https://en.wikipedia.org/wiki/LL_grammar).
## Usage
### Representation of Grammars
Suppose we want to represent a grammar of basic arithmetic expressions in the following way (assuming that operators will associate to the right):
```javascript
var grammar = [
  {"Term": [
    {"Add": [["Factor"], "+", ["Term"]]},
    {"": [["Factor"]]}
  ]},
  {"Factor": [
    {"Mul": [["Atom"], "*", ["Factor"]]},
    {"": [["Atom"]]}
  ]},
  {"Atom": [
    {"Num": [{"RegExp":"[0-9]+"}]}
  ]}
];
```
It is assumed that grammars are represented as nested objects according to the following conventions:

* a *grammar* consists of an array of production rules;
* each *production rule* is represented by an object that maps the name of its non-terminal to an array of possible cases;
* each *case* is represented by an object that maps a case name to a sequence of terminals and non-terminals; and
* each *entry* in a case sequence can be a terminal (represented as a string), non-terminal (represented by a singleton array with a non-terminal string), or regular expression (represented as an object with a single key `"RegExp"` that maps to the actual regular expression string).

Note that the *case name* (i.e., the sole key in each case object) is used within any abstract syntax tree node constructed according to that case. For example, if a token sequence or string is parsed successfully according to the case sequence `{"Add": [["Factor"], "+", ["Term"]]}`, then the resulting abstract syntax tree will be of the form `{"Add":[...]}`.
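
The conventions above are regular enough to check mechanically. The following is a minimal sketch, not part of imparse: `summarizeGrammar` is a hypothetical helper that walks a grammar object, classifies each entry as a terminal, non-terminal, or regular expression, and reports any non-terminal that is referenced but never defined.

```javascript
// The arithmetic grammar from the example above.
var grammar = [
  {"Term": [
    {"Add": [["Factor"], "+", ["Term"]]},
    {"": [["Factor"]]}
  ]},
  {"Factor": [
    {"Mul": [["Atom"], "*", ["Factor"]]},
    {"": [["Atom"]]}
  ]},
  {"Atom": [
    {"Num": [{"RegExp":"[0-9]+"}]}
  ]}
];

// Hypothetical helper (not an imparse function): summarize a grammar that
// follows the conventions described above.
function summarizeGrammar(grammar) {
  var summary = {nonTerminals: [], terminals: [], regExps: [], undefinedRefs: []};
  var referenced = [];

  function addUnique(list, item) {
    if (list.indexOf(item) === -1) list.push(item);
  }

  grammar.forEach(function (rule) {
    // Each production rule is an object whose sole key is the non-terminal.
    var name = Object.keys(rule)[0];
    addUnique(summary.nonTerminals, name);

    rule[name].forEach(function (caseObj) {
      // Each case is an object whose sole key is the case name ("" is allowed).
      var caseName = Object.keys(caseObj)[0];
      caseObj[caseName].forEach(function (entry) {
        if (typeof entry === 'string') {
          addUnique(summary.terminals, entry);       // terminal
        } else if (Array.isArray(entry)) {
          addUnique(referenced, entry[0]);           // non-terminal reference
        } else if (entry && typeof entry === 'object' && 'RegExp' in entry) {
          addUnique(summary.regExps, entry.RegExp);  // regular expression
        } else {
          throw new Error('Unrecognized entry in case "' + caseName + '"');
        }
      });
    });
  });

  // Non-terminals that are used in some case but have no production rule.
  summary.undefinedRefs = referenced.filter(function (n) {
    return summary.nonTerminals.indexOf(n) === -1;
  });
  return summary;
}

var summary = summarizeGrammar(grammar);
console.log(summary.nonTerminals);  // [ 'Term', 'Factor', 'Atom' ]
console.log(summary.undefinedRefs); // []
```

A check like this can catch a misspelled non-terminal before the grammar is handed to a parser.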
### Basic Parsing
It is possible to parse a string according to the grammar in the following way:
```javascript
imparse.parse(grammar, '1*2 + 3*4')
```
The above yields the following abstract syntax tree:
```javascript
{"Add":[
  {"Mul":[{"Num":["1"]},{"Num":["2"]}]},
  {"Mul":[{"Num":["3"]},{"Num":["4"]}]}
]}
```
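
Because each node in a tree like the one above is an object with a single key (the case name) that maps to an array of children, the tree can be consumed with a short recursive function. The sketch below is illustrative only: `evaluate` is a hypothetical helper, not an imparse function, and it assumes nodes have exactly the shape shown in the example, with `Num` holding the matched digit string and the `+` and `*` terminals not appearing among the children.

```javascript
// The abstract syntax tree from the example above ('1*2 + 3*4').
var ast = {"Add":[
  {"Mul":[{"Num":["1"]},{"Num":["2"]}]},
  {"Mul":[{"Num":["3"]},{"Num":["4"]}]}
]};

// Hypothetical helper (not an imparse function): compute the numeric value
// of an arithmetic AST by dispatching on each node's case name.
function evaluate(node) {
  var label = Object.keys(node)[0]; // the case name, e.g. "Add"
  var children = node[label];       // the node's children
  switch (label) {
    case 'Num':
      return parseInt(children[0], 10);
    case 'Add':
      return evaluate(children[0]) + evaluate(children[1]);
    case 'Mul':
      return evaluate(children[0]) * evaluate(children[1]);
    default:
      throw new Error('Unknown node label: ' + label);
  }
}

var result = evaluate(ast);
console.log(result); // 14
```

Dispatching on the case name keeps the evaluator aligned with the grammar: adding a new case to the grammar means adding one branch here.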