https://github.com/axilmar/parserlib

A c++17 recursive-descent generic parsing library that supports left recursion, tokenization, AST tree creation, multiple error handling etc..
https://github.com/axilmar/parserlib

cpp17 parser parser-library

Last synced: 8 months ago
JSON representation

A c++17 recursive-descent generic parsing library that supports left recursion, tokenization, AST tree creation, multiple error handling etc..

Host: GitHub
URL: https://github.com/axilmar/parserlib
Owner: axilmar
License: apache-2.0
Created: 2012-03-05T15:35:08.000Z (over 13 years ago)
Default Branch: master
Last Pushed: 2024-11-24T11:36:53.000Z (8 months ago)
Last Synced: 2024-11-24T12:27:11.570Z (8 months ago)
Topics: cpp17, parser, parser-library
Language: C++
Homepage:
Size: 9.26 MB
Stars: 92
Watchers: 8
Forks: 19
Open Issues: 0
Metadata Files:
- Readme: readme.md
- License: LICENSE.txt

Awesome Lists containing this project

README

        # Parserlib

A `c++17` recursive-descent parser library that:

- can parse left-recursive grammars.

- can handle multiple errors while parsing.

- provides a lexer class, for tokenization.

- allows for solving tokenization/parsing ambgiguities while parsing.

- allows the creation of AST trees, ready for further processing.

Version 1.0.0.5.

## Table Of Contents

[Introduction](#introduction)

[Writing And Using Parsers](./doc/writing_and_using_parsers.md)

[Versions](#versions)

## Introduction

Parserlib is a `c++17` library that allows the creation of recursive-descent parsers, based on `EBNF`-like notation using the available `c++` operators.

The library allows the declaration of left-recursive grammars.

Here is a calculator parser example:

```cpp

#include "parserlib.hpp"

template  class calculator_grammar {

public:

    enum class ast_id {

        number,

        add,

        sub,

        mul,

        div

    };

    using pe = parserlib::parser_engine;

    calculator_grammar() {

        auto digit = pe::range('0', '9');

        auto number = (-pe::one_of("+-") >> +digit >> -('.' >> +digit))->*ast_id::number;

        auto val = '(' >> m_add >> ')'

                 | number;

        m_mul = (m_mul >> '*' >> val) ->* ast_id::mul

              | (m_mul >> '/' >> val) ->* ast_id::div

              | val;

        m_add = (m_add >> '+' >> m_mul) ->* ast_id::add

              | (m_add >> '-' >> m_mul) ->* ast_id::mul

              | m_mul;

    }

    double evaluate(const SourceT& input) {

        SourceT source = input;

        auto [success, ast, it] = pe::parse(source, m_add);

        if (success && ast.size() == 1) {

            return evaluate_ast(ast[0]);

        }

        throw std::invalid_argument("Parse error: " + std::string(it, source.cend()));

    }

private:

    typename pe::rule m_add, m_mul;

    static double evaluate_ast(const typename pe::ast_node_ptr_type& node) {

        switch (node->get_id()) {

            case ast_id::number: {

                std::stringstream stream;

                stream << node->get_source();

                double r;

                stream >> r;

                return r;

            }

            case ast_id::add:

                return evaluate_ast(node->get_children()[0]) + evaluate_ast(node->get_children()[1]);

            case ast_id::sub:

                return evaluate_ast(node->get_children()[0]) - evaluate_ast(node->get_children()[1]);

            case ast_id::mul:

                return evaluate_ast(node->get_children()[0]) * evaluate_ast(node->get_children()[1]);

            case ast_id::div:

                return evaluate_ast(node->get_children()[0]) / evaluate_ast(node->get_children()[1]);

        }

        throw std::logic_error("invalid ast id");

    }

};

int main() {

    calculator_grammar<> calculator;

    std::cout << calculator.evaluate("1+(5*6)/2") ;

}

```

The above prints `16` at the console.

## Versions

  - 1.0.0.5

  	- Added custom match functions in order to allow the resolution of ambiguities while parsing.

  	- allowed terminal values to be of different type that the value of the source container, in order to allow the result of a parse (the ast nodes created by a parse) to be fed to another parse function.

  	- added terminal parsing via functions.

  	- added parsing via standalone functions.

  	- added multiple error handling.

  - 1.0.0.4

  

    - Rewrote the library:

        - all parser grammar classes are now inside a single template class `class parser_engine`, for the following reasons:

            - compiler performance (MSVC 32-bit regularly crashed with out of memory error from the many template instantiations of previous versions).

            - library code organization; writing a grammar usually requires including all the grammar constructs, so it is reduntant to have separate files for each grammar-related class.

            - user code organization; whole grammars need to be specialized on source type.

        - coding style is closer to the standard: all identifiers are lower case, words are separated by underscores, idiomatic c++ is used whenever possible.

    - Rewrote the documentation, due to more functionality to be added in the future.

 

 

  - 1.0.0.3

 

 	- Reorganized the library in order to support compiler front ends into a separate namespace. The main library is now in `namespace parserlib::core`.

 	- Added `namespace parserlib::cfe` which now contains the compiler-front-end functionality.

 	- separated tokenization and parsing phases for compiler-front-ends.

 	- Added relevant documentation and unit tests.

 

  - 1.0.0.2

 

 	Rewrote the library from scratch in order to provide a better interface. Changes:

 	

	- All the getter methods now start with 'get', in order to play better with Intellisense.

	- The `ParseContext` class is now configured over the Source type, with the default class being the class `SourceString`.

	- The class `SourceString` provides custom iterator which counts lines and columns, compatible with the `std::string` interface.

	- The functions `terminal`, `terminalSet`, `terminalRange` are changed to `term`, `oneOf`, `oneIn`.

	- Matches are now only hierarchical (as in `operator >=` of previous version).

	- The `'operator >=` has been replaced with `operator ->*`, which is much more distinct than the former; no more typing accidentally '>>' where `>=` was intended.

	- The default match id type is no longer a string; it is an int.

	- Simplified the left recursion parsing implementation.

 

  - 1.0.0.1

 

  	Added support for compiler front-end construction.

  	

- 1.0.0.0

 

  	Initial release.

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/axilmar/parserlib

Awesome Lists containing this project

README