# Welcome to the PEGTL

[![Windows](https://github.com/taocpp/PEGTL/actions/workflows/windows.yml/badge.svg)](https://github.com/taocpp/PEGTL/actions/workflows/windows.yml)
[![macOS](https://github.com/taocpp/PEGTL/actions/workflows/macos.yml/badge.svg)](https://github.com/taocpp/PEGTL/actions/workflows/macos.yml)
[![Linux](https://github.com/taocpp/PEGTL/actions/workflows/linux.yml/badge.svg)](https://github.com/taocpp/PEGTL/actions/workflows/linux.yml)
[![Android](https://github.com/taocpp/PEGTL/actions/workflows/android.yml/badge.svg)](https://github.com/taocpp/PEGTL/actions/workflows/android.yml)


[![clang-analyze](https://github.com/taocpp/PEGTL/actions/workflows/clang-analyze.yml/badge.svg)](https://github.com/taocpp/PEGTL/actions/workflows/clang-analyze.yml)
[![clang-tidy](https://github.com/taocpp/PEGTL/actions/workflows/clang-tidy.yml/badge.svg)](https://github.com/taocpp/PEGTL/actions/workflows/clang-tidy.yml)
[![Sanitizer](https://github.com/taocpp/PEGTL/actions/workflows/sanitizer.yml/badge.svg)](https://github.com/taocpp/PEGTL/actions/workflows/sanitizer.yml)
[![CodeQL](https://github.com/taocpp/PEGTL/actions/workflows/codeql-analysis.yml/badge.svg)](https://github.com/taocpp/PEGTL/actions/workflows/codeql-analysis.yml)
[![Codecov](https://codecov.io/gh/taocpp/PEGTL/branch/main/graph/badge.svg?token=ykWa8RRdyk)](https://codecov.io/gh/taocpp/PEGTL)

The Parsing Expression Grammar Template Library (PEGTL) is a zero-dependency C++ header-only parser combinator library for creating parsers according to a [Parsing Expression Grammar](http://en.wikipedia.org/wiki/Parsing_expression_grammar) (PEG).

During development of a new major version the main branch can go through incompatible changes. For a stable experience please download [the latest release](https://github.com/taocpp/PEGTL/releases) rather than using the main branch.

## Documentation

* [Changelog](doc/Changelog.md)
* [Development](doc/README.md) (requires C++17)
* [Version 3.x](https://github.com/taocpp/PEGTL/blob/3.x/doc/README.md) (requires C++17)
* [Version 2.x](https://github.com/taocpp/PEGTL/blob/2.x/doc/README.md) (requires C++11)
* [Version 1.x](https://github.com/taocpp/PEGTL/blob/1.x/doc/README.md) (requires C++11)

## Contact

For questions and suggestions regarding the PEGTL, success or failure stories, and any other kind of feedback, please feel free to open a [discussion](https://github.com/taocpp/PEGTL/discussions), an [issue](https://github.com/taocpp/PEGTL/issues) or a [pull request](https://github.com/taocpp/PEGTL/pulls), or contact the authors at `taocpp(at)icemx.net`.

## Introduction

Grammars are written as regular C++ code, created with template programming (not template meta programming), i.e. nested template instantiations that naturally correspond to the inductive definition of PEGs (and other parser-combinator approaches).

A comprehensive set of [parser rules](doc/Rule-Reference.md) that can be combined and extended by the user is included, as are mechanisms for debugging grammars, and for attaching user-defined [actions](doc/Actions-and-States.md) to grammar rules.
Here is an example of how a parsing expression grammar rule is implemented as a C++ class with the PEGTL.

```c++
// PEG rule for integers consisting of a non-empty
// sequence of digits with an optional sign:

// sign ::= '+' / '-'
// integer ::= sign? digit+

// The same parsing rule implemented with the PEGTL:

using namespace tao::pegtl;

struct sign : one< '+', '-' > {};
struct integer : seq< opt< sign >, plus< digit > > {};
```
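
To make the idea of "nested template instantiations" more concrete, the following is a minimal, self-contained sketch of how such rule classes could work in principle. It is a toy model and not the PEGTL's implementation or API: the namespace `toy`, the `match()` signature, and the helper `parse_all()` are assumptions made purely for illustration, and only the rule names mirror the example above.

```c++
// Toy sketch only: NOT the PEGTL's implementation. Requires C++17.
#include <cassert>
#include <cstddef>
#include <string_view>

namespace toy
{
   // Convention (an assumption of this sketch): every rule has a static
   // match() that advances `pos` on success and leaves it unchanged on failure.

   // Matches one character out of the given set.
   template< char... Cs >
   struct one
   {
      static bool match( std::string_view in, std::size_t& pos )
      {
         if( pos < in.size() && ( ( in[ pos ] == Cs ) || ... ) ) {
            ++pos;
            return true;
         }
         return false;
      }
   };

   // Matches one ASCII digit.
   struct digit
   {
      static bool match( std::string_view in, std::size_t& pos )
      {
         if( pos < in.size() && in[ pos ] >= '0' && in[ pos ] <= '9' ) {
            ++pos;
            return true;
         }
         return false;
      }
   };

   // Sequence: all sub-rules must match in order, otherwise rewind.
   template< typename... Rules >
   struct seq
   {
      static bool match( std::string_view in, std::size_t& pos )
      {
         const std::size_t saved = pos;
         if( ( Rules::match( in, pos ) && ... ) ) {
            return true;
         }
         pos = saved;
         return false;
      }
   };

   // Optional: always succeeds, consuming input only if Rule matches.
   template< typename Rule >
   struct opt
   {
      static bool match( std::string_view in, std::size_t& pos )
      {
         (void)Rule::match( in, pos );
         return true;
      }
   };

   // One or more repetitions of Rule (greedy).
   template< typename Rule >
   struct plus
   {
      static bool match( std::string_view in, std::size_t& pos )
      {
         if( !Rule::match( in, pos ) ) {
            return false;
         }
         while( Rule::match( in, pos ) ) {
         }
         return true;
      }
   };

   // The grammar from the example above, written against the toy rules.
   struct sign : one< '+', '-' > {};
   struct integer : seq< opt< sign >, plus< digit > > {};

   // Hypothetical helper: true if Rule matches the whole input.
   template< typename Rule >
   bool parse_all( std::string_view in )
   {
      std::size_t pos = 0;
      return Rule::match( in, pos ) && pos == in.size();
   }

   inline void demo()
   {
      assert( parse_all< integer >( "-42" ) );
      assert( parse_all< integer >( "7" ) );
      assert( !parse_all< integer >( "+" ) );    // sign without digits
      assert( !parse_all< integer >( "12a" ) );  // trailing input left over
   }
}  // namespace toy
```

Each grammar rule is a type, and combining rules means instantiating one template with other rule types as arguments, so the structure of the C++ types follows the structure of the grammar.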

PEGs are superficially similar to Context-Free Grammars (CFGs); however, the more deterministic nature of PEGs gives rise to some very important differences.
The included [grammar analysis](doc/Grammar-Analysis.md) finds several typical errors in PEGs, including left recursion.
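
One such difference, given here as general PEG background, is that choice in a PEG is ordered: the first alternative that matches wins, and later alternatives are not tried even if the rest of the parse then fails. The sketch below demonstrates this with a toy model. It is not PEGTL code; the namespace `toy`, the `match()` signature, the helper `parse()`, and the rule names `short_first` and `long_first` are hypothetical.

```c++
// Toy sketch only: NOT the PEGTL's implementation. Requires C++17.
#include <cassert>
#include <cstddef>
#include <string_view>

namespace toy
{
   // Matches the exact character sequence Cs...
   template< char... Cs >
   struct string
   {
      static bool match( std::string_view in, std::size_t& pos )
      {
         constexpr char text[] = { Cs..., '\0' };
         constexpr std::size_t n = sizeof...( Cs );
         if( in.substr( pos, n ) == std::string_view( text, n ) ) {
            pos += n;
            return true;
         }
         return false;
      }
   };

   // Matches only at the end of the input, consuming nothing.
   struct eof
   {
      static bool match( std::string_view in, std::size_t& pos )
      {
         return pos == in.size();
      }
   };

   // Ordered choice: tries the alternatives left to right and commits
   // to the first one that matches.
   template< typename... Rules >
   struct sor
   {
      static bool match( std::string_view in, std::size_t& pos )
      {
         return ( Rules::match( in, pos ) || ... );
      }
   };

   // Sequence: all sub-rules must match in order, otherwise rewind.
   template< typename... Rules >
   struct seq
   {
      static bool match( std::string_view in, std::size_t& pos )
      {
         const std::size_t saved = pos;
         if( ( Rules::match( in, pos ) && ... ) ) {
            return true;
         }
         pos = saved;
         return false;
      }
   };

   // ( "a" / "ab" ) followed by end of input.
   struct short_first : seq< sor< string< 'a' >, string< 'a', 'b' > >, eof > {};
   // ( "ab" / "a" ) followed by end of input.
   struct long_first : seq< sor< string< 'a', 'b' >, string< 'a' > >, eof > {};

   // Hypothetical helper: runs Rule from the start of the input.
   template< typename Rule >
   bool parse( std::string_view in )
   {
      std::size_t pos = 0;
      return Rule::match( in, pos );
   }

   inline void demo()
   {
      assert( parse< short_first >( "a" ) );
      // "a" wins the choice, eof then fails, and "ab" is never tried.
      assert( !parse< short_first >( "ab" ) );
      // Putting the longer alternative first accepts both inputs.
      assert( parse< long_first >( "ab" ) );
      assert( parse< long_first >( "a" ) );
   }
}  // namespace toy
```

A CFG with the alternation `"a" | "ab"` would accept `ab`, whereas the ordered choice in `short_first` does not, which is why the order of alternatives matters when writing PEGs.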

## Design

The PEGTL is designed to be "lean and mean"; the core library consists of approximately 6000 lines of code.
Emphasis is on simplicity and efficiency, preferring a well-tuned simple approach over complicated optimisations.

The PEGTL is mostly concerned with parsing combinators and grammar rules, and with giving the user of the library (the possibility of) full control over all other aspects of a parsing run.
Whether/which actions are taken, and whether/which data structures are created during a parsing run, is entirely up to the user.

Included are some [examples](doc/Contrib-and-Examples.md#examples) for typical situations such as unescaping escape sequences in strings, building a generic [JSON](http://www.json.org/) data structure, and on-the-fly evaluation of arithmetic expressions.

Through the use of template programming and template specialisations it is possible to write a grammar once, and use it in multiple ways with different (semantic) actions in different (or the same) parsing runs.
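
The following toy sketch illustrates the general technique of writing a grammar once and reusing it with different actions selected through template specialisation. It is a simplified model and not the PEGTL's actual interface: the namespace `toy`, the functions `match()`, `sum_numbers()` and `collect_numbers()`, the member `match_impl()`, and the action templates `sum_action` and `collect_action` are all assumptions made for illustration.

```c++
// Toy sketch only: NOT the PEGTL's implementation. Requires C++17.
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

namespace toy
{
   // Runs Rule; on success passes the matched text to Action< Rule >::apply().
   template< typename Rule, template< typename > class Action, typename State >
   bool match( std::string_view in, std::size_t& pos, State& st )
   {
      const std::size_t start = pos;
      if( !Rule::template match_impl< Action >( in, pos, st ) ) {
         pos = start;
         return false;
      }
      Action< Rule >::apply( in.substr( start, pos - start ), st );
      return true;
   }

   struct digit
   {
      template< template< typename > class Action, typename State >
      static bool match_impl( std::string_view in, std::size_t& pos, State& )
      {
         if( pos < in.size() && in[ pos ] >= '0' && in[ pos ] <= '9' ) {
            ++pos;
            return true;
         }
         return false;
      }
   };

   struct comma
   {
      template< template< typename > class Action, typename State >
      static bool match_impl( std::string_view in, std::size_t& pos, State& )
      {
         if( pos < in.size() && in[ pos ] == ',' ) {
            ++pos;
            return true;
         }
         return false;
      }
   };

   // One or more repetitions of Rule.
   template< typename Rule >
   struct plus
   {
      template< template< typename > class Action, typename State >
      static bool match_impl( std::string_view in, std::size_t& pos, State& st )
      {
         if( !match< Rule, Action >( in, pos, st ) ) {
            return false;
         }
         while( match< Rule, Action >( in, pos, st ) ) {
         }
         return true;
      }
   };

   // Rule ( Sep Rule )*
   template< typename Rule, typename Sep >
   struct list
   {
      template< template< typename > class Action, typename State >
      static bool match_impl( std::string_view in, std::size_t& pos, State& st )
      {
         if( !match< Rule, Action >( in, pos, st ) ) {
            return false;
         }
         for( ;; ) {
            const std::size_t saved = pos;
            if( !( match< Sep, Action >( in, pos, st ) && match< Rule, Action >( in, pos, st ) ) ) {
               pos = saved;
               return true;
            }
         }
      }
   };

   // The grammar, written once.
   struct number : plus< digit > {};
   struct numbers : list< number, comma > {};

   // Default action: do nothing for any rule.
   template< typename Rule >
   struct nothing
   {
      template< typename State >
      static void apply( std::string_view, State& ) {}
   };

   // First use: sum all numbers.
   template< typename Rule > struct sum_action : nothing< Rule > {};
   template<> struct sum_action< number >
   {
      static void apply( std::string_view text, long& sum ) { sum += std::stol( std::string( text ) ); }
   };

   // Second use: collect the matched text of all numbers.
   template< typename Rule > struct collect_action : nothing< Rule > {};
   template<> struct collect_action< number >
   {
      static void apply( std::string_view text, std::vector< std::string >& out ) { out.emplace_back( text ); }
   };

   inline long sum_numbers( std::string_view in )
   {
      long sum = 0;
      std::size_t pos = 0;
      match< numbers, sum_action >( in, pos, sum );
      return sum;
   }

   inline std::vector< std::string > collect_numbers( std::string_view in )
   {
      std::vector< std::string > out;
      std::size_t pos = 0;
      match< numbers, collect_action >( in, pos, out );
      return out;
   }
}  // namespace toy
```

In this sketch, `toy::sum_numbers( "1,22,333" )` and `toy::collect_numbers( "1,22,333" )` run the same `numbers` grammar; only the action template, and with it the state type, differs between the two parsing runs.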

With the PEG formalism, the separation into lexer and parser stages is usually dropped -- everything is done in a single grammar.
The rules are expressed in C++ as template instantiations, and it is the compiler's task to optimise PEGTL grammars.

## Status

Each commit is automatically tested with multiple architectures, operating systems, compilers, and versions thereof.

Each commit is checked with the GCC and Clang [sanitizers](https://github.com/google/sanitizers), Clang's [Static Analyzer](https://clang-analyzer.llvm.org/), and [`clang-tidy`](http://clang.llvm.org/extra/clang-tidy/).
Additionally, we use [CodeQL](https://securitylab.github.com/tools/codeql) to scan for (security) issues.

Code coverage is automatically measured and the unit tests cover 100% of the core library code (for releases).

[Releases](https://github.com/taocpp/PEGTL/releases) are done in accordance with [Semantic Versioning](http://semver.org/).
Incompatible API changes are *only* allowed to occur between major versions.

## Thank You

In appreciation of all contributions, here are the people who have [directly contributed](https://github.com/taocpp/PEGTL/graphs/contributors) to the PEGTL and/or its development.

[amphaal](https://github.com/amphaal)
[anand-bala](https://github.com/anand-bala)
[andoma](https://github.com/andoma)
[barbieri](https://github.com/barbieri)
[bjoe](https://github.com/bjoe)
[bwagner](https://github.com/bwagner)
[cdiggins](https://github.com/cdiggins)
[clausklein](https://github.com/clausklein)
[delpinux](https://github.com/delpinux)
[dkopecek](https://github.com/dkopecek)
[gene-hightower](https://github.com/gene-hightower)
[irrequietus](https://github.com/irrequietus)
[jedelbo](https://github.com/jedelbo)
[joelfrederico](https://github.com/joelfrederico)
[johelegp](https://github.com/johelegp)
[jovermann](https://github.com/jovermann)
[jubnzv](https://github.com/jubnzv)
[kelvinhammond](https://github.com/kelvinhammond)
[kneth](https://github.com/kneth)
[kuzmas](https://github.com/kuzmas)
[lambdafu](https://github.com/lambdafu)
[lichray](https://github.com/lichray)
[michael-brade](https://github.com/michael-brade)
[mkrupcale](https://github.com/mkrupcale)
[newproggie](https://github.com/newproggie)
[obiwahn](https://github.com/obiwahn)
[ohanar](https://github.com/ohanar)
[pauloscustodio](https://github.com/pauloscustodio)
[pleroux0](https://github.com/pleroux0)
[quadfault](https://github.com/quadfault)
[quarticcat](https://github.com/quarticcat)
[ras0219](https://github.com/ras0219)
[redmercury](https://github.com/redmercury)
[robertcampion](https://github.com/robertcampion)
[samhocevar](https://github.com/samhocevar)
[sanssecours](https://github.com/sanssecours)
[sgbeal](https://github.com/sgbeal)
[skyrich62](https://github.com/skyrich62)
[studoot](https://github.com/studoot)
[svenjo](https://github.com/svenjo)
[wickedmic](https://github.com/wickedmic)
[wravery](https://github.com/wravery)
[zhihaoy](https://github.com/zhihaoy)

## The Art of C++

The PEGTL is part of [The Art of C++](https://taocpp.github.io/).

[colinh](https://github.com/colinh)
[d-frey](https://github.com/d-frey)
[uilianries](https://github.com/uilianries)

## License

Copyright (c) 2007-2023 Daniel Frey and Dr. Colin Hirsch

The PEGTL is certified [Open Source](http://www.opensource.org/docs/definition.html) software.
It is [licensed](https://pdimov.github.io/blog/2020/09/06/why-use-the-boost-license/) under the terms of the [Boost Software License, Version 1.0](https://www.boost.org/LICENSE_1_0.txt) reproduced here.

> Boost Software License - Version 1.0 - August 17th, 2003
>
> Permission is hereby granted, free of charge, to any person or organization obtaining a copy of the software and accompanying documentation covered by this license (the "Software") to use, reproduce, display, distribute, execute, and transmit the Software, and to prepare derivative works of the Software, and to permit third-parties to whom the Software is furnished to do so, all subject to the following:
>
> The copyright notices in the Software and this entire statement, including the above license grant, this restriction and the following disclaimer, must be included in all copies of the Software, in whole or in part, and all derivative works of the Software, unless such copies or derivative works are solely in the form of machine-executable object code generated by a source language processor.
>
> THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, TITLE AND NON-INFRINGEMENT. IN NO EVENT SHALL THE COPYRIGHT HOLDERS OR ANYONE DISTRIBUTING THE SOFTWARE BE LIABLE FOR ANY DAMAGES OR OTHER LIABILITY, WHETHER IN CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.