An open API service indexing awesome lists of open source software.

https://github.com/zhuzilin/swiftpeg

A PEG parser generator written in swift 5.3.
https://github.com/zhuzilin/swiftpeg

parser parser-generator parsimonious swift

Last synced: 7 months ago
JSON representation

A PEG parser generator written in swift 5.3.

Awesome Lists containing this project

README

          

# SwiftPEG

A PEG parser generator written in swift 5.3. The code structure and grammar are largely learnt from the excellent python package [parsimonious](https://github.com/erikrose/parsimonious). If you are doing some parsing using python, you should definitely check it out.

The nice part of this parser generator is that its PEG rule parser is also generated from a PEG syntax with a bootstrap manner, and the bootstrap hardcoding parser can also be generated from itself. The rule syntax is:

```
rules = _ rule+
rule = label equals expression
equals = "=" _
literal = spaceless_literal _

spaceless_literal = ~"\"[^\"\\\\]*(?:\\\\.[^\"\\\\]*)*\""is

expression = ored / sequence / term
or_term = "/" _ term
ored = term or_term+
sequence = term term+
not_term = "!" term _
lookahead_term = "&" term _
term = not_term / lookahead_term / quantified / atom
quantified = atom quantifier
atom = reference / literal / regex / parenthesized
regex = "~" spaceless_literal ~"[ilmsuxa]*"i _
parenthesized = "(" _ expression ")" _
quantifier = ~"[*+?]" _
reference = label !equals

label = ~"[a-zA-Z_][a-zA-Z_0-9]*" _

_ = meaninglessness*
meaninglessness = ~"\s+" / comment
comment = ~"#[^\r\n]*"
```

Notice that the above syntax is the same as [parsimonious](https://github.com/erikrose/parsimonious).

To write a proper PEG syntax, please follow the [PEG syntax reference](https://www.gnu.org/software/guile/manual/html_node/PEG-Syntax-Reference.html).

## Usage

In your `Package.swift`, add the following code to dependencies:

```swift
.package(name: "SwiftPEG", url: "https://github.com/zhuzilin/SwiftPEG.git", from: "0.1.0"),
```

And add `"SwiftPEG"` to target dependencies.

## Example

Here is an example of a simplified markdown parser.

```swift
let markdownSyntax = #"""
raw_text = ~"[^\n]+"
bold_text = ("**" raw_text "**") / ("__" raw_text "__")
text = (bold_text / raw_text)

h1 = "# " text
h2 = "## " text
h3 = "### " text
h4 = "#### " text
h5 = "##### " text
h6 = "######" text
header = (h6 / h5 / h4 / h3 / h2 / h1)

ordered_list = (~"[0~9]+\. " text ~"\n")+

unordered_list = (~"[-*+] " text ~"\n")+

link = "[" raw_text "]" "(" raw_text ")"

image = "![" raw_text "]" "(" raw_text ")"

paragraph = (header / text)?
doc = (paragraph ~"\n\n")* paragraph
"""#

// Initialize the parser
let markdownParser: Grammar = Grammar(rules: markdownSyntax)
// Get the AST root node from the parser with the name of the rule you defined in the syntax.
let ast: Node = grammar.parse(for: text, with: "doc")
// Then do what ever you like with the AST
...
// Or your can use the simplified AST which only contains node with named rule
let simplifiedAst: SimplifiedNode = simplify(for: ast)
...
```

## API

### Grammar

`Grammar` type has the following public interfaces:

```swift
public class Grammar {
// Name dict of the parsing rules defined in the syntax
// It will be generated upon init.
// If it is empty it means there is some error in the syntax.
public var ruleDict: [String: Expression] = [:]

public init(rules: String)
// Return nil if the parsing failed
public func parse(for text: String, with ruleName: String) -> Node?
}
```

### Node

`Node` type has the following public interfaces:

```swift
public struct Node: CustomStringConvertible, Equatable {
// The parser node used to parse this node
public let expr: Expression
public var name: String { expr.name }
// The children nodes
public var children: [Node] = []
// The matched text of this Node
public var text: String
// The matched range of this Node
public let start: String.Index
public let end: String.Index

public var description: String {
toString(withName: true)
}
public func toString(withName: Bool = false) -> String

public static func ==(lhs: Node, rhs: Node) -> Bool
}
```

### SimplifiedNode

`SimplifiedNode` type has the following interfaces:

```swift
public struct SimplifiedNode: CustomStringConvertible {
public let name: String
// The children nodes
public var children: [SimplifiedNode] = []
// The matched text of this Node
public var text: String
// The matched range of this Node
public let start: String.Index
public let end: String.Index

public var description: String
}
```

### Expression

Normally you should not work with this type. If you have interest, please check `Expression.swift` for more information.

## TODO

- Support better error handling.
- Optimize the performance with memoization.