https://github.com/strojure/parsesso

# parsesso

[Parser combinators](https://en.wikipedia.org/wiki/Parser_combinator) for
Clojure(Script).

[![Clojars Project](https://img.shields.io/clojars/v/com.github.strojure/parsesso.svg)](https://clojars.org/com.github.strojure/parsesso)
![ClojarsDownloads](https://img.shields.io/clojars/dt/com.github.strojure/parsesso)

[![cljdoc badge](https://cljdoc.org/badge/com.github.strojure/parsesso)](https://cljdoc.org/d/com.github.strojure/parsesso)
[![cljs compatible](https://img.shields.io/badge/cljs-compatible-green)](https://clojurescript.org/)
[![bb compatible](https://raw.githubusercontent.com/babashka/babashka/master/logo/badge.svg)](https://babashka.org)
[![tests](https://github.com/strojure/parsesso/actions/workflows/tests.yml/badge.svg)](https://github.com/strojure/parsesso/actions/workflows/tests.yml)

## Motivation

* Idiomatic and convenient API for parser combinators in Clojure and
ClojureScript.

## Inspiration

* [haskell/parsec](https://github.com/haskell/parsec)
* [blancas/kern](https://github.com/blancas/kern)
* [youngnh/parsatron](https://github.com/youngnh/parsatron)
* [rm-hull/jasentaa](https://github.com/rm-hull/jasentaa)

## Documentation

There is no comprehensive documentation for `parsesso` yet, but the following
resources introduce the idea of parser combinators in Clojure:

- [Kern documentation wiki](https://github.com/blancas/kern/wiki).

## Cheat sheet

| Parsesso | Parsec[1],[2],[3] | Kern[4] | Parsatron[5] |
|---------------------------------------|---------------------------------|-------------------------|-------------------------|
| [p/do-parser] | | `fwd` | `defparser` |
| [p/result] | `return` | `return` | `always` |
| [p/fail] | `fail` | `fail` | `never` |
| [p/fail-unexpected] | `unexpected` | `unexpected` | |
| [p/expecting] | `<?>`, `label` | `<?>`, `expect` | |
| [p/bind] | `>>=` | `>>=` | `bind` |
| [p/for] | `do` | `bind` | `let->>` |
| [p/after] | `>>` | `>>` | `>>`, `nxt` |
| [p/value] | `fmap` | `<$>` | |
| [p/maybe] | `try` | `<:>` | `attempt` |
| [p/look-ahead] | `lookAhead` | `look-ahead` | `lookahead` |
| [p/not-followed-by] | `notFollowedBy` | `not-followed-by` | |
| [p/*many] | `many` | `many` | `many` |
| [p/+many] | `many1` | `many1` | `many1` |
| [p/*skip] | `skipMany` | `skip-many` | |
| [p/+skip] | `skipMany1` | `skip-many1` | |
| [p/token] | `token`, `satisfy` | `satisfy` | `token` |
| [p/token-not] | | | |
| [p/word] | `tokens`, `string` | `token*` | `string` |
| [p/any-token] | `anyToken`,`anyChar` | `any-char` | `any-char` |
| [p/eof] | `eof` | `eof` | `eof` |
| [p/group] | `<*>` | `<*>` | |
| [p/alt] | `<\|>`, `choice` | `<\|>` | `choice` |
| [p/option] | `option`, `optional` | `option`, `optional` | |
| [p/between] | `between` | `between` | `between` |
| [p/times] | `count` | `times` | `times` |
| [p/*many-till] | `manyTill` | `many-till` | |
| [p/*sep-by] | `sepBy` | `sep-by` | |
| [p/+sep-by] | `sepBy1` | `sep-by1` | |
| [p/*sep-end-by] | `endBy` | `end-by` | |
| [p/+sep-end-by] | `endBy1` | `end-by1` | |
| [p/*sep-opt-by] | `sepEndBy` | `sep-end-by` | |
| [p/+sep-opt-by] | `sepEndBy1` | `sep-end-by1` | |
| [p/get-state] | `getParserState`... | input, pos, user state | |
| [p/set-state] | `setParserState`... | input, pos, user state | |
| [p/update-state] | `updateParserState`... | user state | |
| [p/trace] | `parserTrace`, `parserTraced` | | |
| [expr/*chain-left] | `chainl` | `chainl` | |
| [expr/+chain-left] | `chainl1` | `chainl1` | |
| [expr/*chain-right] | `chainr` | `chainr` | |
| [expr/+chain-right] | `chainr1` | `chainr1` | |
| [char/is] | `char`, `oneOf` | `sym*`, `one-of*` | `char` |
| [char/is-not] | `noneOf` | `none-of*` | |
| [char/regex] | | | |
| [char/upper?] | `upper` | `upper` (unicode) | |
| [char/lower?] | `lower` | `lower` (unicode) | |
| [char/letter?] | `letter` | `letter` (unicode) | `letter` (unicode) |
| [char/number?] | `digit` | `digit` (unicode) | `digit` (unicode) |
| [char/letter-or-number?] | `alphaNum` | `alpha-num` (unicode) | |
| [char/white?] | `space` | `white-space` (unicode) | |
| [char/newline] | `endOfLine` | `new-line*` | |
| [char/str*] | | `<+>` | |

[1]: https://github.com/haskell/parsec/blob/master/src/Text/Parsec/Prim.hs

[2]: https://github.com/haskell/parsec/blob/master/src/Text/Parsec/Combinator.hs

[3]: https://github.com/haskell/parsec/blob/master/src/Text/Parsec/Char.hs

[4]: https://github.com/blancas/kern/blob/master/src/main/clojure/blancas/kern/core.clj

[5]: https://github.com/youngnh/parsatron/blob/master/src/clj/the/parsatron.clj

[p/do-parser]: https://cljdoc.org/d/com.github.strojure/parsesso/CURRENT/api/strojure.parsesso.parser#do-parser

[p/result]: https://cljdoc.org/d/com.github.strojure/parsesso/CURRENT/api/strojure.parsesso.parser#result

[p/fail]: https://cljdoc.org/d/com.github.strojure/parsesso/CURRENT/api/strojure.parsesso.parser#fail

[p/fail-unexpected]: https://cljdoc.org/d/com.github.strojure/parsesso/CURRENT/api/strojure.parsesso.parser#fail-unexpected

[p/expecting]: https://cljdoc.org/d/com.github.strojure/parsesso/CURRENT/api/strojure.parsesso.parser#expecting

[p/bind]: https://cljdoc.org/d/com.github.strojure/parsesso/CURRENT/api/strojure.parsesso.parser#bind

[p/for]: https://cljdoc.org/d/com.github.strojure/parsesso/CURRENT/api/strojure.parsesso.parser#for

[p/after]: https://cljdoc.org/d/com.github.strojure/parsesso/CURRENT/api/strojure.parsesso.parser#after

[p/value]: https://cljdoc.org/d/com.github.strojure/parsesso/CURRENT/api/strojure.parsesso.parser#value

[p/maybe]: https://cljdoc.org/d/com.github.strojure/parsesso/CURRENT/api/strojure.parsesso.parser#maybe

[p/look-ahead]: https://cljdoc.org/d/com.github.strojure/parsesso/CURRENT/api/strojure.parsesso.parser#look-ahead

[p/not-followed-by]: https://cljdoc.org/d/com.github.strojure/parsesso/CURRENT/api/strojure.parsesso.parser#not-followed-by

[p/*many]: https://cljdoc.org/d/com.github.strojure/parsesso/CURRENT/api/strojure.parsesso.parser#*many

[p/+many]: https://cljdoc.org/d/com.github.strojure/parsesso/CURRENT/api/strojure.parsesso.parser#+many

[p/*skip]: https://cljdoc.org/d/com.github.strojure/parsesso/CURRENT/api/strojure.parsesso.parser#*skip

[p/+skip]: https://cljdoc.org/d/com.github.strojure/parsesso/CURRENT/api/strojure.parsesso.parser#+skip

[p/token]: https://cljdoc.org/d/com.github.strojure/parsesso/CURRENT/api/strojure.parsesso.parser#token

[p/token-not]: https://cljdoc.org/d/com.github.strojure/parsesso/CURRENT/api/strojure.parsesso.parser#token-not

[p/word]: https://cljdoc.org/d/com.github.strojure/parsesso/CURRENT/api/strojure.parsesso.parser#word

[p/any-token]: https://cljdoc.org/d/com.github.strojure/parsesso/CURRENT/api/strojure.parsesso.parser#any-token

[p/eof]: https://cljdoc.org/d/com.github.strojure/parsesso/CURRENT/api/strojure.parsesso.parser#eof

[p/group]: https://cljdoc.org/d/com.github.strojure/parsesso/CURRENT/api/strojure.parsesso.parser#group

[p/alt]: https://cljdoc.org/d/com.github.strojure/parsesso/CURRENT/api/strojure.parsesso.parser#alt

[p/option]: https://cljdoc.org/d/com.github.strojure/parsesso/CURRENT/api/strojure.parsesso.parser#option

[p/between]: https://cljdoc.org/d/com.github.strojure/parsesso/CURRENT/api/strojure.parsesso.parser#between

[p/times]: https://cljdoc.org/d/com.github.strojure/parsesso/CURRENT/api/strojure.parsesso.parser#times

[p/*many-till]: https://cljdoc.org/d/com.github.strojure/parsesso/CURRENT/api/strojure.parsesso.parser#*many-till

[p/*sep-by]: https://cljdoc.org/d/com.github.strojure/parsesso/CURRENT/api/strojure.parsesso.parser#*sep-by

[p/+sep-by]: https://cljdoc.org/d/com.github.strojure/parsesso/CURRENT/api/strojure.parsesso.parser#+sep-by

[p/*sep-end-by]: https://cljdoc.org/d/com.github.strojure/parsesso/CURRENT/api/strojure.parsesso.parser#*sep-end-by

[p/+sep-end-by]: https://cljdoc.org/d/com.github.strojure/parsesso/CURRENT/api/strojure.parsesso.parser#+sep-end-by

[p/*sep-opt-by]: https://cljdoc.org/d/com.github.strojure/parsesso/CURRENT/api/strojure.parsesso.parser#*sep-opt-by

[p/+sep-opt-by]: https://cljdoc.org/d/com.github.strojure/parsesso/CURRENT/api/strojure.parsesso.parser#+sep-opt-by

[p/get-state]: https://cljdoc.org/d/com.github.strojure/parsesso/CURRENT/api/strojure.parsesso.parser#get-state

[p/set-state]: https://cljdoc.org/d/com.github.strojure/parsesso/CURRENT/api/strojure.parsesso.parser#set-state

[p/update-state]: https://cljdoc.org/d/com.github.strojure/parsesso/CURRENT/api/strojure.parsesso.parser#update-state

[p/trace]: https://cljdoc.org/d/com.github.strojure/parsesso/CURRENT/api/strojure.parsesso.parser#trace

[expr/*chain-left]: https://cljdoc.org/d/com.github.strojure/parsesso/CURRENT/api/strojure.parsesso.expr#*chain-left

[expr/+chain-left]: https://cljdoc.org/d/com.github.strojure/parsesso/CURRENT/api/strojure.parsesso.expr#+chain-left

[expr/*chain-right]: https://cljdoc.org/d/com.github.strojure/parsesso/CURRENT/api/strojure.parsesso.expr#*chain-right

[expr/+chain-right]: https://cljdoc.org/d/com.github.strojure/parsesso/CURRENT/api/strojure.parsesso.expr#+chain-right

[char/is]: https://cljdoc.org/d/com.github.strojure/parsesso/CURRENT/api/strojure.parsesso.char#is

[char/is-not]: https://cljdoc.org/d/com.github.strojure/parsesso/CURRENT/api/strojure.parsesso.char#is-not

[char/regex]: https://cljdoc.org/d/com.github.strojure/parsesso/CURRENT/api/strojure.parsesso.char#regex

[char/upper?]: https://cljdoc.org/d/com.github.strojure/parsesso/CURRENT/api/strojure.parsesso.char#upper?

[char/lower?]: https://cljdoc.org/d/com.github.strojure/parsesso/CURRENT/api/strojure.parsesso.char#lower?

[char/letter?]: https://cljdoc.org/d/com.github.strojure/parsesso/CURRENT/api/strojure.parsesso.char#letter?

[char/number?]: https://cljdoc.org/d/com.github.strojure/parsesso/CURRENT/api/strojure.parsesso.char#number?

[char/letter-or-number?]: https://cljdoc.org/d/com.github.strojure/parsesso/CURRENT/api/strojure.parsesso.char#letter-or-number?

[char/white?]: https://cljdoc.org/d/com.github.strojure/parsesso/CURRENT/api/strojure.parsesso.char#white?

[char/newline]: https://cljdoc.org/d/com.github.strojure/parsesso/CURRENT/api/strojure.parsesso.char#newline

[char/str*]: https://cljdoc.org/d/com.github.strojure/parsesso/CURRENT/api/strojure.parsesso.char#str*

## Examples

* [HoneySQL SELECT](doc/demo/honeysql_select.clj)

## Performance

See some benchmarks [here](doc/benchmarks/compare.clj).

## FAQ

**What are parser combinators, and what are they good for? How do they differ
from, e.g., Instaparse, which also parses text into data?**

A parser combinator library is a library with functions that can be composed
into a parser. Instaparse takes a grammar specification, but in a parser
combinator library you build the specification from functions, rather than a
DSL.
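
To make "functions that can be composed into a parser" concrete, here is a minimal hand-rolled sketch. It is not how `parsesso` is implemented and uses none of its API: the names `token`, `many`, `fmap` and `in-sequence` are hypothetical helpers defined only for this illustration. It assumes a parser is a plain function that takes the input and returns `[value remaining-input]` on success or `nil` on failure.

```clojure
;; Illustration only: these helpers are hypothetical, not part of parsesso.
;; A parser is a function: input -> [value remaining-input], or nil on failure.

(defn token
  "Parser that consumes one input item satisfying `pred`."
  [pred]
  (fn [input]
    (when-let [[x & more] (seq input)]
      (when (pred x)
        [x more]))))

(defn many
  "Parser that applies `p` zero or more times and collects the values."
  [p]
  (fn [input]
    (loop [acc [] in input]
      (if-let [[v more] (p in)]
        (recur (conj acc v) more)
        [acc in]))))

(defn fmap
  "Parser that runs `p` and applies `f` to its value."
  [f p]
  (fn [input]
    (when-let [[v more] (p input)]
      [(f v) more])))

(defn in-sequence
  "Parser that runs each parser in order and returns a vector of their values."
  [& parsers]
  (fn [input]
    (loop [acc [] in input ps parsers]
      (if-let [p (first ps)]
        (when-let [[v more] (p in)]
          (recur (conj acc v) more (rest ps)))
        [acc in]))))

;; Small parsers are ordinary values...
(def letters (fmap #(apply str %) (many (token (set "abcdefghijklmnopqrstuvwxyz")))))
(def digits  (fmap #(apply str %) (many (token (set "0123456789")))))

;; ...and the "grammar" is just function composition.
(def assignment (in-sequence letters (token #{\=}) digits))

(assignment "x=42") ; on JVM Clojure => [["x" \= "42"] nil]
(assignment "x42")  ; => nil, because \= is missing
```

A real library adds error messages, position tracking and backtracking control on top of this idea, which is what the combinators in the cheat sheet above provide.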

**When should I pick parser combinators over EBNF? Do they offer the same
capabilities, so that it is only a question of which one I prefer to learn, or
is there a distinct advantage over a DSL such as EBNF? Perhaps it is easier to
describe more complex grammars because I can write my own helper functions?**

In general, parser combinator libraries such as `parsesso` are for building
top-down (i.e. LL) parsers, with the ability to reuse common code (this
library). Parser generators typically produce a finite-state automaton for a
bottom-up (LR) parser, although nowadays there are also combinators for LR
grammars and generators for LL ones (e.g. ANTLR). Which one you should use
depends on how hard your grammar is and how fast the parser needs to be. If the
grammar has a lot of non-trivial ambiguities, the more flexible combinator
approach may be easier.
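
The top-down style and the reuse of helper functions can be sketched with a small recursive grammar. This is again a hand-rolled illustration under the same assumption (a parser is a function returning `[value remaining-input]` or `nil`). The helpers `token`, `many`, `alt` and `between` are defined here for the example and only loosely mirror the similarly named entries in the cheat sheet. They are not `parsesso`'s implementation.

```clojure
;; Illustration only: hypothetical helpers, not parsesso's API.
;; A parser is a function: input -> [value remaining-input], or nil on failure.

(defn token
  "Parser that consumes one input item satisfying `pred`."
  [pred]
  (fn [input]
    (when-let [[x & more] (seq input)]
      (when (pred x) [x more]))))

(defn many
  "Parser that applies `p` zero or more times and collects the values."
  [p]
  (fn [input]
    (loop [acc [] in input]
      (if-let [[v more] (p in)]
        (recur (conj acc v) more)
        [acc in]))))

(defn alt
  "Tries `p`; if it fails, tries `q` on the same input."
  [p q]
  (fn [input] (or (p input) (q input))))

(defn between
  "Reusable helper: parses `open`, `p`, `close` and keeps only the value of `p`."
  [open p close]
  (fn [input]
    (when-let [[_ in1] (open input)]
      (when-let [[v in2] (p in1)]
        (when-let [[_ in3] (close in2)]
          [v in3])))))

;; Grammar: value := digit | "[" value* "]"
(declare value)

(def digit (token (set "0123456789")))

(def nested
  (between (token #{\[})
           (many (fn [input] (value input))) ; look up `value` lazily so recursion works
           (token #{\]})))

(def value (alt digit nested))

(value "[1[23]4]") ; on JVM Clojure => [[\1 [\2 \3] \4] nil]
(value "[1")       ; => nil, the closing bracket is missing
```

Each grammar rule maps directly to one definition, and `between` is the kind of helper that a grammar DSL would have to provide as a built-in.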

## Contributors

- [Michiel Borkent](https://github.com/borkdude)
  + Compatibility with babashka.
  + GitHub CI configuration.
  + Clj-kondo configuration tips.
- [Jakub Holý](https://github.com/holyjak)
  + Questions and answers in FAQ.