{"id":13760335,"url":"https://github.com/rm-hull/jasentaa","last_synced_at":"2025-05-10T23:02:55.192Z","repository":{"id":62434504,"uuid":"58483008","full_name":"rm-hull/jasentaa","owner":"rm-hull","description":"A parser combinator library for Clojure and ClojureScript","archived":false,"fork":false,"pushed_at":"2025-03-23T10:55:11.000Z","size":196,"stargazers_count":74,"open_issues_count":1,"forks_count":6,"subscribers_count":4,"default_branch":"main","last_synced_at":"2025-04-12T03:16:06.172Z","etag":null,"topics":["clojure","parser","parser-combinators"],"latest_commit_sha":null,"homepage":"https://www.destructuring-bind.org/jasentaa/","language":"Clojure","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/rm-hull.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2016-05-10T18:04:17.000Z","updated_at":"2025-03-23T13:34:39.000Z","dependencies_parsed_at":"2024-01-15T03:43:53.881Z","dependency_job_id":"47a0b491-492d-4360-bf68-e5e12d99f152","html_url":"https://github.com/rm-hull/jasentaa","commit_stats":{"total_commits":111,"total_committers":2,"mean_commits":55.5,"dds":0.009009009009009028,"last_synced_commit":"f52a0e75cbdf1d2b72d9604232db264ff6473f12"},"previous_names":[],"tags_count":8,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rm-hull%2Fjasentaa","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rm-hull%2Fjasentaa/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rm-hull%2Fjasentaa/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/
GitHub/repositories/rm-hull%2Fjasentaa/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/rm-hull","download_url":"https://codeload.github.com/rm-hull/jasentaa/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253492654,"owners_count":21916969,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["clojure","parser","parser-combinators"],"created_at":"2024-08-03T13:01:08.114Z","updated_at":"2025-05-10T23:02:55.178Z","avatar_url":"https://github.com/rm-hull.png","language":"Clojure","readme":"# [Jäsentää](https://translate.google.co.uk/#fi/en/j%C3%A4sent%C3%A4%C3%A4)\n\n[![Build Status](https://github.com/rm-hull/jasentaa/actions/workflows/clojure.yml/badge.svg)](https://github.com/rm-hull/jasentaa/actions/workflows/clojure.yml)\n[![Coverage Status](https://coveralls.io/repos/rm-hull/jasentaa/badge.svg?branch=main)](https://coveralls.io/r/rm-hull/jasentaa?branch=main)\n[![Downloads](https://versions.deps.co/rm-hull/jasentaa/downloads.svg)](https://versions.deps.co/rm-hull/jasentaa)\n[![Clojars Project](https://img.shields.io/clojars/v/rm-hull/jasentaa.svg)](https://clojars.org/rm-hull/jasentaa)\n[![Maintenance](https://img.shields.io/maintenance/yes/2025.svg?maxAge=2592000)]()\n\nA parser-combinator library for Clojure and ClojureScript.\n\n### Pre-requisites\n\nYou will need [Leiningen](https://github.com/technomancy/leiningen) 2.8.1 or above installed.\n\n### Building\n\nTo build and install the library locally, run:\n\n    $ cd jasentaa\n    $ lein test\n    $ lein install\n\n### Including in 
your project\n\nThere is a version hosted at [Clojars](https://clojars.org/rm-hull/jasentaa).\nFor Leiningen, include a dependency:\n\n```clojure\n[rm-hull/jasentaa \"0.2.5\"]\n```\n\nFor Maven-based projects, add the following to your `pom.xml`:\n\n```xml\n\u003cdependency\u003e\n  \u003cgroupId\u003erm-hull\u003c/groupId\u003e\n  \u003cartifactId\u003ejasentaa\u003c/artifactId\u003e\n  \u003cversion\u003e0.2.5\u003c/version\u003e\n\u003c/dependency\u003e\n```\n\n## API Documentation\n\nSee [www.destructuring-bind.org/jasentaa](http://www.destructuring-bind.org/jasentaa/) for API details.\n\n### Breaking changes between versions 0.1.x → 0.2.x\n\nThe **0.1.x** line worked on parsing a stream of characters. If the parser\nbecame exhausted, then `parse-all` would return `nil`, with no indication\nof where the parser failed.\n\nAs of **0.2.0**, although the parser still accepts a stream of characters, it\nreprocesses them into a stream of [Location](https://github.com/rm-hull/jasentaa/blob/main/src/jasentaa/position.clj#L3) values.\nIf the input cannot be fully parsed, `parse-all` now throws a [ParseException](https://docs.oracle.com/javase/8/docs/api/java/text/ParseException.html#ParseException-java.lang.String-int-),\nwhere the message gives a human-readable location of where the parse failed,\nand `ParseException#getErrorOffset` gives the zero-indexed offset to the start\nof the unparseable text.\n\nCombinators that previously operated on characters or strings now have to\nextract the text using `jasentaa.position/strip-location`, so a previous\n0.1.x code example that does:\n\n```clojure\n(def single-word\n  (m/do*\n    (w \u003c- (token (plus alpha-num)))\n    (m/return w)))\n```\n\nshould be converted to:\n\n```clojure\n(def single-word\n  (m/do*\n    (w \u003c- (token (plus alpha-num)))\n    (m/return (strip-location w))))\n```\n\n## Worked Example #1\n\nIn [Getting Started with PyParsing](http://shop.oreilly.com/product/9780596514235.do),\n**Paul McGuire** describes an 
example search string interface, with support for\nAND, OR, and NOT keyword qualifiers, and gives examples of some typical search\nphrases one might use in a search engine:\n\n    wood and blue or red\n    wood and (blue or red)\n    (steel or iron) and \"lime green\"\n    not steel or iron and \"lime green\"\n    not(steel or iron) and \"lime green\"\n\nThe article then goes on to build up Python code that returns the parsed\nresults in a hierarchical structure based on the precedence of operations among\nthe AND, OR, and NOT qualifiers, where NOT has the highest precedence and is\nevaluated first, AND has the next highest precedence, and OR has the lowest and is\nevaluated last.\n\nExpressing this in BNF, we have the following rules:\n\n* _**searchExpr** ::= searchAnd [ OR searchAnd ]..._\n\n* _**searchAnd** ::= searchTerm [ AND searchTerm ]..._\n\n* _**searchTerm** ::= \\[NOT\\] ( singleWord | quotedString | '(' searchExpr ')' )_\n\nFollowing the _PyParsing_ implementation, we can build up the\nparsers in Clojure starting with:\n\n```clojure\n(ns jasentaa.worked-example-1\n  (:require\n    [jasentaa.monad :as m]\n    [jasentaa.position :refer [strip-location]]\n    [jasentaa.parser :refer [parse-all]]\n    [jasentaa.parser.basic :refer :all]\n    [jasentaa.parser.combinators :refer :all]))\n\n(def digit (from-re #\"[0-9]\"))\n(def letter (from-re #\"[a-z]\"))\n(def alpha-num (any-of letter digit))\n```\n\nwhich just defines some basic character parsers; then, we use these to build up\nparsers for _singleWord_, _quotedString_ and bracketed expressions.\n\n```clojure\n(declare search-expr)\n\n(def single-word\n  (m/do*\n    (w \u003c- (token (plus alpha-num)))\n    (m/return (strip-location w))))\n\n(def quoted-string\n  (m/do*\n    (symb \"\\\"\")\n    (t \u003c- (plus (any-of digit letter (match \" \"))))\n    (symb \"\\\"\")\n    (m/return (strip-location t))))\n\n(def bracketed-expr\n  (m/do*\n    (symb \"(\")\n    (expr \u003c- (token search-expr))\n    (symb 
\")\")\n    (m/return expr)))\n```\n\n(Note how it is necessary to forward declare `search-expr`.)\n\nNext, a _searchTerm_ parser is composed from the three prior parsers. The\nreturned value is wrapped with a `:NOT` keyword as necessary:\n\n```clojure\n(def search-term\n  (m/do*\n    (neg \u003c- (optional (symb \"not\")))\n    (term \u003c- (any-of single-word quoted-string bracketed-expr))\n    (m/return (if (empty? neg) term (list :NOT term)))))\n```\n\nFinally, the _searchAnd_ and _searchExpr_ parsers are implemented in terms\nof the earlier definitions:\n\n```clojure\n(def search-and\n  (m/do*\n    (lst \u003c- (separated-by search-term (symb \"and\")))\n    (m/return (if (= (count lst) 1)\n                (first lst)\n                (cons :AND lst)))))\n\n(def search-expr\n  (m/do*\n    (lst \u003c- (separated-by search-and (symb \"or\")))\n    (m/return (if (= (count lst) 1)\n                (first lst)\n                (cons :OR lst)))))\n```\n\nNotice how the returned values are (purposely) constructed in prefix notation,\nwhereas the _Getting Started with PyParsing_ examples are returned infix.\nPrefix notation is (obviously) more LISPy, and besides being consistent with\nthe host language, it makes the resulting arborescent (tree-like) structures\nsimpler to handle.\n\nTesting the parsers for the given examples:\n\n```clojure\n(parse-all search-expr \"wood and blue or red\")\n; =\u003e (:OR (:AND \"wood\" \"blue\") \"red\")\n\n(parse-all search-expr \"wood and (blue or red)\")\n; =\u003e (:AND \"wood\" (:OR \"blue\" \"red\"))\n\n(parse-all search-expr \"(steel or iron) and \\\"lime green\\\"\")\n; =\u003e (:AND (:OR \"steel\" \"iron\") \"lime green\")\n\n(parse-all search-expr \"not steel or iron and \\\"lime green\\\"\")\n; =\u003e (:OR (:NOT \"steel\") (:AND \"iron\" \"lime green\"))\n\n(parse-all search-expr \"not(steel or iron) and \\\"lime green\\\"\")\n; =\u003e (:AND (:NOT (:OR \"steel\" \"iron\")) \"lime green\")\n```\n\nThis example is 
encapsulated as a [test](https://github.com/rm-hull/jasentaa/blob/main/test/jasentaa/worked_example_1.clj).\n\n## Worked Example #2\n\nThe previous example yielded a data structure corresponding to the\nparsed input. There is no reason why the result cannot be evaluated as part of the\nparsing process. **Graham Hutton** and **Erik Meijer** presented a simple integer\ncalculator in [Functional Pearls: _Monadic Parsing in Haskell_](http://www.cs.uwyo.edu/~jlc/courses/3015/parser_pearl.pdf)\nwhich does exactly this.\n\nConsider a standard grammar for arithmetic expressions built up from single digits\nusing the operators +, -, * and /, together with parentheses:\n\n* _**expr** ::= expr addop term | term_\n\n* _**term** ::= term mulop factor | factor_\n\n* _**factor** ::= digit | ( expr )_\n\n* _**digit** ::= 0 | 1 | ... | 9_\n\n* _**addop** ::= + | -_\n\n* _**mulop** ::= * | /_\n\nAs per the _Haskell_ implementation, we need to forward declare\nthe _expr_ parser:\n\n```clojure\n(ns jasentaa.worked-example-2\n  (:require\n    [jasentaa.monad :as m]\n    [jasentaa.position :refer :all]\n    [jasentaa.parser :as p]\n    [jasentaa.parser.basic :refer :all]\n    [jasentaa.parser.combinators :refer :all]))\n\n(declare expr)\n```\n\nThe _digit_ parser follows the exact same implementation as the Haskell\nexample: a check is made to see if the current input satisfies the `digit?`\npredicate, and the returned value is calculated from the ordinal value of the\ncharacter minus zero's ordinal.\n\n```clojure\n(defn- digit? 
[^Character c]\n  (Character/isDigit c))\n\n(def digit\n  (m/do*\n    (x \u003c- (token (sat digit?)))\n    (m/return (- (byte (strip-location x)) (byte \\0)))))\n```\n\n_factor_ is either a single digit or a bracketed expression:\n\n```clojure\n(def factor\n  (choice\n    digit\n    (m/do*\n      (symb \"(\")\n      (n \u003c- (fwd expr))\n      (symb \")\")\n      (m/return n))))\n```\n\n_addop_ and _mulop_ each yield the core function matching the parsed operator\n(+ or -, and * or /, respectively). _term_ and _expr_ are then simple `chain-left` applications\nas per the declared grammar:\n\n```clojure\n(def addop\n  (choice\n    (m/do*\n      (symb \"+\")\n      (m/return +))\n    (m/do*\n      (symb \"-\")\n      (m/return -))))\n\n(def mulop\n  (choice\n    (m/do*\n      (symb \"*\")\n      (m/return *))\n    (m/do*\n      (symb \"/\")\n      (m/return /))))\n\n(def term\n  (chain-left factor mulop))\n\n(def expr\n  (chain-left term addop))\n```\n\nTesting the example expression yields the expected result:\n\n```clojure\n(take 1 (p/apply expr \" 1 - 2 * 3 + 4 \"))\n; =\u003e ([-1, ()])\n; i.e. (+ 4 (- 1 (* 2 3)))\n```\n\n`chain-left` associates from the left, so this expression evaluates as _((1 - (2 * 3)) + 4)_.\n`chain-right` associates from the right, so substituting that would evaluate as _(1 - ((2 * 3) + 4))_,\nresulting in -9. Clearly, in both cases, multiplication binds before addition.\n\n```clojure\n(def term'\n  (chain-right factor mulop))\n\n(def expr'\n  (chain-right term addop))\n\n(take 1 (p/apply expr' \" 1 - 2 * 3 + 4 \"))\n; =\u003e ([-9, ()])\n; i.e. (- 1 (+ 4 (* 2 3)))\n```\n\n_I can't immediately think of a scenario where `chain-right` would be used\nover `chain-left` - postfix notation perhaps? 
- but other than that..._\n\nThis example is also encapsulated as another [test](https://github.com/rm-hull/jasentaa/blob/main/test/jasentaa/worked_example_2.clj).\n\n## Further examples \u0026 implementations\n\n* [ODS Search Appliance](https://github.com/rm-hull/ods-search-appliance) uses\n  a similar EBNF grammar for search phrases to the above example. However\n  rather than returning a data structure, the parsed result is a composed\n  function that takes a trigram inverted-index, and returns a list of matching\n  document IDs.\n\n* [Warren's Abstract Machine](https://github.com/rm-hull/wam) is an\n  _\"in-progress\"_ Prolog implementation which uses parser combinators to read\n  Prolog programs (questions, facts and rules) before compiling into\n  virtual machine instructions.\n\n* [Infix](https://github.com/rm-hull/infix) is a Clojure library that allows\n  infix math expressions to be read from a string, and 'compiled' into a\n  function definition.\n\n## Attribution\n\nSubstantial portions based on:\n* https://gist.github.com/kachayev/b5887f66e2985a21a466\n* https://pyparsing.wikispaces.com/\n\n## References\n\n* http://www.cs.uwyo.edu/~jlc/courses/3015/parser_pearl.pdf\n* http://www.haskellforall.com/2012/10/parsing-chemical-substructures.html\n* https://speakerdeck.com/kachayev/monadic-parsing-in-python\n\n## License\n\n### The MIT License (MIT)\n\nCopyright (c) 2016-18 Richard Hull\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the \"Software\"), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE 
SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n","funding_links":[],"categories":["Clojure"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frm-hull%2Fjasentaa","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frm-hull%2Fjasentaa","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frm-hull%2Fjasentaa/lists"}