{"id":17796362,"url":"https://github.com/d-plaindoux/transept","last_synced_at":"2025-03-17T02:31:15.083Z","repository":{"id":79520863,"uuid":"232749567","full_name":"d-plaindoux/transept","owner":"d-plaindoux","description":"An OCaml  modular and generalised parser combinator library.","archived":false,"fork":false,"pushed_at":"2021-08-15T07:44:42.000Z","size":126,"stargazers_count":21,"open_issues_count":3,"forks_count":2,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-03-11T16:11:57.599Z","etag":null,"topics":["modular","monad","ocaml-library","parser-combinators"],"latest_commit_sha":null,"homepage":"","language":"OCaml","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/d-plaindoux.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGES","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-01-09T07:32:45.000Z","updated_at":"2022-07-28T01:57:44.000Z","dependencies_parsed_at":"2023-03-12T08:25:02.050Z","dependency_job_id":null,"html_url":"https://github.com/d-plaindoux/transept","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/d-plaindoux%2Ftransept","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/d-plaindoux%2Ftransept/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/d-plaindoux%2Ftransept/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/d-plaindoux%2Ftransept/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/d-plaindo
ux","download_url":"https://codeload.github.com/d-plaindoux/transept/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":243841211,"owners_count":20356443,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["modular","monad","ocaml-library","parser-combinators"],"created_at":"2024-10-27T11:45:09.272Z","updated_at":"2025-03-17T02:31:15.077Z","avatar_url":"https://github.com/d-plaindoux.png","language":"OCaml","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Transept\n\n![Transept](https://github.com/d-plaindoux/transept/workflows/Transept/badge.svg)\n\nAn OCaml modular and generalised parser combinator library.\n\n# Installation\n\nInstall the library and its dependencies via [OPAM](https://opam.ocaml.org/packages/transept/transept.0.1.0/):\n\n```\nopam install transept\n```\n\nor in your `project-name.opam` dependencies:\n\n```\n...\ndepends: [\n  \"transept\" { \u003e= \"0.1.0\" }\n  ...\n]  \n...\n```\n\n# Examples\n\n## Parsing arithmetic expressions\n\n### ADTs definition\n\nThis example is the traditional arithmetic expression language, which can be represented by the following abstract data \ntypes.\nIn this first example, we only care about significant items: `float` literals, parentheses, and operations. \n\n```ocaml\ntype operation =\n  | Add\n  | Minus\n  | Mult\n  | Div\n\ntype expr =\n  | Number of float\n  | BinOp of operation * expr * expr\n```\n\n### Parsers with a direct style \n\nDirect style means we parse a stream of characters. 
In this case, all characters are significant, even spaces. \n\n#### Required modules\n\n`Transept` provides modules that help with parser construction. In the next fragment, `Utils` contains basic functions \nlike `constant`. The `CharParser` module is a parser dedicated to char stream analysis and `Literals` is dedicated to parsing \nstrings, floats, etc. \n\n```ocaml\nmodule Utils = Transept.Utils\nmodule CharParser = Transept.Extension.Parser.For_char_list\nmodule Literals = Transept.Extension.Literals.Make (CharParser)\n```\n\n#### Operation parser\n\nWith these modules, we can write a first parser dedicated to operations. \n\n```ocaml\nlet operator = \n    let open Utils in\n    let open CharParser in\n        (atom '+' \u003c$\u003e constant Add) \n    \u003c|\u003e (atom '-' \u003c$\u003e constant Minus)\n    \u003c|\u003e (atom '*' \u003c$\u003e constant Mult)\n    \u003c|\u003e (atom '/' \u003c$\u003e constant Div)\n```\n\n#### Expression parser\n\nThe simple expression and the full expression can then be defined by the following parsers.\n     \n```ocaml\nlet expr = \n    (* sexpr ::= float | '(' expr ')' *)\n    let rec sexpr () =\n      let open Literals in\n      let open CharParser in\n      float \u003c$\u003e (fun f -\u003e Number f) \u003c|\u003e (atom '(' \u0026\u003e do_lazy expr \u003c\u0026 atom ')')\n    \n    (* expr ::= sexpr (operator expr)? *)\n    and expr () =\n      let open CharParser in\n      do_lazy sexpr \u003c\u0026\u003e opt (operator \u003c\u0026\u003e do_lazy expr) \u003c$\u003e function\n      | e1, None -\u003e e1\n      | e1, Some (op, e2) -\u003e BinOp (op, e1, e2)\n    \n    in expr\n```\n\nFinally, a sentence can be easily parsed.\n\n```ocaml\nlet parse s =\n    let open Utils in\n    let open CharParser in\n    parse (expr ()) @@ Stream.build @@ chars_of_string s\n```\n\nWith this solution, whitespace is not skipped: `1+(2+3)` is parsed, but `1 + (2 + 3)` is not!  
\n\n### The indirect style\n\nSince `Transept` is a generalized library, it can parse something other than characters. For this purpose, a \ngeneric lexer is provided by the `Genlex` module. \n\n#### Required modules\n\n`Transept` provides modules that help with parser construction. In the next fragment, `Utils` contains basic functions \nlike `constant`. The `CharParser` module is a parser dedicated to char stream analysis and `Stream` is dedicated to \nbuilding a stream through another parser.\n\n```ocaml\nmodule Utils = Transept.Utils.Fun\nmodule CharParser = Transept.Extension.Parser.For_char_list\nmodule Stream = Transept.Stream.Via_parser (CharParser)\nmodule Genlex = Transept.Genlex.Lexer.Make (CharParser)\n```\n\n#### Main parser\n\n```ocaml\nmodule Parser =\n  Transept.Core.Parser.Make_via_stream\n    (Stream)\n    (struct\n      type t = Transept.Genlex.Lexeme.t\n    end)\n\nmodule Token = Transept.Genlex.Lexeme.Make (Parser) \n```\n\n#### Operation parser\n\nWith these modules, we can write a first parser dedicated to operations. \n\n```ocaml\nlet operator = \n    let open Utils in\n    let open Parser in\n    let open Token in\n        (kwd \"+\" \u003c$\u003e constant Add)  \n    \u003c|\u003e (kwd \"-\" \u003c$\u003e constant Minus)\n    \u003c|\u003e (kwd \"*\" \u003c$\u003e constant Mult)\n    \u003c|\u003e (kwd \"/\" \u003c$\u003e constant Div)\n```\n\n#### Expression parser\n\nThe simple expression and the full expression can then be defined by the following parsers.\n     \n```ocaml\nlet expr = \n    (* sexpr ::= float | '(' expr ')' *)\n    let rec sexpr () =\n      let open Parser in\n      let open Token in\n      float \u003c$\u003e (fun f -\u003e Number f) \u003c|\u003e (kwd \"(\" \u0026\u003e do_lazy expr \u003c\u0026 kwd \")\")\n    \n    (* expr ::= sexpr (operator expr)? 
*)\n    and expr () =\n      let open Parser in\n      do_lazy sexpr \u003c\u0026\u003e opt (operator \u003c\u0026\u003e do_lazy expr) \u003c$\u003e function\n      | e1, None -\u003e e1\n      | e1, Some (op, e2) -\u003e BinOp (op, e1, e2)\n    \n    in expr ()\n```\n\nFinally, a sentence can be parsed using two parsers. The first one, `CharParser`, parses the char stream and is used by `Genlex` to create a stream\nof lexemes. The second one, `Parser`, parses that lexeme stream.\n\n```ocaml\nlet parse s =\n    let open Utils in\n    let open Parser in \n    let parser = CharParser.Stream.build @@ Utils.chars_of_string s in\n    let stream = Stream.build Genlex.tokenizer parser in\n    parse (expr \u003c\u0026 eos) stream\n```\n\nWith this solution, whitespace is skipped by the generic lexer, so `1 + ( 2+ 3)` is now parsed correctly.  \n\n# Indirect style applied to JSON parsing\n\nA [JSON Parser](https://github.com/d-plaindoux/transept/blob/master/lib/transept_json/parser.ml) has been designed with this approach: a low-level parser produces tokens and a high-level parser produces JSON terms from those tokens.\n\n# LICENSE \n\nMIT License\n\nCopyright (c) 2020 Didier Plaindoux\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the \"Software\"), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND 
NONINFRINGEMENT. IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fd-plaindoux%2Ftransept","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fd-plaindoux%2Ftransept","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fd-plaindoux%2Ftransept/lists"}