{"id":13415952,"url":"https://github.com/pyrocat101/opal","last_synced_at":"2025-03-17T07:30:50.527Z","repository":{"id":25934500,"uuid":"29375876","full_name":"pyrocat101/opal","owner":"pyrocat101","description":"Self-contained monadic parser combinators for OCaml","archived":false,"fork":false,"pushed_at":"2023-07-24T22:23:49.000Z","size":93,"stargazers_count":147,"open_issues_count":3,"forks_count":14,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-02-27T19:34:57.175Z","etag":null,"topics":["parser-combinators","parser-monad"],"latest_commit_sha":null,"homepage":null,"language":"OCaml","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/pyrocat101.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2015-01-17T01:36:47.000Z","updated_at":"2025-02-11T15:35:53.000Z","dependencies_parsed_at":"2024-05-02T15:21:17.613Z","dependency_job_id":null,"html_url":"https://github.com/pyrocat101/opal","commit_stats":null,"previous_names":[],"tags_count":2,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pyrocat101%2Fopal","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pyrocat101%2Fopal/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pyrocat101%2Fopal/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pyrocat101%2Fopal/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/pyrocat101","download_url":"https://codeload.github.com/pyrocat101/opal/tar.g
z/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":243848088,"owners_count":20357489,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["parser-combinators","parser-monad"],"created_at":"2024-07-30T21:00:53.267Z","updated_at":"2025-03-17T07:30:50.217Z","avatar_url":"https://github.com/pyrocat101.png","language":"OCaml","readme":"# Opal: Monadic Parser Combinators for OCaml\n\nOpal is a minimal collection of useful parsers and combinators (~150 loc of\nOCaml) that makes writing parsers easier. It is designed to be small,\nself-contained, and purely functional, and it includes only the most essential\nparsers, so that one can include a single file in a project or just embed it in\nother OCaml source files.\n\nI found myself writing lots of recursive-descent parsers from scratch in OCaml\nwhen I was solving Hackerrank FP challenges. 
That's why I wrote opal: to include\nit in the source code and build parsers on top of parser combinators easily.\n\n## Example\n\nTrivial arithmetic calculator:\n\n~~~ocaml\nopen Opal\n\nlet parens = between (exactly '(') (exactly ')')\nlet integer = many1 digit =\u003e implode % int_of_string\nlet add = exactly '+' \u003e\u003e return ( + )\nlet sub = exactly '-' \u003e\u003e return ( - )\nlet mul = exactly '*' \u003e\u003e return ( * )\nlet div = exactly '/' \u003e\u003e return ( / )\n\nlet rec expr input = chainl1 term (add \u003c|\u003e sub) input\nand term input = chainl1 factor (mul \u003c|\u003e div) input\nand factor input = (parens expr \u003c|\u003e integer) input\n\nlet () =\n  let input = LazyStream.of_channel stdin in\n  match parse expr input with\n  | Some ans -\u003e Printf.printf \"%d\\n\" ans\n  | None -\u003e print_endline \"ERROR!\"\n~~~\n\nFor non-trivial examples, see Hackerrank challenge solutions using opal in\n`examples/`.\n\n## Documentation\n\nThe expressiveness of parser combinators is attributed to higher-order\nfunctions and the extensive use of currying. However, due to the lack of `do`\nsyntax, the bind operation of the monad is not as succinct as in Haskell.\n\nA parser monad is either `None` (indicating failure) or `Some` pair of a result\nand the unconsumed input, where the result is a user-defined value. The input is a\nlazy stream of arbitrary token type. A parser is a function that accepts an\ninput and returns a parser monad. Although most parsers in opal are polymorphic\nover token type and result type, some useful parsers only accept `char` as\ntheir input token type.\n\nCombinators in opal are roughly based on Haskell's Parsec. 
The following\ndocumentation is somewhat of a rip-off of Parsec's documentation.\n\n### Lazy Stream\n\n**`type 'a LazyStream.t`**\n\nPolymorphic lazy stream type.\n\n**`val LazyStream.of_stream : 'a Stream.t -\u003e 'a LazyStream.t`**\n\nBuild a lazy stream from a stream.\n\n**`val LazyStream.of_function : (unit -\u003e 'a) -\u003e 'a LazyStream.t`**\n\nBuild a lazy stream from a function `f`. The elements in the stream are populated\nby calling `f ()`.\n\n**`val LazyStream.of_string : string -\u003e char LazyStream.t`**\n\nBuild a char lazy stream from a string.\n\n**`val LazyStream.of_channel : in_channel -\u003e char LazyStream.t`**\n\nBuild a char lazy stream from an input channel.\n\n### Utilities\n\n**`val implode : char list -\u003e bytes`**\n\nImplode a character list into a string. Useful when used with `many`.\n\n**`val explode : bytes -\u003e char list`**\n\nExplode a string into a character list.\n\n**`val ( % ) : ('a -\u003e 'b) -\u003e ('b -\u003e 'c) -\u003e 'a -\u003e 'c`**\n\nInfix operator for left-to-right function composition. 
`(f % g % h) x` is\nequivalent to `h (g (f x))`.\n\n**`val parse : ('token, 'result) parser -\u003e 'token LazyStream.t -\u003e 'result option`**\n\n`parse parser input` parses `input` with `parser`, and returns `Some result` if\nit succeeds, or `None` on failure.\n\n### Primitives\n\n**`type 'token input = 'token LazyStream.t`**\n\n**`type ('token, 'result) monad = ('result * 'token input) option`**\n\n**`type ('token, 'result) parser = 'token input -\u003e ('token, 'result) monad`**\n\nA parser is a function that accepts an input and returns either `None` on\nfailure, or `Some (result, input')`, where `result` is a user-defined value and\n`input'` is the unconsumed input after parsing.\n\n**`val return : 'result -\u003e 'token input -\u003e ('token, 'result) monad`**\n\nAccepts a value and an input, and returns a monad.\n\n**`val ( \u003e\u003e= ) : ('t, 'r) parser -\u003e ('r -\u003e ('t, 'r) monad) -\u003e ('t, 'r) parser`**\n\n`x \u003e\u003e= f` returns a new parser that, if parser `x` succeeds, applies function `f`\nto the result produced by `x` and produces a new monad (a.k.a. `bind`).\n\n**`val ( let* ) : ('t, 'r) parser -\u003e ('r -\u003e ('t, 'r) monad) -\u003e ('t, 'r) parser`**\n\nThis operator is the same as `\u003e\u003e=` but uses the `let` notation.\nIt is useful for avoiding ugly sequences of bindings. For example, `p \u003e\u003e= fun x -\u003e f x` can\nbe rewritten as `let* x = p in f x`. Combined with the `return` function, it lets us define complex parsers:\n\n```ocaml\nlet tuple_parser =\n  let* x = digit in\n  let* _ = exactly ',' in\n  let* y = digit in\n  return (x, y)\n```\n\n**`val ( \u003c|\u003e ) : ('t, 'r) parser -\u003e ('t, 'r) parser -\u003e ('t, 'r) parser`**\n\nChoice combinator. The parser `p \u003c|\u003e q` first applies `p`. If it succeeds, the\nvalue of `p` is returned. 
If `p` fails, parser `q` is tried.\n\n**`val mzero : 'a -\u003e ('t, 'r) monad`**\n\nA parser that always fails.\n\n**`val any : ('t, 'r) parser`**\n\nThe parser succeeds for any token in the input. Consumes a token and returns it.\n\n**`val satisfy : ('t -\u003e bool) -\u003e ('t, 'r) parser`**\n\nThe parser `satisfy test` succeeds for any token for which the supplied function\n`test` returns `true`. Returns the token that is actually parsed.\n\n**`val eof : 'a -\u003e ('t, 'a) parser`**\n\nThe parser `eof x` succeeds if the input is exhausted. Returns value `x`.\n\n### Derived\n\n**`val ( =\u003e ) : ('t, 'r) parser -\u003e ('r -\u003e 'a) -\u003e ('t, 'a) parser`**\n\nMap combinator. `x =\u003e f` parses `x`. If it succeeds, returns the result of\napplying `f` to the value of `x`.\n\n**`val ( \u003e\u003e ) : ('t, 'a) parser -\u003e ('t, 'b) parser -\u003e ('t, 'b) parser`**\n\nIgnore-left combinator. `x \u003e\u003e y` parses `x` and then `y`. Returns the value\nreturned by `y`.\n\n**`val ( \u003c\u003c ) : ('t, 'a) parser -\u003e ('t, 'b) parser -\u003e ('t, 'a) parser`**\n\nIgnore-right combinator. `x \u003c\u003c y` parses `x` and then `y`. Returns the value\nreturned by `x`.\n\n**`val ( \u003c~\u003e ) : ('t, 'r) parser -\u003e ('t, 'r list) parser -\u003e 't input -\u003e ('t, 'r list) monad`**\n\nCons combinator. `x \u003c~\u003e y` parses `x` and then `y`. Returns the value of `x`\nprepended to the value of `y` (a list).\n\n~~~ocaml\nlet ident = letter \u003c~\u003e many alpha_num\n~~~\n\n**`val choice : ('t, 'r) parser list -\u003e ('t, 'r) parser`**\n\n`choice ps` tries to apply the parsers in the list `ps` in order, until one of\nthem succeeds. Returns the value of the succeeding parser.\n\n**`val count : int -\u003e ('t, 'r) parser -\u003e 't input -\u003e ('t, 'r list) monad`**\n\n`count n p` parses `n` occurrences of `p`. If `n` is less than or equal to zero, the\nparser is equivalent to `return []`. 
Returns a list of `n` values returned by `p`.\n\n**`val between : ('t, 'a) parser -\u003e ('t, 'b) parser -\u003e ('t, 'r) parser -\u003e ('t, 'r) parser`**\n\n`between open close p` parses `open`, followed by `p` and `close`. Returns the\nvalue returned by `p`.\n\n~~~ocaml\nlet braces = between (exactly '{') (exactly '}')\n~~~\n\n**`val option : 'r -\u003e ('t, 'r) parser -\u003e ('t, 'r) parser`**\n\n`option default p` tries to apply parser `p`. If `p` fails, it returns the\nvalue `default`, otherwise the value returned by `p`.\n\n~~~ocaml\nlet priority = option 0 (digit =\u003e String.make 1 % int_of_string)\n~~~\n\n**`val optional : ('t, 'r) parser -\u003e ('t, unit) parser`**\n\n`optional p` tries to apply parser `p`. It will parse `p` or nothing. It only\nfails if `p` fails. Discards the result of `p`.\n\n**`val skip_many : ('t, 'r) parser -\u003e ('t, unit) parser`**\n\n`skip_many p` applies `p` *zero or more* times, skipping its result.\n\n~~~ocaml\nlet spaces = skip_many space\n~~~\n\n**`val skip_many1 : ('t, 'r) parser -\u003e ('t, unit) parser`**\n\n`skip_many1 p` applies `p` *one or more* times, skipping its result.\n\n**`val many : ('t, 'r) parser -\u003e 't input -\u003e ('t, 'r list) monad`**\n\n`many p` applies the parser `p` *zero or more* times. Returns a list of returned\nvalues of `p`.\n\n**`val many1 : ('t, 'r) parser -\u003e 't input -\u003e ('t, 'r list) monad`**\n\n`many1 p` applies the parser `p` *one or more* times. 
Returns a list of returned\nvalues of `p`.\n\n**`val sep_by : ('t, 'r) parser -\u003e ('t, 'a) parser -\u003e 't input -\u003e ('t, 'r list) monad`**\n\n`sep_by p sep` parses *zero or more* occurrences of `p`, separated by `sep`.\nReturns a list of values returned by `p`.\n\n~~~ocaml\nlet comma_sep p = sep_by p (token \",\")\n~~~\n\n**`val sep_by1 : ('t, 'r) parser -\u003e ('t, 'a) parser -\u003e 't input -\u003e ('t, 'r list) monad`**\n\n`sep_by1 p sep` parses *one or more* occurrences of `p`, separated by `sep`.\nReturns a list of values returned by `p`.\n\n**`val end_by : ('t, 'r) parser -\u003e ('t, 'a) parser -\u003e ('t, 'r list) parser`**\n\n`end_by p sep` parses *zero or more* occurrences of `p`, separated and ended by\n`sep`. Returns a list of values returned by `p`.\n\n~~~ocaml\nlet statements = end_by statement (token \";\")\n~~~\n\n**`val end_by1 : ('t, 'r) parser -\u003e ('t, 'a) parser -\u003e ('t, 'r list) parser`**\n\n`end_by1 p sep` parses *one or more* occurrences of `p`, separated and ended by\n`sep`. Returns a list of values returned by `p`.\n\n**`val chainl : ('t, 'r) parser -\u003e ('t, 'r -\u003e 'r -\u003e 'r) parser -\u003e 'r -\u003e ('t, 'r) parser`**\n\n`chainl p op default` parses *zero or more* occurrences of `p`, separated by\n`op`. Returns a value obtained by a *left* associative application of all\nfunctions returned by `op` to the values returned by `p`. If there are zero\noccurrences of `p`, the value `default` is returned.\n\n**`val chainl1 : ('t, 'r) parser -\u003e ('t, 'r -\u003e 'r -\u003e 'r) parser -\u003e ('t, 'r) parser`**\n\n`chainl1 p op` parses *one or more* occurrences of `p`, separated by `op`.\nReturns a value obtained by a *left* associative application of all functions\nreturned by `op` to the values returned by `p`. This parser can be used to\neliminate left recursion which typically occurs in expression grammars. 
See\nthe arithmetic calculator example above.\n\n**`val chainr : ('t, 'r) parser -\u003e ('t, 'r -\u003e 'r -\u003e 'r) parser -\u003e 'r -\u003e ('t, 'r) parser`**\n\n`chainr p op default` parses *zero or more* occurrences of `p`, separated by\n`op`. Returns a value obtained by *right* associative application of all\nfunctions returned by `op` to the values returned by `p`. If there are no\noccurrences of `p`, the value `default` is returned.\n\n**`val chainr1 : ('t, 'r) parser -\u003e ('t, 'r -\u003e 'r -\u003e 'r) parser -\u003e ('t, 'r) parser`**\n\n`chainr1 p op` parses *one or more* occurrences of `p`, separated by `op`.\nReturns a value obtained by *right* associative application of all functions\nreturned by `op` to the values returned by `p`.\n\n### Singletons\n\n**`val exactly : 'r -\u003e ('t, 'r) parser`**\n\n`exactly x` parses a single token `x`. Returns the parsed token (i.e. `x`).\n\n~~~ocaml\nlet semi_colon = exactly ';'\n~~~\n\n**`val one_of : 'r list -\u003e ('t, 'r) parser`**\n\n`one_of xs` succeeds if the current token is in the supplied list of tokens\n`xs`. Returns the parsed token.\n\n~~~ocaml\nlet vowel = one_of ['a'; 'e'; 'i'; 'o'; 'u']\n~~~\n\n**`val none_of : 'r list -\u003e ('t, 'r) parser`**\n\nAs the dual of `one_of`, `none_of xs` succeeds if the current token is *not* in\nthe supplied list of tokens `xs`. Returns the parsed token.\n\n~~~ocaml\nlet consonant = none_of ['a'; 'e'; 'i'; 'o'; 'u']\n~~~\n\n**`val range : 'r -\u003e 'r -\u003e ('t, 'r) parser`**\n\n`range low high` succeeds if the current token is in the range between `low`\nand `high` (inclusive). Returns the parsed token.\n\n### Char Parsers\n\n**`val space : (char, char) parser`**\n\nParses a whitespace character (`'\\s\\t\\r\\n'`). Returns the parsed character.\n\n**`val spaces : (char, unit) parser`**\n\nSkips *zero or more* whitespace characters.\n\n**`val newline : (char, char) parser`**\n\nParses a newline character (`'\\n'`). 
Returns a newline character.\n\n**`val tab : (char, char) parser`**\n\nParses a tab character (`'\\t'`). Returns a tab character.\n\n**`val upper : (char, char) parser`**\n\nParses an upper case letter (a character between 'A' and 'Z'). Returns the\nparsed character.\n\n**`val lower : (char, char) parser`**\n\nParses a lower case letter (a character between 'a' and 'z'). Returns the parsed\ncharacter.\n\n**`val digit : (char, char) parser`**\n\nParses a digit. Returns the parsed character.\n\n**`val letter : (char, char) parser`**\n\nParses a letter (an upper case or lower case letter). Returns the parsed\ncharacter.\n\n**`val alpha_num : (char, char) parser`**\n\nParses a letter or digit. Returns the parsed character.\n\n**`val hex_digit : (char, char) parser`**\n\nParses a hexadecimal digit (a digit or a letter between 'a' and 'f' or 'A' and\n'F'). Returns the parsed character.\n\n**`val oct_digit : (char, char) parser`**\n\nParses an octal digit (a character between '0' and '7'). Returns the parsed\ncharacter.\n\n### Lex Helper\n\n**`val lexeme : (char, 'r) parser -\u003e (char, 'r) parser`**\n\n`lexeme p` first applies `skip_many space` and then parser `p`. Returns the\nvalue returned by `p`.\n\n**`val token : string -\u003e char input -\u003e (char, char list) monad`**\n\n`token s` skips leading whitespace and parses the sequence of characters given\nby string `s`. Returns the parsed character sequence as a list.\n\n~~~ocaml\nlet div_or_mod = token \"div\" \u003c|\u003e token \"mod\"\n~~~\n","funding_links":[],"categories":["Compilers and Compiler Tools","OCaml"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpyrocat101%2Fopal","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpyrocat101%2Fopal","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpyrocat101%2Fopal/lists"}