{"id":27057829,"url":"https://github.com/mrange/fparser","last_synced_at":"2025-04-05T11:33:33.392Z","repository":{"id":45093163,"uuid":"445592895","full_name":"mrange/FParser","owner":"mrange","description":"Experimenting creating Parser combinator library around InlineIfLambda","archived":false,"fork":false,"pushed_at":"2022-01-09T16:20:17.000Z","size":32,"stargazers_count":5,"open_issues_count":1,"forks_count":2,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-04-02T02:09:04.976Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"F#","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mrange.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2022-01-07T17:00:11.000Z","updated_at":"2023-05-26T09:24:04.000Z","dependencies_parsed_at":"2022-09-22T17:03:05.301Z","dependency_job_id":null,"html_url":"https://github.com/mrange/FParser","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mrange%2FFParser","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mrange%2FFParser/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mrange%2FFParser/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mrange%2FFParser/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mrange","download_url":"https://codeload.github.com/mrange/FParser/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247332057,"owners_count":2092
1849,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-04-05T11:33:28.807Z","updated_at":"2025-04-05T11:33:33.376Z","avatar_url":"https://github.com/mrange.png","language":"F#","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Parser combinators with F#6 `[\u003cInlineIfLambda\u003e]`\n\n_Thanks to [manofstick](https://gist.github.com/manofstick) for peer reviewing the blog post._\n\n\nFor [F# Advent 2021](https://sergeytihon.com/2021/10/18/f-advent-calendar-2021/) I wrote a [blog post](https://gist.github.com/mrange/fbefd946dba6725a0b727b7d3fd81d6f) exploring how F#6 `[\u003cInlineIfLambda\u003e]` can improve data pipeline performance.\n\nI was thinking of other places where `[\u003cInlineIfLambda\u003e]` can help and decided to try to build a parser combinator library with `[\u003cInlineIfLambda\u003e]`.\n\n## Parser combinators\n\n[FParsec](http://www.quanttec.com/fparsec/) is a parser combinator library that allows you to create complex parsers by combining simple parsers. FParsec is a great library and I was amazed by it the first time I tried it.\n\nGraham Hutton and Erik Meijer have written an excellent [introduction](https://www.cs.nott.ac.uk/~pszgmh/monparsing.pdf) to parser combinators. 
When I read this article many years ago it was the first time that I felt that I understood what Functional Programming is all about.\n\nWe create a parser combinator library by defining what a parser is and then defining various functions that combine parsers.\n\nA Parser could be this:\n\n```fsharp\ntype 'T Parser = string -\u003e ('T*string) option\n```\n\nThis function takes a string and produces a value together with the unconsumed string. If the parser fails it returns `None`.\n\nIt's possible to build a parser combinator library around this definition but it won't be efficient as we will constantly create substrings.\n\n## FParser\n\nMy first attempt was this:\n\n```fsharp\ntype 'T Parser = string -\u003e int -\u003e ('T*int) option\n```\n\nThe parser takes a string and the current position and returns a value and the position of the first unconsumed char if successful.\n\nWhile experimenting with performance I found that the overhead from creating tuples and options was significant. Switching to value tuples and options reduced performance but avoided GC pressure.\n\nIn the end I settled on this:\n\n```fsharp\ntype FParserContext(input : string) =\n  class\n    [\u003cDefaultValue\u003e] val mutable Pos : int\n\n    member x.Input = input\n  end\n\ntype 'T FParser = FParserContext -\u003e 'T\n```\n\nThe `FParserContext` stores the string to be parsed and the current position. A parser takes an `FParserContext` and, if successful, returns a value and updates the position in the context. Upon failure, the parser returns the default value of `'T` and sets the position to a negative value.\n\nNot elegant but it was the most performant option I found. 
There might be better ones.\n\nThere are a number of parser combinators defined. I won't go through all of them, but I will show how the choice combinator `\u003c|\u003e` works.\n\n```fsharp\n  // \u003c|\u003e runs the first parser and if successful returns its value\n  //  , otherwise runs the second\n  // Tag as inline + InlineIfLambda to inline both the function and the nested parsers\n  let inline (\u003c|\u003e) ([\u003cInlineIfLambda\u003e] pf : _ FParser) ([\u003cInlineIfLambda\u003e] ps : _ FParser) : _ FParser =\n    fun c -\u003e\n      // Save the current pos\n      let spos = c.Pos\n      // Run the first parser\n      let fv = pf c\n      // Did we succeed?\n      if c.IsGood () then\n        // Yes\n        success fv\n      else\n        // Otherwise restore the position and run the second parser\n        c.Pos \u003c- spos\n        ps c\n```\n\nThe other combinators are implemented in a similar style.\n\nWith the parser combinator library one can implement a simple math expression parser:\n\n```fsharp\n// Expression tree for math expressions like: x*(2+123*y)\ntype Expr =\n  | Value     of int\n  | Variable  of string\n  | Binary    of char*(int-\u003eint-\u003eint)*Expr*Expr\n\nmodule FParserCalculator =\n  open FParser.Core\n  open FParser\n\n  let inline ptoken label v =\n    pskipChar label v\n    \u003e\u003e. 
pskipWhitespace ()\n\n  let inline pop exp ([\u003cInlineIfLambda\u003e] t) ([\u003cInlineIfLambda\u003e] m) =\n    pcharSat exp t |\u003e\u003e m .\u003e\u003e pskipWhitespace ()\n\n  // Normally this would have been a module but that interacts poorly with\n  //  BenchmarkDotNet for some reason\n  type ParserConfig () =\n    class\n      // Since the grammar is recursive use pfwd to do a forward declaration of\n      //  pexpr\n      let struct (pexpr, sexpr) = pfwd\u003cExpr\u003e ()\n\n      // A term is either\n      let pterm : Expr FParser =\n            // An integer\n            (pint () |\u003e\u003e Value .\u003e\u003e pskipWhitespace ())\n            // A variable\n        \u003c|\u003e (pstringSat1 \"variable\" (fun ch p -\u003e (ch \u003e= 'A' \u0026\u0026 ch \u003c= 'Z') || (ch \u003e= 'a' \u0026\u0026 ch \u003c= 'z')) |\u003e\u003e Variable .\u003e\u003e pskipWhitespace ())\n            // Or a sub expression wrapped in parentheses\n        \u003c|\u003e (ptoken \"(\" '(' \u003e\u003e. 
pexpr .\u003e\u003e ptoken \")\" ')')\n\n      let op0 ch =\n        match ch with\n        | '*' -\u003e fun l r -\u003e Binary ('*', ( * ), l, r)\n        | '/' -\u003e fun l r -\u003e Binary ('/', ( / ), l, r)\n        | _   -\u003e failwith \"Expected * or /\"\n      let op1 ch =\n        match ch with\n        | '+' -\u003e fun l r -\u003e Binary ('+', ( + ), l, r)\n        | '-' -\u003e fun l r -\u003e Binary ('-', ( - ), l, r)\n        | _   -\u003e failwith \"Expected + or -\"\n      // pop0 parses expressions separated by */\n      let pop0 : Expr FParser =\n        pterm \u003e+\u003e pchainLeft1 (pop \"*/\" (fun ch -\u003e ch = '*' || ch ='/') (fun ch -\u003e op0 ch))\n      // pop1 parses expressions separated by +-\n      //  by splitting it in this way we get different precedence for */ and +-\n      let pop1 : Expr FParser =\n        pop0 \u003e+\u003e pchainLeft1 (pop \"+-\" (fun ch -\u003e ch = '+' || ch ='-') (fun ch -\u003e op1 ch))\n      do\n        // pop1 completes the forward declared pexpr\n        //  set pexpr to pop1\n        sexpr pop1\n      // The full grammar\n      let pfull = pskipWhitespace () \u003e\u003e. pop1 .\u003e\u003e peof ()\n\n      member x.Parse s = prun pfull s\n    end\n```\n\nI also implemented a math expression parser using `FParsec` and painstakingly implemented a `Baseline` math expression parser in imperative style.\n\n## Benchmark results\n\nUsing `BenchmarkDotNet` I compared the performance of `FParser`, `FParsec` and `Baseline`. 
As a late bonus I also implemented a calculator expression parser in `C#` using a delegate-based parser combinator library.\n\n```\nBenchmarkDotNet=v0.13.1, OS=Windows 10.0.19044.1415 (21H2)\nIntel Core i5-3570K CPU 3.40GHz (Ivy Bridge), 1 CPU, 4 logical and 4 physical cores\n.NET SDK=6.0.101\n  [Host] : .NET 6.0.1 (6.0.121.56705), X64 RyuJIT DEBUG\n  PGO    : .NET 6.0.1 (6.0.121.56705), X64 RyuJIT\n  STD    : .NET 6.0.1 (6.0.121.56705), X64 RyuJIT\n\n\n|                     Method | Job |       Mean |    Error |   StdDev |  Gen 0 | Allocated |\n|--------------------------- |---- |-----------:|---------:|---------:|-------:|----------:|\n|   Baseline_BasicExpression | STD |   197.9 ns |  1.63 ns |  1.44 ns | 0.0610 |     192 B |\n|    FParser_BasicExpression | STD |   245.2 ns |  1.40 ns |  1.31 ns | 0.0710 |     224 B |\n|    FParsec_BasicExpression | STD |   734.3 ns |  4.70 ns |  4.17 ns | 0.1631 |     512 B |\n|   CsParser_BasicExpression | STD |   514.5 ns |  3.54 ns |  3.31 ns | 0.0610 |     192 B |\n| Baseline_ComplexExpression | STD |   579.4 ns |  2.69 ns |  2.38 ns | 0.2699 |     848 B |\n|  FParser_ComplexExpression | STD |   721.9 ns | 10.04 ns |  9.39 ns | 0.2804 |     880 B |\n|  FParsec_ComplexExpression | STD | 1,731.1 ns |  7.95 ns |  6.64 ns | 0.3986 |   1,256 B |\n| CsParser_ComplexExpression | STD | 1,375.4 ns |  6.31 ns |  5.91 ns | 0.2689 |     848 B |\n|   Baseline_BasicExpression | PGO |   171.5 ns |  1.68 ns |  1.57 ns | 0.0610 |     192 B |\n|    FParser_BasicExpression | PGO |   222.5 ns |  2.39 ns |  2.23 ns | 0.0713 |     224 B |\n|    FParsec_BasicExpression | PGO |   617.4 ns |  5.82 ns |  5.45 ns | 0.1631 |     512 B |\n|   CsParser_BasicExpression | PGO |   425.6 ns |  6.12 ns |  5.72 ns | 0.0610 |     192 B |\n| Baseline_ComplexExpression | PGO |   449.1 ns |  2.06 ns |  1.93 ns | 0.2699 |     848 B |\n|  FParser_ComplexExpression | PGO |   615.8 ns |  8.99 ns |  8.41 ns | 0.2804 |     880 B |\n|  FParsec_ComplexExpression | PGO | 
1,564.1 ns | 14.88 ns | 13.92 ns | 0.3986 |   1,256 B |\n| CsParser_ComplexExpression | PGO | 1,228.4 ns |  6.07 ns |  5.68 ns | 0.2689 |     848 B |\n```\n\nThis does look very promising. As expected the `Baseline` parser does the best but `FParser` is not far behind, at roughly 25-40% slower. `FParsec` has very respectable performance but `FParser` runs roughly 2.4 to 3 times faster in these benchmarks. Note that `FParsec` produces great error messages, which `FParser` currently does not, but there are ways to support error messages without too much overhead.\n\nAnother nice aspect of `FParser` is that it adds less memory overhead than `FParsec`. The extra 32 B over `Baseline` comes from creating the `FParserContext`.\n\n`CsParser` falls between `FParser` and `FParsec`, which is pretty good considering it's delegate-based. `CsParser` also doesn't collect error information, so the comparison with `FParsec` is a bit unfair here as well.\n\n`PGO` uses a new feature in the runtime called [Dynamic Profile-Guided Optimization](https://gist.github.com/EgorBo/dc181796683da3d905a5295bfd3dd95b) which allows the jitter to improve code over time. It does seem to benefit all parser variants compared to the standard job.\n\n## Conclusion\n\nThere is much more to investigate and improve but once again `[\u003cInlineIfLambda\u003e]` seems like a powerful tool that allows us to write functional style programs with close to imperative performance.\n\nRegards,\n\nMårten\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmrange%2Ffparser","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmrange%2Ffparser","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmrange%2Ffparser/lists"}