{"id":21823653,"url":"https://github.com/nurdann/arithmeticajs","last_synced_at":"2025-06-15T00:04:26.623Z","repository":{"id":77346505,"uuid":"360348997","full_name":"nurdann/ArithmeticaJS","owner":"nurdann","description":null,"archived":false,"fork":false,"pushed_at":"2021-05-07T06:34:28.000Z","size":66,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-03-21T11:50:33.159Z","etag":null,"topics":["arithmetic-parser"],"latest_commit_sha":null,"homepage":"https://nurdann.github.io/ArithmeticaJS/","language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/nurdann.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-04-22T00:47:04.000Z","updated_at":"2021-05-07T06:34:31.000Z","dependencies_parsed_at":null,"dependency_job_id":"6f54bbd2-b0ea-4689-a9d2-d182a8b3e6ae","html_url":"https://github.com/nurdann/ArithmeticaJS","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/nurdann/ArithmeticaJS","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nurdann%2FArithmeticaJS","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nurdann%2FArithmeticaJS/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nurdann%2FArithmeticaJS/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nurdann%2FArithmeticaJS/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/nurdann","downloa
d_url":"https://codeload.github.com/nurdann/ArithmeticaJS/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nurdann%2FArithmeticaJS/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":259901382,"owners_count":22929224,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["arithmetic-parser"],"created_at":"2024-11-27T17:35:31.703Z","updated_at":"2025-06-15T00:04:26.600Z","avatar_url":"https://github.com/nurdann.png","language":"JavaScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Arithmetic parser\n\n## Grammar\n\nWe can start with the following grammar, using the same ideas as [the implementation in Haskell](https://github.com/nurdann/ArithmeticaHS),\n```\nExpr -\u003e Expr + Term | Expr - Term | Term\nTerm -\u003e Term * Factor | Term / Factor | Factor\nFactor -\u003e ( Expr ) | Number\nNumber -\u003e [0-9]+\n```\nwhere a capitalized word indicates a non-terminal expression and lower-case words indicate terminal symbols.\n\nThen we eliminate left-recursion by introducing additional rules that have an empty production `Epsilon`,\n```\nExpr -\u003e Term Expr'\nExpr' -\u003e + Term Expr' | - Term Expr' | Epsilon\nTerm -\u003e Factor Term'\nTerm' -\u003e * Factor Term' | / Factor Term' | Epsilon\nFactor -\u003e ( Expr ) | Number\nNumber -\u003e [0-9]+\n```\n\nThe above grammar belongs to the class of LL(`k`) grammars, where LL stands for Left-to-right scan, Left-most derivation, and `k` is the number of lookahead characters/terminals needed to determine the next rule 
production, so it is called predictive parsing. In our case `k=1` because one character is enough to determine the next production rule.\n\nFrom the above grammar we can compute the `FIRST(X)` and `FOLLOW(X)` sets: the characters that can appear at the start of a string derived from `X` and immediately after it, respectively.\n\nLet us find the first symbols for `Expr`: its production starts with `Term`, whose production starts with `Factor`. So their sets of first symbols are equal, because we can expand them as `Expr =\u003e Term Expr' =\u003e Factor Term' Expr'`.\n\n```\nFIRST(Expr) = FIRST(Term) = FIRST(Factor) = { (, [0-9] }\nFIRST(Expr') = { +, -, Epsilon }\nFIRST(Term') = { *, /, Epsilon }\n```\n\nTo compute the `FOLLOW(X)` symbols we need to look for production rules where `X` appears on the right-hand side. First, we imagine that the top-most expression is followed by a symbol `$` that indicates the end of the string, i.e. `Expr $`, so `$` is in the set `FOLLOW(Expr)`. Furthermore, we see that `Expr` appears in `Factor -\u003e ( Expr )` so `)` is also in the set `FOLLOW(Expr)`.\n\nFor the set `FOLLOW(Term)`, `Term` can always be expanded as `Factor Term'` so `FOLLOW(Term) = FOLLOW(Term')`. In the rule for `Expr'`, `Term` is followed by `Expr'` so `FIRST(Expr') \\ { Epsilon }` is in the set `FOLLOW(Term)`, i.e. `{ +, - }` is in `FOLLOW(Term)`. In the rule for `Expr`, `Expr'` has an Epsilon-production so it can be expressed as `Expr -\u003e Term Epsilon`, or equivalently `Expr -\u003e Term`. Thus, `FOLLOW(Expr)` is a subset of `FOLLOW(Term)`, so we finally get `FOLLOW(Term) = { +, -, $, ) }`.\n\n```\nFOLLOW(Expr) = FOLLOW(Expr') = { $, ) }\nFOLLOW(Term) = FOLLOW(Term') = { +, -, $, ) }\nFOLLOW(Factor) = (FIRST(Term') U FIRST(Expr')) \\ { Epsilon } U FOLLOW(Expr) = { +, -, *, /, ), $ }\n```\nwhere `$` indicates end of string.\n(Section 4.4.2 from [Aho et al.](https://www.pearson.com/us/higher-education/program/Aho-Compilers/PGM2809377.html))\n\nSince expressions with more than one possible production, i.e. 
`Expr', Term', Factor`, have disjoint sets of first symbols, we can use a lookahead symbol to determine a production (Section 4.4.2 from [Aho et al.](https://www.pearson.com/us/higher-education/program/Aho-Compilers/PGM2809377.html)).\n\nWe can now form the predictive parsing table `M` as follows. For each production rule `X -\u003e Y`:\n1. For each character `y` in `FIRST(Y)`, add `X -\u003e Y` to `M[X, y]`\n2. If `Epsilon` is in `FIRST(Y)`, then for each `x` in `FOLLOW(X)`, add `X -\u003e Y` to `M[X, x]`\n\nFor example, to fill the row for `Expr'` we look at its three possible productions:\n- `FIRST(+ Term Expr') = { + }`, so entry `M[Expr', +] = + Term Expr'`\n- `FIRST(- Term Expr') = { - }`, so entry `M[Expr', -] = - Term Expr'`\n- `FIRST(Epsilon) = { Epsilon }`, so we need `FOLLOW(Expr') = { $, ) }`, which gives entries `M[Expr', $] = M[Expr', )] = Epsilon`\n\n\n| Rule \\ Character | + | - | * | / | [0-9] | ( | ) | $ |\n|---|---|---|---|---|---|---|---|---|\n| `Expr` | | | | | `Term Expr'` | `Term Expr'` | | |\n| `Expr'` | `+ Term Expr'` | `- Term Expr'` | | | | | `Epsilon` | `Epsilon` |\n| `Term` | | | | | `Factor Term'` | `Factor Term'` | | |\n| `Term'` | `Epsilon` | `Epsilon` | `* Factor Term'` | `/ Factor Term'` | | | `Epsilon` | `Epsilon` |\n| `Factor` | | | | | `Number` | `( Expr )` | | |\n\n### Forming Syntax Tree\n\nThe predictive parsing algorithm shown below traverses grammar rules in preorder (Section 4.4.4 from Aho et al.).\n![Predictive parsing 
algorithm](img/predictive-parsing.jpg)\n\nIn order to build the syntax tree, we need to keep an object next to each rule so that when a rule is expanded the object points to its production terms.\n\nSource: https://stackoverflow.com/a/27206881/1374078\n\n## Shunting-Yard algorithm\n\nAnother approach is to convert the infix notation to a postfix notation, e.g. `3+5-7` becomes the postfix `3 5 + 7 -`. So, each operator is always applied to the operands to its left. The algorithm is taken from https://brilliant.org/wiki/shunting-yard-algorithm/ but some modifications need to take place. We can tokenize strings right away as we are parsing. Also, line 5 should be changed from `greater precedence` to `greater or equal precedence` because, when operators have equal precedence, left associativity wins out, e.g. `3*2/5`.","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnurdann%2Farithmeticajs","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fnurdann%2Farithmeticajs","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnurdann%2Farithmeticajs/lists"}