{"id":23977980,"url":"https://github.com/ertgl/xformula","last_synced_at":"2025-09-16T17:05:15.967Z","repository":{"id":105041491,"uuid":"565178874","full_name":"ertgl/xformula","owner":"ertgl","description":"Highly customizable language front-end, aimed to be a base for custom DSL evaluators.","archived":false,"fork":false,"pushed_at":"2024-12-12T05:31:56.000Z","size":374,"stargazers_count":3,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-01-01T12:11:28.367Z","etag":null,"topics":["language-frontends","parser-generator","python"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ertgl.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-11-12T15:31:52.000Z","updated_at":"2024-12-12T05:32:00.000Z","dependencies_parsed_at":null,"dependency_job_id":"a7685b3d-5397-4ed9-8f68-581ef465bbbc","html_url":"https://github.com/ertgl/xformula","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ertgl%2Fxformula","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ertgl%2Fxformula/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ertgl%2Fxformula/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ertgl%2Fxformula/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ertgl","download_url":"https://codeload.github.com/ertgl/xformula/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":232844637,"owners_count":18585234,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["language-frontends","parser-generator","python"],"created_at":"2025-01-07T08:15:54.564Z","updated_at":"2025-09-16T17:05:15.841Z","avatar_url":"https://github.com/ertgl.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# XFormula\n\nA highly customizable language front-end and parser generator. Designed for\nrapid prototyping of\n[domain-specific language](https://en.wikipedia.org/wiki/Domain-specific_language)\nimplementations. *Applicable also for general-purpose languages.*\n\n___\n\n## Table of Contents\n\n- [Overview](#overview)\n- [Usage](#usage)\n  - [Defining Tokens](#defining-tokens)\n  - [Defining AST Nodes](#defining-ast-nodes)\n  - [Transforming Tokens into AST Nodes](#transforming-tokens-into-ast-nodes)\n  - [Setting Up the Feature](#setting-up-the-feature)\n  - [Initializing the Parser](#initializing-the-parser)\n  - [Dynamic Syntax Concept](#dynamic-syntax-concept)\n- [Portability](#portability)\n- [Real-world Example](#real-world-example)\n- [License](#license)\n\n## Overview\n\nXFormula is a language front-end tool that enables developers to define\nlanguage syntax and semantics using the object-oriented paradigm, achieving\nexceptional modularity and flexibility. It offers a set of built-in, commonly\nused features for general-purpose languages that can be omitted or extended as\nneeded for rapid prototyping.\n\nWhile XFormula itself does not provide any compilation or evaluation\ncapabilities, it allows you to define a highly flexible and customizable\n[AST](https://en.wikipedia.org/wiki/Abstract_syntax_tree) structure in a\nmodular way. Additionally, it includes a parser that generates an AST based on\na given input string.\n\nAt its core, XFormula is a parser generator that leverages the powerful\n[Lark Parser Toolkit](https://lark-parser.readthedocs.io/) under the hood. Lark\nsupports [LALR(1)](https://en.wikipedia.org/wiki/LALR_parser), Earley, and CYK\nparsing algorithms, and XFormula's default features are designed to be\ncompatible with the LALR(1) algorithm, which is renowned for its speed and\nefficiency in both time (CPU) and space (memory).\n\n## Usage\n\nIf you are already familiar with the terminology, the best way to understand\nhow to use XFormula is by examining the following modules:\n\n- [xformula.syntax.ast](src/xformula/syntax/ast/nodes/abc)\n- [xformula.syntax.core.features](src/xformula/syntax/core/features)\n- [xformula.syntax.core.operations.default_operator_precedences](src/xformula/syntax/core/operations/default_operator_precedences.py#L16)\n\nThe final [EBNF](https://en.wikipedia.org/wiki/Extended_Backus–Naur_form)\ngrammar, generated automatically by XFormula using the default features, is\noutput to the [out/Grammar.lark](out/Grammar.lark) file.\n\nTo illustrate the development process, let’s define a few simple syntax\nfeatures and their corresponding AST nodes for the `none` and `bool` types.\n\n### Defining Tokens\n\nThe first step is to specify how the tokens should be parsed.\n\n```python\nfrom xformula.runtime.core.context.abc import RuntimeContext\nfrom xformula.syntax.grammar.ebnf import non_terminal\nfrom xformula.syntax.grammar.terminals.abc import Terminal\nfrom xformula.syntax.lexer.tokens.abc import Token\n\n\nclass NONE(\n    # Note: The `Terminal` class is generic and requires the type of the\n    # transformed token as a type argument.\n    Terminal[None],\n):\n    class Meta:\n\n        # The priority of the terminal in the lexer rules.\n        priority = 2000\n\n        tags = {\n            # The priority of this terminal in the `None` non-terminal grammar\n            # rules group.\n            non_terminal(\"None\"): 0,\n        }\n\n    # The grammar definition of the terminal,\n    # that will be used by the lexer.\n    def build_grammar(self) -\u003e str:\n        define = self.ebnf.define\n        regex = self.ebnf.regex\n\n        bound = self.regex.bound\n        word = self.regex.word\n\n        return define(regex(bound(word(\"none\"))))\n\n    # The runtime transformation of the token.\n    def transform_token(\n        self,\n        runtime_context: RuntimeContext,\n        token: Token,\n    ) -\u003e None:\n        # The token is already of type `None`.\n        return None\n```\n\nNext, define the `bool` type:\n\n```python\nfrom xformula.runtime.core.context.abc import RuntimeContext\nfrom xformula.syntax.grammar.ebnf import non_terminal\nfrom xformula.syntax.grammar.terminals.abc import Terminal\nfrom xformula.syntax.lexer.tokens.abc import Token\n\n\nclass BOOL(\n    Terminal[bool],\n):\n    class Meta:\n\n        priority = 2000\n\n        tags = {\n            non_terminal(\"Bool\"): 1000,\n        }\n\n    def build_grammar(self) -\u003e str:\n        define = self.ebnf.define\n        regex = self.ebnf.regex\n\n        any_of = self.regex.any_of\n        bound = self.regex.bound\n        word = self.regex.word\n\n        return define(\n            regex(\n                any_of(\n                    bound(word(\"false\")),\n                    bound(word(\"true\")),\n                ),\n            ),\n        )\n\n    def transform_token(\n        self,\n        runtime_context: RuntimeContext,\n        token: Token,\n    ) -\u003e bool:\n        # Transform the `false` and `true` tokens into False and True,\n        # respectively.\n        return token.value.lower() == \"true\"\n```\n\n### Defining AST Nodes\n\nDefining the AST nodes for the `none` and `bool` types is straightforward,\nthanks to the comprehensive `xformula.syntax.ast` module.\n\nFor the `none` type:\n\n```python\nimport dataclasses\n\nfrom xformula.syntax.ast.nodes import Literal\n\n\n@dataclasses.dataclass()\nclass None_(\n    # Note: The `Literal` class is generic and requires the type of the\n    # transformed token as a type argument.\n    Literal[None],\n):\n\n    # The `value` field is implemented with the `None` type.\n    value: None = dataclasses.field(\n        kw_only=True,\n        init=False,\n        default=None,\n    )\n```\n\nAnd for the `bool` type:\n\n```python\nimport dataclasses\n\nfrom xformula.syntax.ast.nodes import Literal\n\n\n@dataclasses.dataclass()\nclass Bool(\n    Literal[bool],\n):\n\n    # The `value` field is implemented with the `bool` type.\n    value: bool = dataclasses.field(\n        kw_only=True,\n        default=bool(),\n    )\n```\n\n### Transforming Tokens into AST Nodes\n\nOnce the syntax features and AST nodes are defined, the next step is to specify\nhow tokens should be transformed into AST nodes.\n\nDefine the `None` non-terminal as follows (note that `None` is reserved in\nPython, so we use `None_` instead):\n\n```python\nfrom xformula.runtime.core.context.abc import RuntimeContext\nfrom xformula.syntax.core.features.literals.ast.nodes import None_ as NoneNode\nfrom xformula.syntax.grammar.ebnf import non_terminal\nfrom xformula.syntax.grammar.non_terminals.abc import NonTerminal\nfrom xformula.syntax.parser.trees.abc import ParseTree\n\n\n# `None` is reserved in Python, so we use `None_` instead.\nclass None_(\n    NonTerminal[NoneNode],\n):\n    class Meta:\n\n        # This will replace the default definition name `None_`.\n        definition_name = \"None\"\n\n        # Mark this non-terminal as atomic.\n        # This means that we want to use our custom transformation logic\n        # for the parse tree of this non-terminal.\n        # See the `transform_parse_tree` method below.\n        atomic = True\n\n        tags = {\n            non_terminal(\"Literal\"): -1000,\n        }\n\n    # The grammar definition of the non-terminal.\n    def build_grammar(self) -\u003e str:\n        # Since we tagged the `NONE` terminal with the `None` non-terminal,\n        # we can automatically reference it here.\n        # And, if another feature is added that tags the `None` non-terminal,\n        # the related terminals/non-terminals will be included here as well.\n        # Respecting the priority levels of the tags.\n        return self.ebnf.define_tagged_alternation()\n\n    # The runtime transformation of the parse tree.\n    # This is where we transform the parse tree into an AST node.\n    def transform_parse_tree(\n        self,\n        runtime_context: RuntimeContext,\n        tree: ParseTree,\n    ) -\u003e NoneNode:\n        return NoneNode()\n```\n\nSimilarly, define the `Bool` non-terminal:\n\n```python\nfrom typing import cast\n\nfrom xformula.runtime.core.context.abc import RuntimeContext\nfrom xformula.syntax.core.features.literals.ast.nodes import Bool as BoolNode\nfrom xformula.syntax.grammar.ebnf import non_terminal\nfrom xformula.syntax.grammar.non_terminals.abc import NonTerminal\nfrom xformula.syntax.parser.trees.abc import ParseTree\n\n\nclass Bool(\n    NonTerminal[BoolNode],\n):\n    class Meta:\n\n        atomic = True\n\n        tags = {\n            non_terminal(\"Literal\"): -2000,\n        }\n\n    def build_grammar(self) -\u003e str:\n        return self.ebnf.define_tagged_alternation()\n\n    def transform_parse_tree(\n        self,\n        runtime_context: RuntimeContext,\n        tree: ParseTree[bool],\n    ) -\u003e BoolNode:\n        # The `Bool` non-terminal has only one child, the transformed value\n        # from the `BOOL` terminal.\n        value = cast(bool, tree.children[0])\n        # Return the `bool` node with that transformed value.\n        return BoolNode(\n            value=value,\n        )\n```\n\nLastly, define the non-terminal for `Literal`:\n\n```python\nfrom typing import TypeVar, cast\n\nfrom xformula.runtime.core.context.abc import RuntimeContext\nfrom xformula.syntax.ast.nodes.abc import Literal as LiteralNode\nfrom xformula.syntax.grammar.ebnf import non_terminal\nfrom xformula.syntax.grammar.non_terminals.abc import NonTerminal\nfrom xformula.syntax.parser.trees.abc import ParseTree\n\n\nT = TypeVar(\"T\")\n\n\nclass Literal(\n    NonTerminal[\n        LiteralNode[T],\n    ],\n):\n    class Meta:\n\n        tags = {\n            # Mark this non-terminal as the start rule of the grammar.\n            non_terminal(\"Start\"): -1,\n        }\n\n    def build_grammar(self) -\u003e str:\n        return self.ebnf.define_tagged_alternation()\n\n    # Default transformation for non-atomic non-terminals.\n    def transform_parse_tree(\n        self,\n        runtime_context: RuntimeContext,\n        tree: ParseTree[T],\n    ) -\u003e LiteralNode[T]:\n        return cast(LiteralNode, tree.children[0])\n```\n\n### Setting Up the Feature\n\nTo use these features, define a feature class that modifies the syntax context\nduring setup:\n\n```python\nfrom xformula.syntax.core.features.abc import Feature\n\n\nclass LiteralFeature(Feature):\n\n    def setup(self) -\u003e None:\n        self.non_terminal_types.extend(\n            [\n                None_,\n                Bool,\n                Literal,\n            ],\n        )\n\n        self.terminal_types.extend(\n            [\n                NONE,\n                BOOL,\n            ],\n        )\n```\n\n### Initializing the Parser\n\nFinally, initialize the parser:\n\n```python\nfrom xformula.syntax.core.context import SyntaxContext\n# Note: We tagged our `Literal` non-terminal with the `Start` non-terminal.\n# If the `Start` non-terminal is not defined, you can either define it or use\n# the `PolyfillFeature` to handle it automatically.\nfrom xformula.syntax.core.features.polyfill import PolyfillFeature\nfrom xformula.syntax.parser import Parser\n\n\nsyntax_context = SyntaxContext(\n    feature_types=[\n        LiteralFeature,\n        # `PolyfillFeature` automatically defines any missing tags as\n        # non-atomic non-terminals during setup.\n        PolyfillFeature,\n    ],\n)\n\nparser = Parser(\n    syntax_context=syntax_context,\n    # Optionally, you can also pass a runtime context here to customize or\n    # override the default.\n)\n\n# Parse an input string.\nast = parser.parse(\"true\")\n\n# Outputs: Bool(value=True)\nprint(ast)\n\n# Outputs: True\nprint(ast.value)\n\n# Outputs: None_(value=None)\nprint(parser.parse(\"none\"))\n```\n\nThe generated grammar is accessible via the `ebnf_document` attribute of the\nparser. In this example, it might look like:\n\n```ebnf\n?start : literal\n\n?literal : bool\n         | none\n\nbool : BOOL\n\nnone : NONE\n\nBOOL.2000 : /\\bfalse\\b|\\btrue\\b/\n\nNONE.2000 : /\\bnone\\b/\n```\n\nAs you can see, the `start` rule is prefixed with a `?` character. This is\nbecause the `PolyfillFeature` automatically defines the `Start` non-terminal as\nnon-atomic. Since it does not have specific transformation logic and is used\nonly for tagging purposes, the parser automatically replaces the `Start`\nnon-terminal's parse tree with that of the `Literal` non-terminal. Likewise,\nbecause `Literal` is also non-atomic, its parse tree is further replaced by\nthat of the `Bool` or `None` non-terminal based on the input. This approach\nhelps avoid deep nesting of AST nodes while leveraging the inheritance and\npolymorphism features of OOP.\n\nTo observe the default polymorphism, you can inspect the\n[MRO](https://docs.python.org/3/howto/mro.html) of an AST node class, similar\nto the following:\n\n```python\n\u003e\u003e\u003e parser.parse(\"none\").__class__.__mro__\n(\n  \u003cclass 'xformula.syntax.core.features.literals.ast.nodes.none_.None_'\u003e,\n  \u003cclass 'xformula.syntax.ast.nodes.abc.literal.Literal'\u003e,\n  \u003cclass 'xformula.syntax.ast.nodes.abc.term.Term'\u003e,\n  \u003cclass 'xformula.syntax.ast.nodes.abc.primary.Primary'\u003e,\n  \u003cclass 'xformula.syntax.ast.nodes.abc.operand.Operand'\u003e,\n  \u003cclass 'xformula.syntax.ast.nodes.abc.simple_expression.SimpleExpression'\u003e,\n  \u003cclass 'xformula.syntax.ast.nodes.abc.expression.Expression'\u003e,\n  \u003cclass 'xformula.syntax.ast.nodes.abc.has_value.HasValue'\u003e,\n  \u003cclass 'typing.Generic'\u003e,\n  \u003cclass 'xformula.syntax.ast.nodes.abc.node.Node'\u003e,\n  \u003cclass 'xformula.arch.meta.configurable.Configurable'\u003e,\n  \u003cclass 'abc.ABC'\u003e,\n  \u003cclass 'object'\u003e\n)\n```\n\n### Dynamic Syntax Concept\n\nAs demonstrated above, syntax features are defined in a modular way. The\ndynamic syntax concept further enhances this modularity by allowing you to plug\nin or remove features without modifying the core syntax definition.\n\nFor more low-level details, refer to the following classes:\n\n- [xformula.syntax.EBNFExpressionBuilderProtocol](src/xformula/syntax/grammar/definitions/abc/ebnf_expression_builder_protocol.py#L164)\n- [xformula.syntax.SyntaxContext](src/xformula/syntax/core/context/abc/syntax_context.py#L104)\n- [xformula.syntax.TaggedDefinitionIterator](src/xformula/syntax/core/customization/tagging/tagged_definition_iterator.py#L13)\n\n## Portability\n\nSince Lark is available for various programming languages, the grammars\ngenerated by XFormula can be used in those languages out of the box. To achieve\nthe same dynamic transformation capabilities as XFormula's generated parser, it\nis necessary to align with the\n[NonTerminalOperationClassBuilder.transform_parse_tree](src/xformula/syntax/core/features/operations/runtime/reflection/non_terminal_operation_class_builder.py#L236)\nfunction, which automatically resolves operator associativity and precedence.\n\nFor a list of available Lark implementations, see the\n[extra features](https://lark-parser.readthedocs.io/en/stable/features.html#extra-features)\nsection in the Lark documentation.\n\n## Real-world Example\n\nFor an overview of the default features and an example runtime implementation\nfor the language, check out the\n[django-xformula](https://github.com/ertgl/django-xformula) project. This\n[Django](https://www.djangoproject.com/) application transforms formulas into\nSQL queries using Django's powerful\n[ORM](https://en.wikipedia.org/wiki/Object–relational_mapping) capabilities.\n\n## License\n\nThis project is licensed under the\n[MIT License](https://opensource.org/license/mit).\n\nSee the [LICENSE](LICENSE) file for more information.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fertgl%2Fxformula","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fertgl%2Fxformula","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fertgl%2Fxformula/lists"}