{"id":19699236,"url":"https://github.com/benjamin-hodgson/sawmill","last_synced_at":"2025-04-04T20:10:30.391Z","repository":{"id":39613494,"uuid":"105478438","full_name":"benjamin-hodgson/Sawmill","owner":"benjamin-hodgson","description":"Simple tools for working with immutable trees","archived":false,"fork":false,"pushed_at":"2025-03-18T00:06:58.000Z","size":3138,"stargazers_count":58,"open_issues_count":2,"forks_count":0,"subscribers_count":7,"default_branch":"main","last_synced_at":"2025-03-28T19:07:52.327Z","etag":null,"topics":["ast","compiler","csharp","dotnet","dotnet-core","tree"],"latest_commit_sha":null,"homepage":"https://www.benjamin.pizza/Sawmill/","language":"C#","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/benjamin-hodgson.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2017-10-01T22:14:37.000Z","updated_at":"2025-03-18T00:04:24.000Z","dependencies_parsed_at":"2023-11-16T20:03:59.844Z","dependency_job_id":"e0532ea0-3a65-4b7f-aa6e-95699c562d08","html_url":"https://github.com/benjamin-hodgson/Sawmill","commit_stats":{"total_commits":236,"total_committers":2,"mean_commits":118.0,"dds":"0.18220338983050843","last_synced_commit":"298664b978545d47f5a1e3beb6ad36b971927ded"},"previous_names":[],"tags_count":22,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/benjamin-hodgson%2FSawmill","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/benjamin-hodgson%2FSawmill/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/
GitHub/repositories/benjamin-hodgson%2FSawmill/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/benjamin-hodgson%2FSawmill/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/benjamin-hodgson","download_url":"https://codeload.github.com/benjamin-hodgson/Sawmill/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247242678,"owners_count":20907134,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ast","compiler","csharp","dotnet","dotnet-core","tree"],"created_at":"2024-11-11T20:01:54.551Z","updated_at":"2025-04-04T20:10:30.372Z","avatar_url":"https://github.com/benjamin-hodgson.png","language":"C#","funding_links":[],"categories":[],"sub_categories":[],"readme":"Sawmill\n=======\nSimple tools for working with immutable trees, based on [_Uniform Boilerplate and List Processing_](http://ndmitchell.com/downloads/paper-uniform_boilerplate_and_list_processing-30_sep_2007.pdf) and developed at Stack Overflow.\n\nInstalling\n----------\n\nSawmill is [available on Nuget](https://www.nuget.org/packages/Sawmill/). API docs are hosted [on my website](https://www.benjamin.pizza/Sawmill).\n\nTutorial\n--------\n\nSawmill contains functions which make it easy to work with immutable tree-shaped data such as abstract syntax trees. 
It factors out the boilerplate associated with recursively traversing a tree, allowing you to write queries and transformations which get straight to the point.\n\nSawmill is designed to be extremely simple and lightweight (it's built as a set of extension methods for a single simple interface); it works well with modern C# features like lambdas and pattern matching (the days of the clunky old visitor pattern are over!); and it doesn't get in the way when you need to go it alone and write traversals without Sawmill's help.\n\nI've written a step-by-step tutorial on the library's core idea on [my blog](https://www.benjamin.pizza/posts/2017-11-13-recursion-without-recursion.html).\n\n### Getting started\n\nFor example, suppose you're working with a simple language of arithmetic expressions featuring literal numbers, variables, addition, and unary negation. Each syntactic construct corresponds to a subclass of an `Expr` base type, so an expression like `(2 + x) + (-4)` would be represented as `new Add(new Add(new Lit(2), new Var(\"x\")), new Neg(new Lit(4)))`.\n\nFor your tree type to work with Sawmill, it must implement [the `IRewritable\u003cT\u003e` interface](https://github.com/benjamin-hodgson/Sawmill/blob/master/Sawmill/IRewritable.cs). An object is rewritable if it knows how to access its collection of immediate children; accordingly, `IRewritable\u003cT\u003e` contains `GetChildren` and `SetChildren` methods. Implementations of `IRewritable` should ensure that the rewritable type conforms to the following two-point specification:\n\n  * You get out what you put in - `x.SetChildren(children).GetChildren() == children`\n  * Setting twice is the same as setting once - `x.SetChildren(children1).SetChildren(children2) == x.SetChildren(children2)`\n\n[See below](#implementing-irewritablet) for a full example of how to implement the `Expr` type outlined above. 
You can also use the supplied `AutoRewriter` or `RewriterBuilder` classes to assist in implementing `IRewritable`.\n\n### Querying a tree\n\nHere's a function to extract a list of the variables mentioned in a given `Expr` (for example, a compiler writer might want to find the variables captured by a lambda expression):\n\n```csharp\nIEnumerable\u003cstring\u003e GetVariables(Expr expr)\n{\n    switch (expr)\n    {\n        case Lit l:\n            return Enumerable.Empty\u003cstring\u003e();\n        case Var v:\n            return new[] { v.Name };\n        case Add a:\n            return GetVariables(a.Left).Concat(GetVariables(a.Right));\n        case Neg n:\n            return GetVariables(n.Operand);\n    }\n    throw new ArgumentOutOfRangeException(nameof(expr));\n}\n```\n\nThe only _interesting_ line of code here is the `Var` case. The rest is just boilerplate to recursively call `GetVariables` on the nodes' children and combine the results. With Sawmill, `GetVariables` is one line of simple, direct code:\n\n```csharp\nIEnumerable\u003cstring\u003e GetVariables(Expr expr)\n    =\u003e expr.SelfAndDescendants().OfType\u003cVar\u003e().Select(v =\u003e v.Name);\n```\n\n`SelfAndDescendants` returns an enumerable containing the current node and all of the nodes in the rest of the tree. The example above uses the standard `OfType` and `Select` LINQ methods to find the names of all the variables mentioned in `expr`. It also has two cousins, `DescendantsAndSelf` and `SelfAndDescendantsBreadthFirst`, which differ in the order in which they yield nodes.\n\nBy the way, you can totally \"go it alone\" and write complex or performance-critical traversals without Sawmill's help. You can just use explicit recursion, as in the first example. Sawmill is intended to be _useful_, not _opinionated_.\n\n### Transforming a tree\n\nSawmill makes tree transformations easier too. 
Here's an example of a simple optimisation pass, which removes double-negatives:\n\n```csharp\nExpr RemoveDoubleNegation(Expr expr)\n{\n    switch (expr)\n    {\n        case Neg n1 when n1.Operand is Neg n2:\n            return RemoveDoubleNegation(n2.Operand);\n        case Neg n:\n            return new Neg(RemoveDoubleNegation(n.Operand));\n        case Lit l:\n            return l;\n        case Var v:\n            return v;\n        case Add a:\n            return new Add(RemoveDoubleNegation(a.Left), RemoveDoubleNegation(a.Right));\n    }\n    throw new ArgumentOutOfRangeException(nameof(expr));\n}\n```\n\nOnce again, Sawmill tackles the boilerplate - recursively taking each node apart and putting it back together - so you can focus on the important part of your operation.\n\n```csharp\nExpr RemoveDoubleNegation(Expr expr)\n    =\u003e expr.Rewrite(node =\u003e\n        node is Neg n1 \u0026\u0026 n1.Operand is Neg n2\n            ? n2.Operand\n            : node\n    );\n```\n\n`Rewrite` takes a transformation function and rebuilds a tree by applying the function to every node in the tree. For example, given a representation of the expression `(2 + x) + (-4)` and a transformer function, `expr.Rewrite(transformer)` is equivalent to:\n\n```csharp\ntransformer(new Add(\n    transformer(new Add(\n        transformer(new Lit(2)),\n        transformer(new Var(\"x\"))\n    )),\n    transformer(new Neg(\n        transformer(new Lit(4))\n    ))\n))\n```\n\nSo the transformation function gets applied to every node in the tree exactly once. 
`Rewrite` is a _mapping_ operation, like LINQ's `Select`.\n\nSawmill takes care to avoid rebuilding parts of the tree which the transformation function leaves unchanged, so `Rewrite` will typically be more efficient than a naïve handwritten implementation.\n\nSawmill also contains tools for some more niche operations:\n\n### Putting an expression into a normal form\n\nNormalising an expression typically involves repeatedly applying a set of rewrite rules until they can't be applied any more.\n\nFor example, to put an arithmetic expression into [_negation normal form_](https://en.wikipedia.org/wiki/Negation_normal_form), so that all of the minus signs appear only next to variables or literal numbers, you distribute `-` over `+` (so `-(3+2)` becomes `(-3)+(-2)`). Since performing this distribution might produce more places where the result of an addition is negated (consider `-((1+2)+3) -\u003e (-(1+2))+(-3)`), you need to do so repeatedly until you can't do it any more.\n\n`RewriteIter` packages up this pattern. It applies a transformation function to every node in the tree from bottom to top, repeating this until the function is a no-op for each node in the tree. (In other words, `x.RewriteIter(f).DescendantsAndSelf().All(n =\u003e f(n) == n) == true`.)\n\n```csharp\nExpr ToNegationNormalForm(Expr expr)\n    =\u003e expr.RewriteIter(node =\u003e \n        node is Neg n \u0026\u0026 n.Operand is Add a\n            ? new Add(new Neg(a.Left), new Neg(a.Right))\n            : node\n    );\n```\n\nIt'd be pretty tedious to write this operation by hand as a recursive function!\n\n### Reducing a tree to a value\n\nLINQ has the `Aggregate` method, which passes an accumulator value along an enumerable, using an aggregation function to combine elements. But while an element of an enumerable has only one predecessor, a node of a tree may have many children. 
So Sawmill's `Fold` method passes multiple accumulator values up a tree, using an aggregation function to flatten them into a single value.\n\nHere's an example of compiling our expression tree into code for a hypothetical stack machine.\n\n```csharp\nstring Compile(Expr expr)\n    =\u003e expr.Fold\u003cExpr, string\u003e(\n        (n, children) =\u003e n switch\n        {\n            Lit l =\u003e \"PUSH \" + l.Value + \";\",\n            Var v =\u003e \"LOAD \" + v.Name + \";\",\n            Add a =\u003e children[0] + children[1] + \"ADD;\",\n            Neg _ =\u003e children[0] + \"NEGATE;\",\n            _ =\u003e throw new ArgumentOutOfRangeException(nameof(n))\n        }\n    );\n```\n\n### Replacing individual nodes in a tree\n\nThere are several \"`InContext`\" extension methods:\n\n  * `ChildrenInContext`\n  * `SelfAndDescendantsInContext`\n  * `DescendantsAndSelfInContext`\n  * `SelfAndDescendantsInContextBreadthFirst`\n\nThese all have a return type of `IEnumerable\u003c(T item, Func\u003cT, T\u003e replace)\u003e`: a list of tuples containing a node and a function to build a new tree with a different node in its place. This might be useful in mutation testing, where you want to see all the ways you can change a `Lit` node in a tree. You can think of the function as representing the node's _context_ in the tree; calling the function with a new node \"plugs the hole\" that was created by removing the node from the tree.\n\n### Inspecting and replacing a node and its neighbours\n\nThe `Cursor()` method generalises the `InContext` methods by returning a `Cursor\u003cT\u003e` - a mutable builder object representing a _focus_ on a particular node in a tree. You can efficiently replace the currently focused node by setting the cursor's `Focus` property. 
The `Up`, `Down`, `Left` and `Right` methods allow you to efficiently move the cursor's focus to the current node's parent, the current node's first child, and the current node's previous and next siblings, respectively. This is useful if you need to make a complex sequence of edits to a particular area in a tree, such as if a user is editing part of a text file. Moving the cursor all the way back to the `Top` rebuilds the whole tree with the new nodes in place of the old ones.\n\n### Implementing `IRewritable\u003cT\u003e`\n\nTo use Sawmill with your own expression types, you implement the `IRewritable` interface. You explain how to read and write the nodes' immediate children, and Sawmill does the boring work of recursively traversing the children's children.\n\n```csharp\nabstract class Expr : IRewritable\u003cExpr\u003e\n{\n    public abstract int CountChildren();\n    public abstract void GetChildren(Span\u003cExpr\u003e span);\n    public abstract Expr SetChildren(ReadOnlySpan\u003cExpr\u003e newChildren);\n}\n// literal numbers are leaf nodes; they have no children\nclass Lit : Expr\n{\n    public int Value { get; }\n    \n    public Lit(int value)\n    {\n        Value = value;\n    }\n\n    public override int CountChildren() =\u003e 0;\n    public override void GetChildren(Span\u003cExpr\u003e span)\n    {\n    }\n    public override Expr SetChildren(ReadOnlySpan\u003cExpr\u003e newChildren)\n        =\u003e this;\n}\n// variables also have no children\nclass Var : Expr\n{\n    public string Name { get; }\n    \n    public Var(string name)\n    {\n        Name = name;\n    }\n\n    public override int CountChildren() =\u003e 0;\n    public override void GetChildren(Span\u003cExpr\u003e span)\n    {\n    }\n    public override Expr SetChildren(ReadOnlySpan\u003cExpr\u003e newChildren)\n        =\u003e this;\n}\nclass Neg : Expr\n{\n    public Expr Operand { get; }\n    \n    public Neg(Expr operand)\n    {\n        Operand = operand;\n    }\n\n    public 
override int CountChildren() =\u003e 1;\n    public override void GetChildren(Span\u003cExpr\u003e span)\n    {\n        span[0] = Operand;\n    }\n    public override Expr SetChildren(ReadOnlySpan\u003cExpr\u003e newChildren)\n        =\u003e new Neg(newChildren[0]);\n}\nclass Add : Expr\n{\n    public Expr Left { get; }\n    public Expr Right { get; }\n    \n    public Add(Expr left, Expr right)\n    {\n        Left = left;\n        Right = right;\n    }\n\n    public override int CountChildren() =\u003e 2;\n    public override void GetChildren(Span\u003cExpr\u003e span)\n    {\n        span[0] = Left;\n        span[1] = Right;\n    }\n    public override Expr SetChildren(ReadOnlySpan\u003cExpr\u003e newChildren)\n        =\u003e new Add(newChildren[0], newChildren[1]);\n}\n```\n\n#### If you can't implement `IRewritable`\n\nThere's also an `IRewriter\u003cT\u003e` interface, which is useful if you can't change the tree type to implement `IRewritable`. Sawmill comes bundled with `IRewriter` implementations (and extension methods) for some tree-shaped objects in the BCL, namely `Expression`, `XElement`, and `XmlNode`. In the box you'll also find `RewriterBuilder`, which is a domain-specific language for building `IRewriter` implementations, and an experimental reflection-based `AutoRewriter`.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbenjamin-hodgson%2Fsawmill","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbenjamin-hodgson%2Fsawmill","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbenjamin-hodgson%2Fsawmill/lists"}