{"id":16730544,"url":"https://github.com/c42f/julialowering.jl","last_synced_at":"2025-03-15T18:14:25.792Z","repository":{"id":231187123,"uuid":"777483328","full_name":"c42f/JuliaLowering.jl","owner":"c42f","description":"Julia code lowering with precise provenance","archived":false,"fork":false,"pushed_at":"2024-05-22T10:08:08.000Z","size":225,"stargazers_count":41,"open_issues_count":1,"forks_count":1,"subscribers_count":5,"default_branch":"main","last_synced_at":"2024-05-22T11:28:22.950Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Julia","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/c42f.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-03-25T23:46:21.000Z","updated_at":"2024-05-28T10:34:12.316Z","dependencies_parsed_at":null,"dependency_job_id":"de272741-c16b-4cd3-86d7-a8f34fb21a0f","html_url":"https://github.com/c42f/JuliaLowering.jl","commit_stats":null,"previous_names":["c42f/julialowering.jl"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/c42f%2FJuliaLowering.jl","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/c42f%2FJuliaLowering.jl/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/c42f%2FJuliaLowering.jl/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/c42f%2FJuliaLowering.jl/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/c42f","download_url":"https://codeload.github.com/c42f/J
uliaLowering.jl/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":243769986,"owners_count":20345217,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-10-12T23:33:53.026Z","updated_at":"2025-03-15T18:14:25.785Z","avatar_url":"https://github.com/c42f.png","language":"Julia","funding_links":[],"categories":[],"sub_categories":[],"readme":"# JuliaLowering\n\n[![Build Status](https://github.com/c42f/JuliaLowering.jl/actions/workflows/CI.yml/badge.svg?branch=main)](https://github.com/c42f/JuliaLowering.jl/actions/workflows/CI.yml?query=branch%3Amain)\n\nJuliaLowering.jl is an experimental port of Julia's code lowering compiler\npasses, written in Julia itself. \"Code lowering\" is the set of compiler passes\nwhich *symbolically* transform and simplify Julia's syntax prior to type\ninference.\n\n## Goals\n\nThis work is intended to\n* Bring precise code provenance to Julia's lowered form (and eventually\n  downstream in type inference, stack traces, etc). 
This has many benefits\n    - Talk to users precisely about their code via character-precise error and\n      diagnostic messages from lowering \n    - Greatly simplify the implementation of critical tools like Revise.jl\n      which rely on analyzing how the user's source maps to the compiler's data\n      structures\n    - Allow tools like JuliaInterpreter to use type-inferred and optimized\n      code, with the potential for huge speed improvements.\n* Bring improvements for macro authors\n    - Prototype \"automatic hygiene\" (no more need for `esc()`!)\n    - Precise author-defined error reporting from macros\n    - Sketch better interfaces for syntax trees (hopefully!)\n\n## Trying it out\n\nNote this is a work in progress; many types of syntax are not yet handled.\n\n1. You need a 1.12-DEV build of Julia: At least 1.12.0-DEV.512. Commit `263928f9ad4` is currently known to work. Note that JuliaLowering relies on Julia internals and may be broken on the latest Julia dev version from time to time. (In fact it is currently broken on the latest `1.12-DEV`.)\n2. Check out the main branch of [JuliaSyntax](https://github.com/JuliaLang/JuliaSyntax.jl)\n3. Get the latest version of [JuliaSyntaxFormatter](https://github.com/c42f/JuliaSyntaxFormatter.jl)\n4. Run the demo `include(\"test/demo.jl\")`\n\n# Design notes\n\n## Syntax trees\n\nWe want something better than `JuliaSyntax.SyntaxNode`! `SyntaxTree` and\n`SyntaxGraph` provide this. Some future version of these should end up in\n`JuliaSyntax`.\n\nWe want to allow arbitrary attributes to be attached to tree nodes by analysis\npasses. 
This separates the analysis pass implementation from the data\nstructure, allowing passes which don't know about each other to act on a shared\ndata structure.\n\nDesign and implementation inspiration comes in several analogies:\n\nAnalogy 1: the ECS (Entity-Component-System) pattern for computer game design.\nThis pattern is highly successful because it separates game logic (systems)\nfrom game objects (entities) by providing flexible storage\n* Compiler passes are \"systems\"\n* AST tree nodes are \"entities\"\n* Node attributes are \"components\"\n\nAnalogy 2: The AoS to SoA transformation. But here we've got a kind of\ntree-of-structs-with-optional-attributes to struct-of-Dicts transformation.\nThe data alignment / packing efficiency and concrete type safe storage benefits\nare similar.\n\nAnalogy 3: Graph algorithms which represent graphs as a compact array of node\nids and edges with integer indices, rather than using a linked data structure.\n\n### References\n\nSander Mertens, the author of the Flecs ECS has a blog post series discussing\nECS data structures and the many things that may be done with them. We may want\nto use some of these tricks to make `SyntaxTree` faster, eventually. See, for\nexample,\n[Building Games in ECS with Entity Relationships](https://ajmmertens.medium.com/building-games-in-ecs-with-entity-relationships-657275ba2c6c)\n\n### Structural assertions / checking validity of syntax trees\n\nSyntax trees in Julia `Expr` form are very close to lisp lists: a symbol at the\n`head` of the list which specifies the syntactic form, and a sequence of\nchildren in the syntax tree. This is a representation which `JuliaSyntax` and\n`JuliaLowering` follow but it does come with certain disadvantages. 
One of the\nmost problematic is that the number of children affects the validity (and\nsometimes semantics) of an AST node, as much as the `head` symbol does.\n\nIn `JuliaSyntax` we've greatly reduced the overloading of `head` in order to\nsimplify the interpretation of child structures in the tree. For example,\nbroadcast calls like `f.(x,y)` use the `K\"dotcall\"` kind rather than being a\nnode with `head == Symbol(\".\")` and a tuple as children.\n\nHowever, there are still many ways for lowering to encounter invalid expressions\nof type `SyntaxTree` and these must be checked. In JuliaLowering we have several\nlevels of effort corresponding to the types of error conditions we desire to\ncheck and report:\n\n* For invalid syntax which is accepted by the `JuliaSyntax`\n  parser but is invalid in lowering we use manual `if` blocks followed by\n  throwing a `LoweringError`. This is more programming effort but allows for\n  the highest quality error messages for the typical end user.\n* For invalid syntax which can only be produced by macros (ie, not by the\n  parser) we mostly use the `@chk` macro. 
This is a quick tool for validating\n  input but gives lesser quality error messages.\n* For JuliaLowering's internal invariants we just use `@assert` - these should\n  never be hit and can be compiled out in principle.\n\n## Provenance tracking\n\nExpression provenance is tracked through lowering by attaching provenance\ninformation in the `source` attribute to every expression as it is generated.\nFor example when parsing a source file we have\n\n```julia\njulia\u003e ex = parsestmt(SyntaxTree, \"a + b\", filename=\"foo.jl\")\nSyntaxTree with attributes kind,value,name_val,syntax_flags,source\n[call-i]                                │ \n  a                                     │ \n  +                                     │ \n  b                                     │ \n\njulia\u003e ex[3].source\na + b\n#   ╙ ── these are the bytes you're looking for 😊\n```\n\nThe `provenance` function should be used to look up the `source` attribute and\nthe `showprov` function used to inspect the content (this is preferred because\nthe encoding of `source` is an implementation detail). For example:\n\n```julia\njulia\u003e showprov(ex[3])\na + b\n#   ╙ ── in source\n# @ foo.jl:1\n```\n\nDuring macro expansion and lowering provenance gets more complicated because an\nexpression can arise from multiple sources. For example, we want to keep track\nof the entire stack of macro expansions an expression was generated by, while\nalso recording where it occurred in the original source file.\n\nFor this, we use a tree data structure. 
Let's look at the following pair of\nmacros\n\n```julia\njulia\u003e JuliaLowering.include_string(Main, raw\"\"\"\n       module M\n           macro inner()\n               :(2)\n           end\n\n           macro outer()\n               :((1, @inner))\n           end\n       end\n       \"\"\", \"some_macros.jl\")\n```\n\nThe tree which arises from macro expanding this is pretty simple:\n\n```julia\njulia\u003e expanded = JuliaLowering.macroexpand(Main, parsestmt(SyntaxTree, \"M.@outer()\"))\nSyntaxTree with attributes scope_layer,kind,value,var_id,name_val,syntax_flags,source\n[tuple-p]                               │ \n  1                                     │ \n  2                                     │ \n```\n\nbut the provenance information recorded for the second element `2` of this\ntuple is not trivial; it includes the macro call expressions for `@inner` and\n`@outer`. We can show this in tree form:\n\n```julia\njulia\u003e showprov(expanded[2], tree=true)\n2\n├─ 2\n│  └─ @ some_macros.jl:3\n└─ (macrocall @inner)\n   ├─ (macrocall @inner)\n   │  └─ @ some_macros.jl:7\n   └─ (macrocall-p (. M @outer))\n      └─ @ foo.jl:1\n```\n\nor as a more human readable flattened list with highlighting of source ranges:\n\n```julia\nmodule M\n    macro inner()\n        :(2)\n#         ╙ ── in source\n    end\n\n# @ some_macros.jl:3\n\n\n    macro outer()\n        :((1, @inner))\n#             └────┘ ── in macro expansion\n    end\nend\n# @ some_macros.jl:7\n\nM.@outer()\n└────────┘ ── in macro expansion\n# @ foo.jl:1\n```\n\n## Problems with Hygiene in Julia's existing macro system\n\nTo write correct hygienic macros in Julia (as of 2024), macro authors must use\n`esc()` on any syntax passed to the macro so that passed identifiers escape\nto the macro caller scope. However\n\n* This is not automatic and the correct use of `esc()` is one of the things\n  that new macro authors find most confusing. 
(My impression, based on various\n  people complaining about how confusing `esc()` is.)\n* `esc()` wraps expressions in `Expr(:escape)`, but this doesn't work well when\n  macros pass such escaped syntax to an inner macro call. As discussed in\n  [Julia issue #37691](https://github.com/JuliaLang/julia/issues/37691), macros\n  in Julia's existing system are not composable by default. Writing\n  composable macros in the existing system would require preserving the escape\n  nesting depth when recursing into any expressions nested in macro arguments.\n  Almost no macro author knows how to do this and is prepared to pay for the\n  complexity of getting it right.\n\nThe requirement to use `esc()` stems from Julia's pervasive use of the simple\n`Expr` data structure which represents an unadorned AST in which names are plain\nsymbols. For example, a macro call `@foo x` gets passed the symbol `:x`\nwhich is just a name without any information attached to indicate that it came\nfrom the scope where `@foo` was called.\n\n### Hygiene References\n\n* [Toward Fearless Macros](https://lambdaland.org/posts/2023-10-17_fearless_macros) -\n  a blog post by Ashton Wiersdorf\n* [Towards the Essence of Hygiene](https://michaeldadams.org/papers/hygiene/hygiene-2015-popl-authors-copy.pdf) - a paper by Michael Adams\n* [Bindings as sets of scopes](https://www-old.cs.utah.edu/plt/scope-sets/) - a description of Racket's scope set mechanism by Matthew Flatt\n\n# Overview of lowering passes\n\nJuliaLowering uses six symbolic transformation passes:\n\n1. Macro expansion - expanding user-defined syntactic constructs by running the\n   user's macros. This pass also includes a small amount of other symbolic\n   simplification.\n2. Syntax desugaring - simplifying Julia's rich surface syntax down to a small\n   number of syntactic forms.\n3. 
Scope analysis - analyzing identifier names used in the code to discover\n   local variables, closure captures, and associate global variables to the\n   appropriate module. Transform all names (kind `K\"Identifier\"`) into binding\n   IDs (kind `K\"BindingId\"`) which can be looked up in a table of bindings.\n4. Closure conversion - convert closures to types and deal with captured\n   variables efficiently where possible.\n5. Flattening to untyped IR - convert code in hierarchical tree form to a\n   flat array of statements; convert control flow into gotos.\n6. Convert untyped IR to `CodeInfo` form for integration with the Julia runtime.\n\n## Pass 1: Macro expansion\n\nThis pass expands macros and quoted syntax, and does some very light conversion\nof a few syntax `Kind`s in preparation for syntax desugaring.\n\n### Hygiene in JuliaLowering\n\nIn JuliaLowering we make hygiene automatic and remove `esc()` by combining names\nwith scope information. In the language of the paper [*Towards the Essence of\nHygiene*](https://michaeldadams.org/papers/hygiene/hygiene-2015-popl-authors-copy.pdf)\nby Michael Adams, this combination is called a \"syntax object\". In\nJuliaLowering our representation is the tuple `(name,scope_layer)`, also called\n`VarId` in the scope resolution pass.\n\nJuliaLowering's macro expander attaches a unique *scope layer* to each\nidentifier in a piece of syntax. A \"scope layer\" is an integer identifier\ncombined with the module in which the syntax was created.\n\nWhen expanding macros,\n\n* Any identifiers passed to the macro are tagged with the scope layer they were\n  defined within.\n* A new unique scope layer is generated for the macro invocation, and any names\n  in the syntax produced by the macro are tagged with this layer.\n\nSubsequently, the `(name,scope_layer)` pairs are used when resolving bindings.\nThis ensures that, by default, we satisfy the basic rules for hygienic macros\ndiscussed in Adams' paper:\n\n1. 
A macro can't insert a binding that can capture references other than those\n   inserted by the macro.\n2. A macro can't insert a reference that can be captured by bindings other than\n   those inserted by the macro.\n\nTODO: Write more here...\n\n## Pass 2: Syntax desugaring\n\nThis pass recursively converts many special surface syntax forms to a smaller\nset of syntax `Kind`s, following the AST's hierarchical tree structure. Some\nsuch as `K\"scope_block\"` are internal to lowering and removed during later\npasses. See `kinds.jl` for a list of these internal forms.\n\nThis pass is implemented in `desugaring.jl`. It's quite large because Julia has\nmany special syntax features.\n\n### Desugaring of function definitions\n\nDesugaring of function definitions is particularly complex because of the cross\nproduct of features which need to work together consistently:\n\n* Positional arguments (with and without defaults, with and without types)\n* Keyword arguments (with and without defaults, with and without types)\n* Type parameters with `where` syntax\n* Argument slurping syntax with `...`\n* Fancy arguments (argument destructuring)\n\nThe combination of positional arguments with defaults and keyword arguments is\nparticularly complex. Here's an example.  
Suppose we're given the function\ndefinition\n\n```julia\nfunction f(a::A=a_default, b::B=b_default; x::X=x_default,y::Y=y_default)\n    body\nend\n```\n\nThis generates\n* One method of `f` for each number of positional arguments which can be\n  called when `f` is called without keyword args\n* One overload of `Core.kwcall(kws, ::typeof(f), ...)` for each number of\n  positional arguments (when called with a nonzero number of keyword args; the\n  tuple `kws` being constructed by the caller)\n* One internal method for the body of the function (we can call it `f_kw`\n  though it will be named something like `#f#18`)\n\nFirst, partially expanding the kw definitions this roughly looks like\n\n```julia\nfunction f_kw(x::X, y::Y, f_self::typeof(f), a::A, b::B)\n    body\nend\n\nfunction f(a::A=a_default, b::B=b_default)\n    f_kw(x_default, y_default, var\"#self#\", a, b)\nend\n\nfunction Core.kwcall(kws::NamedTuple, self::typeof(f), a::A=a_default, b::B=b_default)\n    if Core.isdefined(kws, :x)\n        x_tmp = Core.getfield(kws, :x)\n        if x_tmp isa X\n            nothing\n        else\n            Core.throw($(Expr(:new, Core.TypeError, Symbol(\"keyword argument\"), :x, X, x_tmp)))\n        end\n        x = x_tmp\n    else\n        x = x_default\n    end\n    if Core.isdefined(kws, :y)\n        y_tmp = Core.getfield(kws, :y)\n        if y_tmp isa Y\n            nothing\n        else\n            Core.throw($(Expr(:new, Core.TypeError, Symbol(\"keyword argument\"), :y, Y, y_tmp)))\n        end\n        y = y_tmp\n    else\n        y = y_default\n    end\n    if Base.isempty(Base.diff_names(Base.keys(kws), (:x, :y)))\n        nothing\n    else\n        # Else unsupported kws\n        Base.kwerr(kws, self, a, b)\n    end\n    f_kw(x, y, self, a, b)\nend\n```\n\nWe can then pass this to function expansion for default arguments which expands\neach of the above into three more methods. 
For example, for the first\ndefinition we conceptually expand `f(a::A=a_default, b::B=b_default)` into the\nmethods\n\n```julia\n# The body\nfunction f(a::A, b::B)\n    f_kw(x_default, y_default, var\"#self#\", a, b)\nend\n\n# And two methods for the different numbers of default args\nfunction f(a::A)\n    var\"#self#\"(a, b_default)\nend\n\nfunction f()\n    var\"#self#\"(a_default, b_default)\nend\n```\n\nIn total, this expands a single \"function definition\" into seven methods.\n\nNote that the above is only a sketch! There's more fiddly details when `where`\nsyntax comes in\n\n### Desugaring of generated functions\n\nA brief description of how this works. Let's consider the generated function\n\n```julia\nfunction gen(x::NTuple{N}, y) where {N,T}\n    shared = :shared\n    # Unnecessary use of @generated, but it shows what's going on.\n    if @generated\n        quote\n            maybe_gen = ($x, $N)\n        end\n    else\n        maybe_gen = (typeof(x), N)\n    end\n    (shared, maybe_gen)\nend\n```\n\nThis is desugared into the following two function definitions. 
First, a code\ngenerator which will generate code for the body of the function, given the\nstatic parameters `N`, `T` and the positional arguments `x`, `y`.\n(`var\"#self#\"::Type{typeof(gen)}` is also provided by the Julia runtime to\ncomplete the full signature of `gen`, though the user won't normally use this.)\n\n```julia\nfunction var\"#gen@generator#0\"(__context__::JuliaSyntax.MacroContext, N, T, var\"#self#\", x, y)\n    gen_stuff = quote\n        maybe_gen = ($x, $N)\n    end\n    quote\n        shared = :shared\n        $gen_stuff\n        (shared, maybe_gen)\n    end\nend\n```\n\nSecond, the non-generated version, using the `if @generated` else branches, and\ncontaining mostly normal code.\n\n```julia\nfunction gen(x::NTuple{N}, y) where {N,T}\n    $(Expr(:meta, :generated,\n        Expr(:call, JuliaLowering.GeneratedFunctionStub,\n             :var\"#gen@generator#0\", sourceref_of_gen,\n             :(Core.svec(:var\"#self#\", :x, :y)),\n             :(Core.svec(:N, :T)))))\n    shared = :shared\n    maybe_gen = (typeof(x), N)\n    (shared, maybe_gen)\nend\n```\n\nThe one extra thing added here is the `Expr(:meta, :generated)` which is an\nexpression creating a callable wrapper for the user's generator, to be\nevaluated at top level. This wrapper will then be invoked by the runtime\nwhenever the user calls `gen` with a new signature and it's expected that a\n`CodeInfo` be returned from it. `JuliaLowering.GeneratedFunctionStub` differs\nfrom `Core.GeneratedFunctionStub` in that it contains extra provenance\ninformation (the `sourceref_of_gen`) and expects a `SyntaxTree` to be returned\nby the user's generator code.\n\n## Pass 3: Scope analysis / binding resolution\n\nThis pass replaces variables with bindings of kind `K\"BindingId\"`,\ndisambiguating variables when the same name is used in different scopes. 
It\nalso fills in the list of non-global bindings within each lambda and metadata\nabout such bindings as will be used later during closure conversion.\n\nScopes are documented in the Julia documentation on\n[Scope of Variables](https://docs.julialang.org/en/v1/manual/variables-and-scoping/)\n\nDuring scope resolution, we maintain a stack of `ScopeInfo` data structures.\n\nWhen a new `lambda` or `scope_block` is discovered, we create a new `ScopeInfo` by\n1. Finding all identifiers bound or used within a scope. New *bindings* may be\n   introduced by one of the `local`, `global` keywords, implicitly by\n   assignment, as function arguments to a `lambda`, or as type arguments in a\n   method (\"static parameters\"). Identifiers are *used* when they are\n   referenced.\n2. Inferring which bindings are newly introduced local or global variables (and\n   thus require a distinct identity from names already in the stack)\n3. Assigning a `BindingId` (unique integer) to each new binding\n\nWe then push this `ScopeInfo` onto the stack and traverse the expressions\nwithin the scope translating each `K\"Identifier\"` into the associated\n`K\"BindingId\"`. While we're doing this we also resolve some special forms like\n`islocal` by making use of the scope stack.\n\nThe detailed rules for whether assignment introduces a new variable depend on\nthe `scope_block`'s `scope_type` attribute when we are processing top-level\ncode.\n* `scope_type == :hard` (as for bindings inside a `let` block) means an\n  assignment always introduces a new binding\n* `scope_type == :neutral` - inherit soft or hard scope from the parent scope.\n* `scope_type == :soft` - assignments are to globals if the variable\n  exists in global module scope. 
Soft scope doesn't have surface syntax and is\n  introduced for top-level code by REPL-like environments.\n\n## Pass 4: Closure conversion / lower bindings\n\nThe main goal of this pass is closure conversion, but it's also used for\nlowering typed bindings and global assignments. Roughly, this is passes 3 and 4\nin the original `julia-syntax.scm`. In JuliaLowering it also comes in two steps:\n\nThe first step (part of `scope_resolution.jl`) is to compute metadata related\nto bindings, both per-binding and per-binding-per-closure-scope.\n\nProperties which are computed per-binding which can help with symbolic\noptimizations include:\n* Type is declared (`x::T` syntax in a statement): type conversions must be\n  inserted at every assignment of `x`.\n* Never undefined: value is always assigned to the binding before being read\n  hence this binding doesn't require the use of `Core.NewvarNode`.\n* Single assignment: (TODO how is this defined, what is it for and does it go\n  here or below?)\n\nProperties of non-globals which are computed per-binding-per-closure include:\n* Read: the value of the binding is used.\n* Write: the binding is assigned to.\n* Captured: Bindings defined outside the closure which are either Read or Write\n  within the closure are \"captured\" and need to be one of the closure's fields.\n* Called: the binding is called as a function, ie, `x()`. (TODO - what is this\n  for?)\n\nThe second step uses this metadata to\n* Convert closures into `struct` types\n* Lower bindings captured by closures into references to boxes as necessary\n* Deal with typed bindings (`K\"decl\"`) and their assignments\n* Lower const and non-const global assignments\n* TODO: probably more here.\n\n\n### Q\u0026A\n\n#### When does `function` introduce a closure?\n\nClosures are just functions where the name of the function is *local* in scope.\nHow does the function name become a local? 
The `function` keyword acts like an\nassignment to the function name for the purposes of scope resolution. Thus\n`function f() body end` is rather like `f = ()-\u003ebody` and may result in the\nsymbol `f` being either `local` or `global`. Like other assignments, `f` may be\ndeclared global or local explicitly, but if not `f` is subject to the usual\nrules for assignments inside scopes. For example, inside a `let` scope\n`function f() ...` would result in the symbol `f` being local.\n\nExamples:\n\n```julia\nbegin\n    # f is global because `begin ... end` does not introduce a scope\n    function f()\n        body\n    end\n\n    # g is a closure because `g` is explicitly declared local\n    local g\n    function g()\n        body\n    end\nend\n\nlet\n    # f is local so this is a closure because `let ... end` introduces a scope\n    function f()\n        body\n    end\n\n    # g is not a closure because `g` is declared global\n    global g\n    function g()\n        body\n    end\nend\n```\n\n#### How do captures work with non-closures?\n\nYes it's true, you can capture local variables into global methods. For example:\n\n```julia\nbegin\n    local x = 1\n    function f(y)\n        x + y\n    end\n    x = 2\nend\n```\n\nThe way this works is to put `x` in a `Box` and interpolate it into the AST of\n`f` (the `Box` can be eliminated in some cases, but not here). Essentially this\nlowers to code which is almost-equivalent to the following:\n\n```julia\nbegin\n    local x = Core.Box(1)\n    @eval function f(y)\n        # The `Box` itself is interpolated, so `f` sees later writes to it\n        $(x).contents + y\n    end\n    x.contents = 2\nend\n```\n\n#### How do captures work with closures with multiple methods?\n\nSometimes you might want a closure with multiple methods, but those methods\nmight capture different local variables. 
For example,\n\n```julia\nlet\n    x = 1\n    y = 1.5\n    function f(xx::Int)\n        xx + x\n    end\n    function f(yy::Float64)\n        yy + y\n    end\n\n    f(42)\nend\n```\n\nIn this case, the closure type must capture both `x` and `y` and the generated\ncode looks rather like this:\n\n```julia\nstruct TheClosureType\n    x\n    y\nend\n\nlet\n    x = 1\n    y = 1.5\n    f = TheClosureType(x,y)\n    function (self::TheClosureType)(xx::Int)\n        xx + self.x\n    end\n    function (self::TheClosureType)(yy::Float64)\n        yy + self.y\n    end\n\n    f(42)\nend\n```\n\n#### When are `method` defs lifted to top level?\n\nClosure method definitions must be lifted to top level whenever the definitions\nappear inside a function. This is to allow efficient compilation and avoid world\nage issues.\n\nConversely, when method defs appear in top level code, they are executed\ninline.\n\n## Pass 5: Convert to untyped IR\n\nThis pass is implemented in `linear_ir.jl`.\n\n### Untyped IR (JuliaLowering form)\n\nJuliaLowering's untyped IR is very close to the runtime's `CodeInfo` form (see\nbelow), but is more concretely typed as `JuliaLowering.SyntaxTree`.\n\nMetadata is generally represented differently:\n* The statements retain full code provenance information as `SyntaxTree`\n  objects. See `kinds.jl` for a list of which `Kind`s occur in the output IR\n  but not in surface syntax.\n* The list of slots is `Vector{Slot}`, including `@nospecialize` metadata\n\n### Lowering of exception handlers\n\nException handling involves a careful interplay between lowering and the Julia\nruntime. 
The forms `enter`, `leave` and `pop_exception` dynamically modify the\nexception-related state on the `Task`; lowering and the runtime work together\nto maintain correct invariants for this state.\n\nLowering of exception handling must ensure that\n\n* Each `enter` is matched with a `leave` on every possible non-exceptional\n  program path (including implicit returns generated in tail position).\n* Each `catch` block which is entered and handles the exception - by exiting\n  via a non-exceptional program path - is matched with a `pop_exception`\n* Each `finally` block runs, regardless of the way it's entered - either by\n  normal program flow, an exception, early `return` or a jump out of an inner\n  context via `break`/`continue`/`goto` etc.\n\nThe following special forms are emitted into the IR:\n\n* `(= tok (enter catch_label dynscope))` -\n  push exception handler with catch block at `catch_label` and dynamic\n  scope `dynscope`, yielding a token which is used by `leave` and\n  `pop_exception`. `dynscope` is only used in the special `tryfinally` form\n  without associated source level syntax (see the `@with` macro)\n* `(leave tok)` -\n    pop exception handler back to the state of the `tok` from the associated\n    `enter`. Multiple tokens can be supplied to pop multiple handlers using\n    `(leave tok1 tok2 ...)`.\n* `(pop_exception tok)` - pop exception stack back to state of associated enter\n\nWhen an `enter` is encountered, the runtime pushes a new handler onto the\n`Task`'s exception handler stack which will jump to `catch_label` when an\nexception occurs.\n\nThere are two ways that the exception-related task state can be restored\n\n1. By encountering a `leave` which will restore the handler state with `tok`.\n2. By throwing an exception. In this case the runtime will pop one handler\n   automatically and jump to the catch label with the new exception pushed\n   onto the exception stack. 
On this path the exception stack state must be\n   restored back to the associated `enter` by encountering `pop_exception`.\n\nNote that the handler and exception stack represent two distinct types of\nexception-related state restoration which need to happen. Note also that the\n\"handler state restoration\" actually includes several pieces of runtime state\nincluding GC flags - see `jl_eh_restore_state` in the runtime for that.\n\n#### Lowering finally code paths\n\nWhen lowering `finally` blocks we want to emit the user's finally code once but\nmultiple code paths may traverse the finally block. For example, consider the\ncode\n\n```julia\nfunction foo(x)\n    while true\n        try\n            if x == 1\n                return f(x)\n            elseif x == 2\n                g(x)\n                continue\n            else\n                break\n            end\n        finally\n            h()\n        end\n    end\nend\n```\n\nIn this situation there's four distinct code paths through the finally block:\n1. `return f(x)` needs to call `val = f(x)`, leave the `try` block, run `h()` then\n   return `val`.\n2. `continue` needs to call `h()` then jump to the start of the while loop\n3. `break` needs to call `h()` then jump to the exit of the while loop\n4. If an exception occurs in `f(x)` or `g(x)`, we need to call `h()` before\n   falling back into the while loop.\n\nTo deal with these we create a `finally_tag` variable to dynamically track\nwhich action to take after the finally block exits. Before jumping to the block\nwe set this variable to a unique integer tag identifying the incoming code\npath. At the exit of the user's code (`h()` in this case) we perform the jump\nappropriate to the `break`, `continue` or `return` as necessary based on the tag.\n\n(TODO - these are the only four cases which can occur, but, for example,\nmultiple `return`s create multiple tags rather than assigning to a single\nvariable. 
Collapsing these into a single case might be worth considering? But\nalso might be worse for type inference in some cases?)\n\n## Pass 6: Convert IR to `CodeInfo` representation\n\nThis pass converts JuliaLowering's internal representation of untyped IR into\na form the Julia runtime understands. This is a necessary decoupling which\nseparates the development of JuliaLowering.jl from the evolution of the Julia\nruntime itself.\n\n### Untyped IR (`CodeInfo` form)\n\nThe final lowered IR is expressed as `CodeInfo` objects which are a sequence of\n`code` statements containing\n* Literals\n* Restricted forms of `Expr` (with semantics different from surface syntax,\n  even for the same `head`! For example the arguments to `Expr(:call)` in IR\n  must be \"simple\" and aren't evaluated in order)\n* `Core.SlotNumber`\n* Other special forms from `Core` like `Core.ReturnNode`, `Core.EnterNode`, etc.\n* `Core.SSAValue`, indexing any value generated from a statement in the `code`\n  array.\n* Etc (todo)\n\nThe IR obeys certain invariants which are checked by the downstream code in\nbase/compiler/validation.jl.\n\nSee also https://docs.julialang.org/en/v1/devdocs/ast/#Lowered-form\n\nCodeInfo layout (as of early 1.12-DEV):\n\n```julia\nmutable struct CodeInfo\n    code::Vector{Any}             # IR statements\n    codelocs::Vector{Int32}       # `length(code)` Vector of indices into `linetable`\n    ssavaluetypes::Any            # `length(code)` or Vector of inferred types after opt\n    ssaflags::Vector{UInt32}      # flag for every statement in `code`\n                                  #   0 if meta statement\n                                  #   inbounds_flag - 1 bit (LSB)\n                                  #   inline_flag   - 1 bit\n                                  #   noinline_flag - 1 bit\n                                  #   ... 
other 8 flags which are defined in compiler/optimize.jl\n                                  #   effects_flags - 9 bits\n    method_for_inference_limit_heuristics::Any\n    linetable::Any\n    slotnames::Vector{Symbol}     # names of parameters and local vars used in the code\n    slotflags::Vector{UInt8}      # vinfo flags from flisp\n    slottypes::Any                # nothing (used by typeinf)\n    rettype::Any                  # Any (used by typeinf)\n    parent::Any                   # nothing (used by typeinf)\n    edges::Any\n    min_world::UInt64\n    max_world::UInt64\n    inferred::Bool\n    propagate_inbounds::Bool\n    has_fcall::Bool\n    nospecializeinfer::Bool\n    inlining::UInt8\n    constprop::UInt8\n    purity::UInt16\n    inlining_cost::UInt16\nend\n```\n\n## Notes on toplevel-only forms and eval-related functions\n\nIn the current Julia runtime,\n\n`Base.eval()`\n- Uses `jl_toplevel_eval_in` which calls `jl_toplevel_eval_flex`\n\n`jl_toplevel_eval_flex(mod, ex)`\n- Lowers if necessary\n- Evaluates certain blessed top level forms\n  * `:.`\n  * `:module`\n  * `:using`\n  * `:import`\n  * `:public`\n  * `:export`\n  * `:global`\n  * `:const`\n  * `:toplevel`\n  * `:error`\n  * `:incomplete`\n  * Identifiers and literals\n- Otherwise expects `Expr(:thunk)`\n  * Use codegen \"where necessary/profitable\" (eg ccall, has_loops etc)\n  * Otherwise interpret via `jl_interpret_toplevel_thunk`\n\nShould we lower the above blessed top level forms to Julia runtime calls?\nPros:\n- Semantically sound. 
Lowering should do syntax checking in things like\n  `Expr(:using)` rather than doing this in the runtime support functions.\n- Precise lowering error messages\n- Replaces more Expr usage\n- Replaces a whole pile of C code with significantly less Julia code\n- Lowering output becomes more consistently imperative\nCons:\n- Lots more code to write\n- May need to invent intermediate data structures to replace `Expr`\n- Bootstrap?\n- Some forms require creating toplevel thunks\n\nIn general, we'd be replacing current *declarative* lowering targets like\n`Expr(:using)` with an *imperative* call to a `Core` API instead. The call and\nthe setup of its arguments would need to go in a thunk. We've currently got an\nodd mixture of imperative and declarative lowered code.\n\n## Bugs in Julia's lowering\n\nA subset of the bugs which exist in the upstream flisp implementation, but which are fixed here:\n* `f()[begin]` performs the side effects of `f()` twice.\n* `a[(begin=1; a=2)]` gives a weird error\n* `function A.ccall() ; end` allows `ccall` as a name but it's not allowed without the `A.`\n* `a .\u003c b .\u003c c` expands to `(a .\u003c b) .\u0026 (b .\u003c c)` where the scope of the `\u0026` is\n  the expansion module but should be `top.\u0026` to avoid scope-dependence\n  (especially in the presence of macros)\n\n## Notes on Racket's hygiene\n\nPeople look at [Racket](https://racket-lang.org/) as an example of a very\ncomplete system of hygienic macros. We should learn from them, while keeping in\nmind that Racket's macro system is inherently more complicated. Racket's\ncurrent approach to hygiene is described in an [accessible talk](https://www.youtube.com/watch?v=Or_yKiI3Ha4)\nand in more depth in [a paper](https://www-old.cs.utah.edu/plt/publications/popl16-f.pdf).\n\nSome differences between Racket's macro expander and Julia's:\n\n* Racket allows *local* definitions of macros. 
Macro code can be embedded in an\n  inner lexical scope and capture locals from that scope, but still needs to be\n  executed at compile time. Julia supports macros at top level scope only.\n* Racket goes to great lengths to execute the minimal package code necessary to\n  expand macros; the \"pass system\". Julia just executes all top level\n  statements in order when precompiling a package.\n* As a lisp, Racket's surface syntax is dramatically simpler and more uniform\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fc42f%2Fjulialowering.jl","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fc42f%2Fjulialowering.jl","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fc42f%2Fjulialowering.jl/lists"}