{"id":19846046,"url":"https://github.com/alexfru/regal86","last_synced_at":"2025-05-01T21:30:55.832Z","repository":{"id":43929464,"uuid":"362003503","full_name":"alexfru/regal86","owner":"alexfru","description":"Register Allocator for 8086","archived":false,"fork":false,"pushed_at":"2023-08-20T23:36:43.000Z","size":57,"stargazers_count":70,"open_issues_count":0,"forks_count":6,"subscribers_count":4,"default_branch":"master","last_synced_at":"2023-08-21T00:23:52.029Z","etag":null,"topics":["8086","assembly","code-generation","compiler","compiler-backend","compiler-design","compiler-optimization","dos","expression-evaluation","register-allocation","x86"],"latest_commit_sha":null,"homepage":"","language":"Assembly","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-2-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/alexfru.png","metadata":{"files":{"readme":"readme.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2021-04-27T06:23:45.000Z","updated_at":"2023-08-20T23:16:35.000Z","dependencies_parsed_at":"2023-02-08T04:01:32.182Z","dependency_job_id":null,"html_url":"https://github.com/alexfru/regal86","commit_stats":null,"previous_names":[],"tags_count":0,"template":null,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/alexfru%2Fregal86","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/alexfru%2Fregal86/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/alexfru%2Fregal86/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/alexfru%2Fregal86/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/alexfru","download_url":"https://codeload.github.com/alexfru/r
egal86/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":224278439,"owners_count":17285080,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["8086","assembly","code-generation","compiler","compiler-backend","compiler-design","compiler-optimization","dos","expression-evaluation","register-allocation","x86"],"created_at":"2024-11-12T13:10:15.038Z","updated_at":"2024-11-12T13:10:19.063Z","avatar_url":"https://github.com/alexfru.png","language":"Assembly","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Local Register Allocator for 8086\n\n## Table of contents\n\n[What is this?](#what-is-this)\n\n[Motivation](#motivation)\n\n[Scope](#scope)\n\n[Basic algorithm](#basic-algorithm)\n\n[8086-specific approach](#8086-specific-approach)\n\n[Resources](#resources)\n\n## What is this?\n\nThis is an implementation of a greedy bottom-up local register allocator for\nthe intel 8086 CPU. It may be used as part of a simple compiler. In particular,\nit can be used in a C compiler whose `int` is 16-bit and `char` is 8-bit.\n\nTo make things interesting and interactive it comes with a code generator\ncapable of generating 8086 assembly code (compilable into DOS .COM programs)\nfrom trees representing expressions with 16-bit integer ALU operations,\nconstants and memory loads and stores. 
A few extra instructions generated by\nthe code check the results of expression evaluation in the output register and\nthe output memory locations, if any.\n\nThus you can actually execute the generated code and tweak things around to see\nhow your changes affect code generation.\n\nSample assembly output from the register allocator/code generator:\n\n    ;         7    (vr4)\n    ;     mul     (vr5)\n    ;         5    (vr3)\n    ; add     (vr6)\n    ;         3    (vr1)\n    ;     mul     (vr2)\n    ;         2    (vr0)\n    ; ----\n    ; Regs needed (approximately): 3\n    ; --------\n        ;                           ; vr0\n        mov  ax, 2                  ; vr0\n        mov  cx, 3                  ; vr1\n        mul  cx                     ; vr2\n        mov  cx, 5                  ; vr3\n        mov  dx, 7                  ; vr4\n        xchg ax, cx                 ; vr5\n        mul  dx                     ; vr5\n        add  cx, ax                 ; vr6\n        ;                           ; vr6\n        cmp  cx, 41                 ; vr6\n        jne  failure                ; vr6\n\nN.B. This register allocator will not allocate registers perfectly in all\ncircumstances. It will occasionally generate unnecessary register moves and\nexchanges. Bear in mind, optimal register allocation is an NP-complete problem.\n\n## Motivation\n\nThe intel 8086 architecture has a number of instructions with fixed input\nand/or output registers and it's not entirely trivial to connect the ins and\nouts of these instructions. There's obviously a long history of implementing\ncompilers for this and similar architectures, but it's a bit odd that it's\nhard to find a conceptually simple algorithm like this. 
The \"Dragon\" book and\nmany other sources either consider architectures without instructions with such\nfixed register operands or solve the problem using graph coloring, which for\ncertain reasons may be impractical.\n\nThe intel 8086 architecture survives to this day in the modern intel Pentium\nCPUs that are compatible descendants of the old 8086 and with it survive some\nof the instructions with fixed in/out register operands. So, the problem\npersists to this day, although there are fewer limitations to deal with when\nusing newer 32-bit and 64-bit x86 instructions and registers today.\n\n8086 registers by special uses in expression evaluation (pardon the mixture of\nC and assembly operators, instructions and syntax and ASCII art):\n\n             +-- can be freely used in most ALU instructions as src/dst\n             |   +-- can be used as address to access memory\n             |   |     +-- has individual 8-bit subregisters\n             |   |     |   +-- can be sign-extended with cbw\n             |   |     |   |      +-- can be shifted\n             |   |     |   |      |     +-- can be shift count\n             v   v     v   v      v     v\n       +-\u003c\u0026|^~  [r] 8bit cbw  \u003c\u003cdst \u003c\u003ccnt  *dst *src  /dvd /dvsr  /quo /rem\n    ax       +         +   +      +        fix+ fix+  fix+        fix+\n    bx       +   +     +          +                +           +\n    cx       +         +             fix+          +           +\n    dx       +         +          +                +                   fix+\n    si       +   +                +                +           +\n    di       +   +                +                +           +\n                                              ^    ^     ^     ^     ^    ^\n                             can be product --+    |     |     |     |    |\n                             can be multiplicand --+     |     |     |    |\n                                       can be dividend --+     |     |    
|\n                                              can be divisor --+     |    |\n                                                   can be quotient --+    |\n                                                       can be remainder --+\n\n\n\n## Scope\n\nOutside of the scope of this work are:\n*   data types other than 8-bit bytes and 16-bit words (e.g. floats, structs)\n*   conditional/ternary operators like in C/C++\n*   function calls and calling conventions\n    (however, a simple implementation is included)\n*   use of immediate and memory operands directly in ALU instructions\n    (we use separate load/store instructions instead)\n\nThese are left as an exercise to the reader.\n\n## Basic algorithm\n\nBefore we get to the 8086 specifics, let's describe the algorithm for an\narchitecture, where there aren't instructions with fixed register operands.\nWe'll be building on top of it.\n\nLet's also consider a case, where our expressions consist of only binary\noperators (unary operators will be a trivial specialization), which are\nrealized as 3-register instructions in the CPU, that is, 2 input registers and\n1 output register, much like in the MIPS architecture (e.g. 
`sub r2, r3, r4`\nwill do `r2 = r3 - r4;`).\n\nSuppose we have an expression tree like this:\n\n                  op\n                  -\n          op              op\n          +               |\n      op      op      op      op\n      ^       *       /       %\n    nm  nm  nm  nm  nm  nm  nm  nm\n    1   2   3   4   5   6   7   8\n\n`op` and `nm` represent operator and numeric nodes in the tree.\n\nOur hardware register allocation will be done recursively from the root towards\nthe leaves:\n\n    AllocHRegs(node):\n      recursively call AllocHRegs() for both input/child nodes\n      Ensure() that both input/child values are in registers\n      Free() both input registers\n      Allocate() one output register\n      generate an instruction using these 3 register operands\n\nFor the numeric leaf node there will only be the last two steps: Allocate() for\nthe output register and the generation of an instruction to load an integer\nconstant into that register.\n\nThe utility functions will be:\n\n    Ensure(node):\n      if node's value has no location yet:\n        Allocate() a register for it and return the register\n      else if node's value is already in a register:\n        return that register\n      else: /* the node's value must have been spilled onto the stack */\n        Allocate() a register\n        generate a pop instruction to pop a value from\n          the stack into this register\n        return this register\n\nand\n\n    Allocate(node):\n      if there's a vacant register among the N registers the allocator manages:\n        mark it as holding the node\n        mark the node as being in this register\n        return the register\n      else:\n        from the N registers managed by the allocator find the register whose\n          value won't be needed the longest, that is, whose parent/using node\n          is the most distant\n        mark the node that that register holds as spilled\n        generate a push instruction to push the node's value 
from the\n          register onto the stack\n        mark the register as holding the argument node passed into Allocate()\n        mark the node as being in this register\n        return the register\n\nand\n\n    Free(hardware_reg):\n      mark the node this register holds as not having a location\n      mark the register as holding no node\n\nUsing this algorithm we will arrive at this code generated for the above tree\n(let's replicate the tree here):\n\n                  op\n                  -\n          op              op\n          +               |\n      op      op      op      op\n      ^       *       /       %\n    nm  nm  nm  nm  nm  nm  nm  nm\n    1   2   3   4   5   6   7   8\n\n    mov  hr0, 1\n    mov  hr1, 2\n    xor  hr0, hr0, hr1\n    mov  hr1, 3\n    mov  hr2, 4\n    mul  hr1, hr1, hr2\n    add  hr0, hr0, hr1\n    mov  hr1, 5\n    mov  hr2, 6\n    idiv hr1, hr1, hr2\n    mov  hr2, 7\n    mov  hr3, 8\n    irem hr2, hr2, hr3\n    or   hr1, hr1, hr2\n    sub  hr0, hr0, hr1\n\nThis code needs 4 registers (hr0 through hr3).\n\n### Spilling\n\nIf we restrict the number of registers that the allocator can manage to 3,\nwe'll get this instead (let's replicate the tree one more time):\n\n                  op\n                  -\n          op              op\n          +               |\n      op      op      op      op\n      ^       *       /       %\n    nm  nm  nm  nm  nm  nm  nm  nm\n    1   2   3   4   5   6   7   8\n\n    mov  hr0, 1\n    mov  hr1, 2\n    xor  hr0, hr0, hr1\n    mov  hr1, 3\n    mov  hr2, 4\n    mul  hr1, hr1, hr2\n    add  hr0, hr0, hr1\n    mov  hr1, 5\n    mov  hr2, 6\n    idiv hr1, hr1, hr2\n    mov  hr2, 7\n    push hr0\n    mov  hr0, 8\n    irem hr0, hr2, hr0\n    or   hr0, hr1, hr0\n    pop  hr1\n    sub  hr0, hr1, hr0\n\nNote the appearance of the push and pop instructions in the above code and the\ndisappearance of register hr3.\n\nThe allocator runs out of registers once it has loaded 7 into hr2 and is about\nto load 
8 into another register. At this point there are 3 partial results\noccupying all 3 registers:\n\n    hr0 = result of +\n    hr1 = result of /\n    hr2 = 7\n\nWe spill hr0, the result of +, onto the stack to make hr0 available for loading\nof 8. We choose hr0 to spill because its parent, -, is the most distant (| and\n%, the parents of / and 7, are closer). When we finally need the result of +,\nwe pop it from the stack.\n\nThe relative distance of parent nodes can be obtained by simple recursive\nassignment of unique numbers to all nodes:\n\n                         op\n                         -\n             op                      op\n             +                       |\n       op          op          op          op\n       ^           *           /           %\n    nm    nm    nm    nm    nm    nm    nm    nm\n    1     2     3     4     5     6     7     8\n\n    0  2  1  6  3  5  4  14 7  9  8  13 10 12 11 \u003c- node number / virtual reg\n\nEach node then will have this unique number (we may refer to it as the virtual\nregister) and we can either follow the node's parent link to find the parent's\nunique number or we may simply store the parent's unique number in the child\nnode instead of storing there the link.\n\nAnd so in this example of ours, 14 is the largest of the three numbers (14, 13\nand 12), which is how we decide to kick the hr0 value out onto the stack.\n\nSpilling the register that has the most distant use guarantees that unspilling\nwill occur in the exact opposite order (LIFO), which lets us use the stack with\npush and pop instructions. Specifically, with operators being at most binary\n(not ternary and so on), we'll never spill two sibling nodes, that is, we'll\nnever have to distinguish equally distant nodes.\n\nWhen implementing this algorithm we'll need a global array of N pointers to the\nexpression tree nodes, where N is the number of the registers that the register\nallocator manages. 
(This array is known as `NodeFromHReg[]` in our code.) By\nexamining the contents of this array we'll know if a particular hardware\nregister is occupied by the result of some node. The nodes themselves will\ncarry the location of their results (that is, it can be one of these N hardware\nregisters or the stack or nowhere yet). IOW, with such a setup we'll always be\nable to tell what's where, by examining the array or a node and we'll be\nmodifying the array and nodes appropriately to reflect the current allocations\nthroughout the process.\n\n### Evaluation order of subexpressions\n\nWhen evaluating a binary operator (e.g. a + b) we should first evaluate the\nchild/subexpression that uses more registers, then the one that uses fewer.\nThis leads to more effective register use and fewer spills. The logic is\nsimple: before you can evaluate the binary operator, you need to hold the\nresult of one of its children in a register while evaluating the other child.\nSo, if the children need, let's say, 1 and 2 registers each to evaluate, then\nevaluating the 1-register child first and the 2-register child second will\nneed 3 registers (1 to hold the first result plus 2 for the second child),\nwhereas evaluating them in the opposite order will need only 2 registers.\n\nWhen assigning unique node numbers (virtual register numbers) as described in\nthe preceding section, we can assign them in this order and we can then handle\nsibling nodes in this order by looking at and comparing their unique numbers.\n\nSee [Ershov number](https://en.wikipedia.org/wiki/Ershov_Number) and\n[Sethi-Ullman algorithm](https://en.wikipedia.org/wiki/Sethi%E2%80%93Ullman_algorithm).\n\n### 2-register instructions\n\nNot all CPU architectures offer 3-register ALU instructions like the MIPS does.\nThe x86 architecture's ALU instructions are 2-register. That is, the output\noverwrites one of the inputs.\n\nThis affects the previously described `AllocHRegs()` function a little. 
It has\nto be able to allocate a specific output register through the `Allocate()`\nfunction such that `Allocate()` would allocate the same register as one of the\ninputs. So, for example, if the inputs to the `add` instruction are in, say,\n`ax` and `cx`, the output will be in `ax` if we generate `add ax, cx` (or the\nother way around, the output will be in `cx` if we generate `add cx, ax`; keep\nin mind that not all binary operators are symmetric).\n\n## 8086-specific approach\n\nThe 8086-specific idea is:\n*   When an instruction at a node needs its inputs in specific registers, it\n    requests its child nodes to produce their outputs in _desired registers_\n    (this is done recursively through the familiar `AllocHRegs()` function).\n    However, they may be unable to satisfy the requests due to their own\n    limitations (that is, they can never produce the outputs elsewhere\n    naturally) or because the desired registers already hold something useful.\n    And so the child nodes do what they can and leave it to their parent to\n    do the rest when they can't satisfy the request.\n*   Thankfully, it's always possible to swap values in a few registers with\n    the `xchg` instruction to satisfy the needs on the input side and such\n    swaps aren't needed too often.\n\nSo the `AllocHRegs()`, `Ensure()` and `Allocate()` functions introduced earlier\nneed to receive an additional parameter, the desired register. 
It should be\npossible to use this parameter to request a specific register, any register or\na register from a specific subset of all allocatable registers.\n\nCurrently we're using this enumeration to specify a register:\n\n    enum HReg : int\n    {\n      // Specific regs begin\n      HReg0, HRegAX = HReg0,\n      HReg1, HRegCX = HReg1,\n      HReg2, HRegDX = HReg2,\n      HReg3, HRegBX = HReg3,\n      HReg4, HRegSI = HReg4,\n      HReg5, HRegDI = HReg5,\n      // Specific regs end\n      HRegCnt,\n      // Constants representing multiple-choice desired regs:\n      HRegAny = HRegCnt, // unspecified, any reg at all is OK\n      HRegNotCX, // prefer regs other than cx (for shifts)\n      HRegNotDXNotAX, // prefer regs other than dx and ax (for (i)div)\n      HRegNotDXNotCXNotAX, // prefer regs other than dx, cx and ax\n      HRegByte, // prefer those with individual byte components: ax, cx, dx, bx\n      HRegByteNotCX, // prefer those with individual byte components except cx: ax, dx, bx\n      HRegAddr, // prefer those that can be a memory operand: bx, si, di\n    };\n\nThe `Allocate()` function will try to allocate the requested/desired register\nfirst, but if it fails to, it'll return a different one and the caller will\nneed to deal with it (e.g. use the `xchg` instruction).\n\n### Shifts\n\nThe 8086 shift instructions (`shl`, `shr`, `sar`) take the shift count from the\nfixed register `cl` (lower half of `cx`) if the count isn't a simple 1. The\nvalue that they shift can be anywhere else.\n\nHence the implementation of `AllocHRegs()` for shift instructions should pass\nits own desired register parameter (possibly modified to exclude `cx`) to the\nleft child's `AllocHRegs()` and `Ensure()` while it should pass `HRegCX` as the\ndesired register to the right child's `AllocHRegs()` and `Ensure()`. 
Btw, when\nboth children need the same number of registers to compute we should prefer to\nhandle the right/count child first, to have higher chances of allocating `cx`\nfor the count.\n\n           desired register may pass modified to left child\n           |\n           v\n    shift dst, cl\n           |    |\n           |    v\n           v    desired reg = cx\n           desired reg excludes cx\n\n`Ensure()` may fail to allocate and return `HRegCX` for the right child. When\nthis happens, we either move the right child node from that other register to\n`cx` (if `cx` is vacant) or we exchange the registers between that node and the\none currently residing in `cx` (when `cx` isn't vacant). For this we update the\nnode(s) and the global array of pointers to nodes (`NodeFromHReg[]`) and\ngenerate the `mov` or `xchg` instruction. We should also remember that\n`Ensure()` may allocate `cx` for the left child, in which case we need to\nproperly update the left register.\n\n             ^\n             |\n      shift dst, cl\n             ^    ^\n             |    |\n    case 1: ?x   cx  Nothing to do\n    case 2: cx   ?x  Swap left and right\n    case 3: ?x   ?x  Swap right and cx (or move right to cx)\n\n### (i)mul\n\nThe `(i)mul` instruction multiplies `ax` by another input and outputs the\nproduct to a pair of registers, `dx`:`ax` (most significant bits:least\nsignificant bits). That is, the product is twice as wide as either multiplicand.\nHowever, for our purposes, for 16-bit multiplicands whose product isn't\nexpected to need more than 16 bits, we can treat `dx` as a clobbered register.\n\n`(i)mul` therefore disregards its caller's desired register since `(i)mul`\nalways outputs to `ax`. 
`(i)mul`, being symmetric, also asks both its\nchild/input nodes to produce their results in `ax` in the hopes that at least\none would end up in `ax`.\n\n            desired reg is ignored because output is in ax\n            |\n            v\n    (i)mul ax, src\n            |   |\n            |   v\n            v   desired reg = ax\n            desired reg = ax\n\nIf either of `(i)mul`'s inputs ends up in `dx`, it's kept there, since it can\nbe conveniently clobbered. However, if neither input is in `dx`, `dx` may\nneed to be explicitly preserved (if `dx` holds some other partial result that\nwe don't want `(i)mul` to clobber).\n\nTo preserve `dx` we exchange the registers between one of the input nodes and\nthe node in `dx`. IOW, we reduce this to having one of the inputs in `dx`,\nwhich is safe to clobber.\n\nSo, through at most 2 register exchanges we can place one input in `ax` and the\nother, if needed, in `dx`.\n\nN.B. the 16 least significant bits of a product of two 16-bit integers are the\nsame for both signed and unsigned multiplication and so on the x86 we can use\n`mul` and `imul` interchangeably if we're interested in just those 16 bits of\nthe product.\n\n### (i)div\n\nComputing quotients and remainders with `(i)div` is overall trickier.\n\nThe instruction divides a pair of registers, `dx`:`ax` (most significant bits:\nleast significant bits), that is, a 32-bit integer, by a 16-bit input and the\noutputs are in `dx` (the remainder) and `ax` (the quotient). When we need to\ndivide 16-bit integers we zero- or sign-extend the dividend from 16 bits to 32\nbits, filling `dx` with the extension. `dx` being part of the dividend naturally\nprevents us from using `dx` for the divisor.\n\nBut first, we need the dividend in `ax`, for which we pass `HRegAX` as the\ndesired register into `AllocHRegs()` and `Ensure()` for the left input node. 
We\nuse `HRegNotDXNotAX` (meaning any register but `dx` or `ax`) as the desired\nregister for the right input node.\n\n             desired reg is ignored because output is in ax or dx\n             |\n             v\n    (i)div dx:ax, src\n               |   |\n               |   v\n               v   desired reg excludes dx and ax\n               desired reg = ax\n\nThen through at most one `xchg` we can make sure that the dividend is indeed in\n`ax`.\n\nIf we're unlucky and `Ensure()` (or the `xchg`) gives us `dx` for the divisor\nor `dx` contains some other partial result, we `Allocate()` another register\n(and immediately `Free()` it, IOW, we just find an available register, spilling\nif necessary) and move to it whatever node is in `dx`, thereby freeing `dx` from\nanything useful.\n\nFinally, when we get to allocate the output register for `(i)div`, we allocate\n`ax` if we're interested in the quotient or we allocate `dx` if we're interested\nin the remainder.\n\nN.B. when computing how many registers an `(i)div`-based division/remainder\nnode needs, we factor in the above restriction around `dx`. This raises\nthe minimum from 2 to 3 registers, but otherwise the computation remains the\nsame as for e.g. `add`. The order of evaluation and the node numbering described\nearlier should take this into account.\n\n### Loads\n\nTo load a value from memory we need the memory address in one of 3 suitable\nregisters: `bx`, `si`, `di`. So, the child/input node supplying the address will\nget its `AllocHRegs()` and `Ensure()` called with `HRegAddr` as the desired\nregister. If we're unlucky and don't get the address in one of those registers,\nwe can use `xchg`, as usual, to bring the address into e.g. `bx`.\n\nA 16-bit value can be loaded into any 16-bit register, but an 8-bit value can\nonly be loaded into individual 8-bit subregisters, e.g. `al`, `bl`, `cl`, `dl`\n(the desired register can be set to `HRegByte` when we'd like a register with\n8-bit subregisters). 
Further, if we intend to sign-extend an 8-bit value from\nmemory to 16 bits with the `cbw` instruction, we need to load the value into\n`al`.\n\nOn top of that we want to load the value into the desired register if we can do\nit directly. So, before we call `Allocate()` to allocate the output register\nwe're going to examine the desired register value and see if it's available and\nsuitable. Again, if `Allocate()` doesn't give us a register with 8-bit\nsubregisters when we need one, we may use `xchg` to get us `ax`/`al` for the\noutput register.\n\n        desired reg may be ignored and \n        chosen from regs that have 8-bit subregs\n        |\n        v\n    mov r, [r]\n            |\n            v\n            desired reg can address memory\n\n### Stores\n\nStores are a bit simpler than loads, but they have the same fundamental\nrequirements of the memory address being in one of the 3 suitable registers\n(`bx`, `si`, `di`) and the 8-bit value being in one of the 4 suitable registers\n(`ax`/`al`, `bx`/`bl`, `cx`/`cl`, `dx`/`dl`) and we may need to `mov` or `xchg`\nto satisfy these requirements.\n\n             desired reg may be ignored\n             |\n             v\n    mov [r], r\n         |   |\n         |   v\n         v   desired reg may need to have 8-bit subregs\n         desired reg can address memory\n\nN.B. the store node's result (besides the direct effect of writing to memory)\ncan be the value that's being stored similar to how we can write `a = b = c;`\nin C/C++ to use the result of one assignment (`b = c`) as a value for another\nassignment (`a = ...`).\n\n## Resources\n\n*   See [Ershov number](https://en.wikipedia.org/wiki/Ershov_Number),\n    [Sethi-Ullman algorithm](https://en.wikipedia.org/wiki/Sethi%E2%80%93Ullman_algorithm).\n*   Engineering a Compiler 2nd ed. by Keith D. Cooper and Linda Torczon.\n    In particular, section 13.3.2 \"Bottom-Up Local Register Allocation\".\n*   Compiler Construction by William M. 
Waite and Gerhard Goos.\n    In particular, section 10.2.2 \"Targeting\".\n*   Compilers Principles, Techniques, and Tools (AKA the \"Dragon\" book) by\n    Alfred V. Aho, Ravi Sethi and Jeffrey D. Ullman.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Falexfru%2Fregal86","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Falexfru%2Fregal86","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Falexfru%2Fregal86/lists"}