{"id":19974361,"url":"https://github.com/gbroques/compiler","last_synced_at":"2025-06-11T00:35:56.390Z","repository":{"id":94020651,"uuid":"125585931","full_name":"gbroques/compiler","owner":"gbroques","description":"A simple compiler written from scratch in C++ for an undergraduate course in program translation.","archived":false,"fork":false,"pushed_at":"2018-04-29T19:22:48.000Z","size":2590,"stargazers_count":49,"open_issues_count":0,"forks_count":13,"subscribers_count":4,"default_branch":"master","last_synced_at":"2025-05-04T02:39:13.658Z","etag":null,"topics":["assembly-language","compiler","compiler-design","compiler-frontend","compiler-optimization","lexical-analysis","parse-trees","parser","scanner"],"latest_commit_sha":null,"homepage":"","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/gbroques.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-03-17T01:15:41.000Z","updated_at":"2025-04-10T14:28:44.000Z","dependencies_parsed_at":"2023-03-13T17:08:18.617Z","dependency_job_id":null,"html_url":"https://github.com/gbroques/compiler","commit_stats":null,"previous_names":[],"tags_count":2,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gbroques%2Fcompiler","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gbroques%2Fcompiler/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gbroques%2Fcompiler/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gbroques%2F
compiler/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/gbroques","download_url":"https://codeload.github.com/gbroques/compiler/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gbroques%2Fcompiler/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":259176583,"owners_count":22817208,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["assembly-language","compiler","compiler-design","compiler-frontend","compiler-optimization","lexical-analysis","parse-trees","parser","scanner"],"created_at":"2024-11-13T03:14:47.206Z","updated_at":"2025-06-11T00:35:56.356Z","avatar_url":"https://github.com/gbroques.png","language":"C++","readme":"# Compiler\n\nThis is a simple compiler written for an undergraduate course in Program Translation.\n\n## Usage\n\n1. Run `make`.\n\n2. Create a program file. For example, `myprogram.txt`:\n\n```\n! myprogram.txt !\nprogram\nvar num\nstart\n  let num = 42 ,\n  print num ,\nend\n```\n\n3. Compile the program into assembly code.\n\n```\n$ comp myprogram.txt\n```\n\n4. 
Run the interpreter on the corresponding assembly code.\n\n```\n$ asmb myprogram.asm\n```\n\n## Sample Programs and Language Features\n\n### Variables\n\n```\nprogram\nvar num\nstart\n  let num = 42 ,\n  print num ,\nend\n```\n\nOutput:\n\n```\n42\n```\n\n\n### Loops\n\n```\nprogram\nvar i\nstart\n  let i = 0 ,\n  iter (i \u003c 3)\n    start\n      print i ,\n      let i = (i + 1) ,\n    end ,\n  ,\nend\n```\n\nOutput:\n\n```\n0\n1\n2\n```\n\n\n### Conditionals\n\n```\nprogram\nstart\nif (10 \u003e 5)\n  print 1 ,\n,\nend\n```\n\nOutput:\n\n```\n1\n```\n\n#### Supported Operators\n\n* **\u003e** - Greater than\n* **\u003c** - Less than\n* **:** - Equals\n\n\n### Arithmetic and Expressions\n\n```\nprogram\nstart\n  print #(((2 + 2) * 3) / 4) ,\nend\n```\n\nOutput:\n\n```\n-3\n```\n\n**NOTE:** All operators have their standard meaning, except **#**, which means *negation*.\n\n\n### Input\n\n```\nprogram\nstart\n  var num\n  read num ,\n  print num ,\nend\n```\n\nThe program prints whatever the user enters.\n\n\n### Comments\n\n```\nprogram\nstart\n  ! This is a comment !\n  print 1 ,\nend\n```\n\nComments are surrounded by exclamation points `!`.\n\n## Frontend\n\nThe frontend of our compiler is composed of two parts:\n\n1. Scanner - Converts a stream of characters into tokens\n2. 
Parser - Converts the tokens into a parse tree\n\nThe scanner uses a driver and a state transition table.\n\n### Deterministic Finite Automaton\n![Deterministic Finite Automaton](assets/deterministic-finite-automaton.png)\n\nTo edit the diagram, import `assets/deterministic-finite-automaton.json` at https://merfoo.github.io/fsm/\n\n\n### State Transition Table\n\nThe following table is located at `src/compiler/frontend/scanner/state_transition_table/state_transition_table.cpp`.\n\nThe function corresponding to the finite automaton driver is `Scanner::read()` in `src/compiler/frontend/scanner/scanner.cpp`.\n\nTo edit the table, import `assets/state-transition-table.csv` into your favorite spreadsheet program.\n\n| 0-9         | !            | + - * / \u003c \u003e = : # | . ( ) , { } ; [ ] | a-z         | A-Z         | EoF          | White Space  |\n|-------------|--------------|-------------------|-------------------|-------------|-------------|--------------|--------------|\n| 1           | 9            | 10                | 11                | 12          | Error       | EoF          | 0            |\n| 3           | Integer      | Integer           | Integer           | Integer     | Integer     | Integer      | Integer      |\n| 2           | Integer      | Integer           | Integer           | Integer     | Integer     | Integer      | Integer      |\n| 4           | Integer      | Integer           | Integer           | Integer     | Integer     | Integer      | Integer      |\n| 5           | Integer      | Integer           | Integer           | Integer     | Integer     | Integer      | Integer      |\n| 6           | Integer      | Integer           | Integer           | Integer     | Integer     | Integer      | Integer      |\n| 7           | Integer      | Integer           | Integer           | Integer     | Integer     | Integer      | Integer      |\n| 8           | Integer      | Integer           | Integer           | Integer     | Integer     | Integer      | Integer    
  |\n| Error       | Integer      | Integer           | Integer           | Integer     | Integer     | Integer      | Integer      |\n| 9           | 0            | 9                 | 9                 | 9           | 9           | 9            | 9            |\n| Operator    | Operator     | Operator          | Operator          | Operator    | Operator    | Operator     | Operator     |\n| Delimiter   | Delimiter    | Delimiter         | Delimiter         | Delimiter   | Delimiter   | Delimiter    | Delimiter    |\n| 13          | Identifier   | Identifier        | Identifier        | 13          | 13          | Identifier   | Identifier   |\n| 14          | Identifier   | Identifier        | Identifier        | 14          | 14          | Identifier   | Identifier   |\n| 15          | Identifier   | Identifier        | Identifier        | 15          | 15          | Identifier   | Identifier   |\n| 16          | Identifier   | Identifier        | Identifier        | 16          | 16          | Identifier   | Identifier   |\n| 17          | Identifier   | Identifier        | Identifier        | 17          | 17          | Identifier   | Identifier   |\n| 18          | Identifier   | Identifier        | Identifier        | 18          | 18          | Identifier   | Identifier   |\n| 19          | Identifier   | Identifier        | Identifier        | 19          | 19          | Identifier   | Identifier   |\n| Error       | Identifier   | Identifier        | Identifier        | Error       | Error       | Identifier   | Identifier   |\n\n### BNF\n\nThe parser enforces the following grammar rules.\n\n\\\u003cS\u003e -\u003e **program** \\\u003cvars\u003e \\\u003cblock\u003e\n\n\\\u003cblock\u003e -\u003e **start** \\\u003cvars\u003e \\\u003cstats\u003e **end**\n\n\\\u003cvars\u003e -\u003e **var** **Identifier** \\\u003cvars\u003e | **empty**\n\n\\\u003cexpr\u003e -\u003e \\\u003cH\u003e **+** \\\u003cexpr\u003e | \\\u003cH\u003e **-** \\\u003cexpr\u003e | 
\\\u003cH\u003e **/** \\\u003cexpr\u003e | \\\u003cH\u003e **\\*** \\\u003cexpr\u003e | \\\u003cH\u003e\n\n\\\u003cH\u003e -\u003e **#** \\\u003cR\u003e | \\\u003cR\u003e\n\n\\\u003cR\u003e -\u003e ( \\\u003cexpr\u003e ) | **Identifier** | **Integer**\n\n\\\u003cstats\u003e -\u003e \\\u003cstat\u003e \\\u003cm_stat\u003e\n\n\\\u003cm_stat\u003e -\u003e \\\u003cstats\u003e | **empty**\n\n\\\u003cstat\u003e -\u003e \\\u003cin\u003e **,** | \\\u003cout\u003e **,** | \\\u003cblock\u003e **,** | \\\u003cifstat\u003e **,** | \\\u003cloop\u003e **,** | \\\u003cassign\u003e **,**\n\n\\\u003cin\u003e -\u003e **read** **Identifier**\n\n\\\u003cout\u003e -\u003e **print** \\\u003cexpr\u003e\n\n\\\u003cifstat\u003e -\u003e **if** **(** \\\u003cexpr\u003e \\\u003cO\u003e \\\u003cexpr\u003e **)** \\\u003cstat\u003e\n\n\\\u003cloop\u003e -\u003e **iter** **(** \\\u003cexpr\u003e \\\u003cO\u003e \\\u003cexpr\u003e **)** \\\u003cstat\u003e\n\n\\\u003cassign\u003e -\u003e **let** **Identifier** **=** \\\u003cexpr\u003e\n\n\\\u003cO\u003e -\u003e **\u003c** | **\u003e** | **:**\n\n## Backend\n\nThe backend of our compiler is composed of three parts:\n\n1. Static semantics\n2. Code generation\n3. Optimization\n\n### Static Semantics\n\nThe only static semantic rule the compiler enforces is the proper use of variables. 
Before using a variable, you must first declare it using the **var** keyword.\n\nIn our language, scopes are introduced by blocks (delimited by **start** and **end**), conditionals (**if**), and loops (**iter**).\n\nOur compiler implements **local scoping** rather than global scoping.\n\n### Code Generation\nWe traverse the decorated parse tree and, for each node, generate the corresponding assembly code.\n\n### Optimization\nAs an optimization, we remove a redundant assembly statement that reads from stack memory immediately after the same location in stack memory was written.\n\nFor example, consider the following program:\n\n```\nprogram\nvar id1\nstart\nlet id1 = 2 ,\nprint id1 ,\nend\n```\n\nThe compiler generates the following assembly code for this program:\n\n```\nPUSH\nPUSH\nLOAD 2\nSTACKW 1\nSTACKR 1\nSTORE T0\nWRITE T0\nPOP\nPOP\nSTOP\nT0 0\n```\n\nThe optimization removes the `STACKR 1` statement because it is immediately preceded by `STACKW 1`.\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgbroques%2Fcompiler","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgbroques%2Fcompiler","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgbroques%2Fcompiler/lists"}