{"id":16996240,"url":"https://github.com/rustyraptor/compilers","last_synced_at":"2025-08-20T06:37:06.442Z","repository":{"id":86489618,"uuid":"330329335","full_name":"RustyRaptor/compilers","owner":"RustyRaptor","description":null,"archived":false,"fork":false,"pushed_at":"2021-09-14T15:38:03.000Z","size":3048,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-07-18T20:03:41.937Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"C","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"unlicense","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/RustyRaptor.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-01-17T06:29:20.000Z","updated_at":"2021-09-14T15:38:07.000Z","dependencies_parsed_at":null,"dependency_job_id":"cab5b4ab-b990-42db-8b72-8bac1aee440c","html_url":"https://github.com/RustyRaptor/compilers","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/RustyRaptor/compilers","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RustyRaptor%2Fcompilers","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RustyRaptor%2Fcompilers/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RustyRaptor%2Fcompilers/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RustyRaptor%2Fcompilers/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/RustyRaptor","download_url":"https://codeload.github.com/RustyR
aptor/compilers/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RustyRaptor%2Fcompilers/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":271279088,"owners_count":24731900,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-20T02:00:09.606Z","response_time":69,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-10-14T03:51:54.791Z","updated_at":"2025-08-20T06:37:06.410Z","avatar_url":"https://github.com/RustyRaptor.png","language":"C","funding_links":[],"categories":[],"sub_categories":[],"readme":"# DECAF Compiler\nLearning to build a compiler by implementing a compiler for the DECAF programming language. \n\n## Tools\n- Flex\n  - Lexical analysis tool. Takes text as input and generates tokens\n- Bison\n  - reads Tokens from Flex and performs a syntactical analysis.\n- GCC\n\n## Decaf\n\n### Introduction\nDecaf is a typed C-like language. The feature set is trimmed down considerably from what is usually part of a full-fledged programming language. This is done to keep the programming assignments manageable. 
Despite these limitations, Decaf will be able to handle interesting and non-trivial programs.\n\n##### Here is an example Decaf program:\n```decaf\nextern func print_int(int) void;\n\npackage GreatestCommonDivisor {\n    var a int;\n    var b int;\n\n    func main() int {\n        var x int;\n        var y int ;\n        var z int;\n        a = 20;\n        b = 10;\n        x = a;\n        y = b;\n        z = gcd(x, y);\n\n        // print_int is part of the standard input-output library\n        print_int(z);\n    }\n\n    // function that computes the greatest common divisor\n    func gcd(a int, b int) int {\n        if (b == 0) { return(a); }\n        else { return( gcd(b, a % b) ); }\n    }\n}\n```\n### Notation\nThe syntax is specified using Extended Backus-Naur Form (EBNF):\n```\nProduction  = production_name \"=\" [ Expression ] \".\" .\nExpression  = Alternative { \"|\" Alternative } .\nAlternative = Term { Term } .\nTerm        = production_name | token [ \"...\" token ] | Group | Option | Repetition | Repetition+ | CommaList .\nGroup       = \"(\" Expression \")\" .\nOption      = \"[\" Expression \"]\" .\nRepetition  = \"{\" Expression \"}\" .\nRepetition+ = \"{\" Expression \"}+\" .\nCommaList   = \"{\" Expression \"}+,\" .\nProductions are expressions constructed from terms and the following operators, in increasing precedence:\n\n|    alternation\n()   grouping\n[]   option (0 or 1 Expression)\n{}   repetition (0 to n Expressions)\n{}+  repetition (1 to n Expressions)\n{}+, comma list (1 to n Expressions comma separated, e.g. x, y, z)\nLower-case production names are used to identify lexical tokens. Non-terminals are in CamelCase. Lexical tokens are enclosed in double quotes “” or back quotes \\`\\`.\n\nThe form a ... b represents the set of characters from a through b as alternatives. The horizontal ellipsis ... 
is also used elsewhere in the spec to informally denote various enumerations or code snippets that are not further specified. The character ... is not a token of the Decaf language.\n```\n### Source code representation\nDecaf source code is encoded as ASCII text. Upper and lower case characters are considered different characters. For example, if is defined as a keyword, but IF would be considered an identifier.\n\nASCII table\nThe ASCII table and decimal equivalent for each character is shown below:\n```\n  0 nul    1 soh    2 stx    3 etx    4 eot    5 enq    6 ack    7 bel\n  8 bs     9 ht    10 nl    11 vt    12 np    13 cr    14 so    15 si\n 16 dle   17 dc1   18 dc2   19 dc3   20 dc4   21 nak   22 syn   23 etb\n 24 can   25 em    26 sub   27 esc   28 fs    29 gs    30 rs    31 us\n 32 sp    33  !    34  \"    35  #    36  $    37  %    38  \u0026    39  '\n 40  (    41  )    42  *    43  +    44  ,    45  -    46  .    47  /\n 48  0    49  1    50  2    51  3    52  4    53  5    54  6    55  7\n 56  8    57  9    58  :    59  ;    60  \u003c    61  =    62  \u003e    63  ?\n 64  @    65  A    66  B    67  C    68  D    69  E    70  F    71  G\n 72  H    73  I    74  J    75  K    76  L    77  M    78  N    79  O\n 80  P    81  Q    82  R    83  S    84  T    85  U    86  V    87  W\n 88  X    89  Y    90  Z    91  [    92  \\    93  ]    94  ^    95  _\n 96  \\`    97  a    98  b    99  c   100  d   101  e   102  f   103  g\n104  h   105  i   106  j   107  k   108  l   109  m   110  n   111  o\n112  p   113  q   114  r   115  s   116  t   117  u   118  v   119  w\n120  x   121  y   122  z   123  {   124  |   125  }   126  ~   127 del\n```\n##### The set of valid characters in Decaf is all the ASCII characters:\n```\nall_char = /* all ASCII characters from 7 ... 13 and 32 ... 126 */ .\nchar = /* all ASCII characters from 7 ... 13 and 32 ... 126 except char 10 \"\\n\", char 92 \"\\\" and char 34 \"\"\" */ \nchar_lit_chars = /* all ASCII characters from 7 ... 
13 and 32 ... 126 except char 39 \"'\" and char 92 \"\\\" */ .\nchar_no_nl = /* all ASCII characters from 7 ... 13 and 32 ... 126 except char 10 \"\\n\" */ .\nImplementation restriction: For compatibility with other tools, a compiler should always disallow the nul character (decimal: 0) in the source text.\n```\n### Letters and Digits\nThe underscore character _ is considered a letter.\n```\nletter        = \"A\" ... \"Z\" | \"a\" ... \"z\".\ndecimal_digit = \"0\" ... \"9\" .\nhex_digit     = \"0\" ... \"9\" | \"A\" ... \"F\" | \"a\" ... \"f\" .\ndigit         = \"0\" ... \"9\" .\nLexical Elements\nComments\nDecaf only has line comments that start with the character sequence // and stop at the newline character. The newline character is assumed to be part of the comment. The comment representation is as follows:\n\n// this is a line comment and it includes the newline at the end of the line\\n\n\ncomment = // { char_no_nl } \\n\nWhitespace\nWhitespace is used to separate tokens, and is defined as follows:\n\nnewline         = /* ASCII character nl : '\\n' */ .\ncarriage_return = /* ASCII character cr : '\\r' */ .\nhorizontal_tab  = /* ASCII character ht : '\\t' */ .\nvertical_tab    = /* ASCII character vt : '\\v' */ .\nform_feed       = /* ASCII character np : '\\f' */ .\nspace           = /* ASCII character sp : ' ' */ .\nwhitespace      = { newline | carriage_return | horizontal_tab | vertical_tab | form_feed | space }+ .\nThe following are special characters that are not part of white space:\n\nbell         = /* ASCII character bel : '\\a' */ .\nbackspace    = /* ASCII character bs : '\\b' */ .\n```\n### Tokens\nTokens are the vocabulary of the Decaf language. There are four classes: identifiers, keywords, operators and literals. White space is ignored except as it separates tokens that would otherwise combine into a single token. 
For example, int3 is a single token but int 3 is two tokens, a keyword int and integer 3; and int(3) is a sequence of four tokens: int, (, 3 and ).\n\nWhile breaking the input into tokens, the next token is the longest sequence of characters that form a valid token.\n\n### Semicolons\nThe Decaf language uses semicolons ; as a terminator in a number of productions.\n\n### Identifiers\nIdentifiers name program entities such as variables and types. An identifier is a sequence of one or more letters and digits. The first character in an identifier must be a letter.\n```\nidentifier = letter { letter | digit | _  } .\n```\n##### For example:\n```\na\nx9\nThisVariableIsInCamelCase\nType and constant identifiers are predeclared.\n```\n### Keywords\nThe following keywords are reserved and may not be used as identifiers.\n```\nbool    break   continue  else   extern  false   \nfor     func    if        int    null    package \nreturn  string  true      var    void    while  \n```\n### Operators and Delimiters\nThe following character sequences represent operators (see Operators section below) and delimiters.\n```\n{  }   [   ]   ,   ;   (   )  =  \n-  !   +   *   /   \u003c\u003c  \u003e\u003e  \u003c  \u003e  \n%  \u003c=  \u003e=  ==  !=  \u0026\u0026  ||  .\n```\n### Integer literals\nAn integer literal is a sequence of digits representing an integer constant. An optional prefix sets a non-decimal base: 0x or 0X for hexadecimal. In hexadecimal literals, letters a-f and A-F represent values 10 through 15.\n```\nint_lit     = decimal_lit | hex_lit .\ndecimal_lit = { decimal_digit }+ .\nhex_lit     = \"0\" ( \"x\" | \"X\" ) { hex_digit }+ .\n```\n\n##### For example, the following are integer literals:\n\n```\n42\n0xBadFace\n170141183460469231731687303715884105727\n```\n\nFor integer literals, the semantics of range checking occurs later, so that a long sequence of digits such as the last example above which is clearly out of range is still scanned as a single token. 
The semantic analyzer will come in later and reject this lexeme value as a valid integer constant.\n\n \n\n### String literals\nA string literal represents a string constant obtained from concatenating a sequence of characters (also see the Constants section below).\n```\nstring_lit = `\"` { char | escaped_char } `\"` .\n```\nA string literal must start and end on a single line, it cannot be split over multiple lines. It can include escape sequences like \\n and this is distinct from a newline character inside the string constant.\n\n##### For example, the following is legal:\n```\n\"\\n\" \n```\nBut the following is not legal:\n```\n\"\n\"\n```\n\nEmpty strings are allowed.\n```\n\"\"\n```\n \n\n### Type literals\nThe following are the keywords used to specify Decaf types.\n```\nint bool void string\n```\n### Boolean constant literals\nThe following keywords are used as constants for boolean types.\n```\ntrue false\n```\n### List of Tokens\nThe following is an alphabetically sorted list of tokens for Decaf. These are the Token names you shall use in your implementation.   
Single character tokens will be sent from LEX to YACC as single character values.\n```\nT_AND            \u0026\u0026\nT_ASSIGN         =\nT_BOOLTYPE       bool\nT_BREAK          break\nT_CHARCONSTANT   char_lit (see section on Character literals)\nT_CONTINUE       continue\nT_DOT            .\nT_ELSE           else\nT_EQ             ==\nT_EXTERN         extern\nT_FALSE          false\nT_FOR            for\nT_FUNC           func\nT_GEQ            \u003e=\nT_GT             \u003e\nT_ID             identifier (see section on Identifiers)\nT_IF             if\nT_INTCONSTANT    int_lit (see section on Integer literals)\nT_INTTYPE        int\nT_LEFTSHIFT      \u003c\u003c\nT_LEQ            \u003c=\nT_NEQ            !=\nT_NULL           null\nT_OR             ||\nT_PACKAGE        package\nT_RETURN         return\nT_RIGHTSHIFT     \u003e\u003e\nT_STRINGCONSTANT string_lit (see section on String literals)\nT_STRINGTYPE     string\nT_TRUE           true\nT_VAR            var\nT_VOID           void\nT_WHILE          while\n```\n### Types\nDecaf has four types: void, booleans, integers and strings. String types, however, can only be used with extern functions. Void types are for return types of functions only (called MethodType below) and are not used in variable declarations.\n```\nExternType = ( string | Type ) .\nType = ( int | bool ) .\nMethodType = ( void | Type ) .\n```\nDecaf also has a limited array type for arrays of integers and booleans.\n\n### Boolean types\nA boolean type represents the set of Boolean truth values denoted by the predeclared constants true and false. The predeclared boolean type is bool. This is represented as the LLVM type Int1.\n```\nBoolConstant = ( true | false ) .\n```\n### Integer types\nAn integer type refers to the set of all signed 32 bit integers (-2147483648 to 2147483647) corresponding to the LLVM type Int32. The predeclared integer type is int.\n\n### String types\nA string type represents the set of string values. 
A string value is a (possibly empty) sequence of bytes. Strings are immutable: once created, it is impossible to change the contents of a string. The predeclared string type is string.\n\n### Array types\nDecaf has integer and boolean arrays.\n```\nArrayType = \"[\" int_lit \"]\" Type .\n```\nAll arrays are one-dimensional and have a size that is fixed at compile-time. Arrays are indexed from 0 to n − 1, where n \u003e 0 is the size of the array. The usual bracket notation is used to index arrays. Since arrays have a compile-time fixed size and cannot be declared as method parameters, there is no facility to query the length of an array variable in Decaf.\n\n### Constants\nDecaf has boolean constants, integer constants, and string constants. Integer constants can be created using character literals and integer literals. BoolConstant is defined (in the Types section) as either true or false.\n```\nConstant = ( int_lit |  BoolConstant ) .\n```\nString constants can only be used with extern functions. See the Types section for more details.\n\n### Decaf program structure\n#### Program\n\nA Decaf program starts with optional external function declarations followed by the package definition (a Decaf package is like a module or namespace). A package has optional global variables (called field variables) followed by method (function) definitions.\n```\nProgram = Externs package identifier \"{\" FieldDecls MethodDecls \"}\" .\n```\n#### External Functions\nA Decaf program can access external functions that are linked, such as the Decaf standard library functions, which are implemented in C and accessed from within the Decaf program as external functions. For now, only external functions are allowed. 
External data cannot be declared.\n```\nExterns    = { ExternDefn } .\nExternDefn = extern func identifier \"(\" [ { ExternType }+, ] \")\" MethodType \";\" .\n```\n#### Global variables\nDecaf has global variables with scope limited to their package that appear before any method declarations. Global variables in Decaf are called field declarations. Variables are always defined using the var reserved word.\n```\nFieldDecls = { FieldDecl } .\nFieldDecl  = var identifier  Type \";\" .\nFieldDecl  = var  identifier  ArrayType \";\" .\n```\nThe value assigned to a field in its declaration has to be a constant:\n```\npackage foo { var a int; var b int = a; } // Invalid!\n```\nThe following is an example of an array field declaration. Notice the array type has the length before the type of the elements of the array.\n```\npackage foo { var list [100]int; } // Array declaration\n```\n#### Method declarations\nFunctions or methods in Decaf start with the reserved word func, then the name of the method, then the argument list in parentheses, followed by the return type of the method.\n```\nMethodDecls = { MethodDecl } .\nMethodDecl  = func identifier \"(\" [ { identifier Type }+, ] \")\" MethodType Block .\n```\nThe program must contain a declaration for a method called main that has no parameters. The return type of the method main can be either type int or void. Execution of a Decaf program starts at this method main. Methods defined as part of a package can have zero or more parameters and must have a return statement of type MethodType explicitly or implicitly defined, e.g. 
a main function without an explicit return statement can have a return type of either int or void.\n\n#### Blocks\nDecaf blocks have a section for local variable definitions first followed by statements.\n```\nBlock = \"{\" VarDecls Statements \"}\" .\n```\n#### Variable Declarations\nLocal variables are declared using the reserved word var followed by an identifier and then the type of the variable. They cannot be assigned a value when they are defined.\n```\nVarDecls = { VarDecl } .\nVarDecl  = var  identifier  Type \";\" .\nVarDecl  = var identifier  ArrayType \";\" .\n```\nThere is no assignment allowed for local variables:\n```\nfunc foo() int { var a int = 10; } // Invalid!\n```\nStatements\nStatements in Decaf consist of variable assignment, method calls, syntax for various kinds of control flow, and special statements for breaking out of or continuing to the top of the block.\n\nStatements = { Statement } .\nBlocks statement\nStatements can also be Blocks (see section on Blocks).\n\nStatement = Block .\nAssign statement\nAssignment to an Lvalue is a statement in Decaf. The location for the Lvalue can be either a scalar variable or an array location.\n\nStatement = Assign \";\" .\nAssign    = Lvalue \"=\" Expr .\nLvalue    = identifier | identifier \"[\" Expr \"]\" .\nMethod calls\nStatement  = MethodCall \";\" .\nMethodCall = identifier \"(\" [ { MethodArg }+, ] \")\" .\nMethodArg  = Expr | string_lit .\nExternal functions are declared using the extern keyword. These functions are provided by a separate library which is linked with your Decaf program at runtime. Some minimal type checking is done using the declaration. The most useful library functions that you will use are the print_string, print_int and read_int functions.\n\nUnless it is void, the return value is a type that can be assigned to an Lvalue:\n\nz = read_int(); \nIn this case, the integer variable z receives the result of calling the read_int library function. 
The return value can also be declared to be void, in which case assigning the output of a library function to an Lvalue will result in a semantic error.\n\nIf statement\nStatement = if \"(\" Expr \")\" Block [ else Block ] .\nWhile statement\nStatement =  while \"(\" Expr \")\" Block .\nFor statement (this is for grad students only)\nThe for loop in Decaf has the usual structure for ( init ; check ; post ) followed by the Block of the for loop.\n\nStatement = for \"(\" { Assign }+, \";\" Expr \";\" { Assign }+, \")\" Block .\nThe init, check and post parts of the for loop cannot be empty:\n\nfor(; a \u003c b; ) // Invalid!\nReturn statement\nStatement = return [ \"(\" [ Expr ] \")\" ] \";\" .\nThe following are all valid return statements:\n\nreturn(3);\nreturn(b); // where b was declared as \"var b bool;\"\nreturn();\nreturn;\nBreak statement\nA break statement terminates execution of the innermost for or while loop (branches to end of loop).\n\nStatement = break \";\" .\nContinue statement\nA continue statement begins the next iteration of the innermost for or while loop at its post statement (see For statement).\n\nStatement = continue \";\" .\nExpressions\nOperands\nOperands are the elementary values in an expression.\n\nExpr = identifier .\nExpr = MethodCall .\nExpr = Constant .\nUnary Operators\nThere is only one unary operator in Decaf, for logical negation. The result of UnaryNot is of type bool.\n\nUnaryNot = \"!\" .\nBinary Operators\nBinary operators are split into boolean binary operators and arithmetic binary operators. 
The result of using a boolean operator is the type bool and the result of using an arithmetic operator is the type int.\n\nBinaryOperator = ( ArithmeticOperator | BooleanOperator ) .\nArithmeticOperator = ( \"+\" | \"-\" | \"*\" | \"/\" | \"\u003c\u003c\" | \"\u003e\u003e\" | \"%\" ) .\nBooleanOperator = ( \"==\" | \"!=\" | \"\u003c\" | \"\u003c=\" | \"\u003e\" | \"\u003e=\" | \"\u0026\u0026\" | \"||\" ) .\nThe boolean connectives \u0026\u0026 and || are interpreted using short circuit evaluation. This means: the second operand is not evaluated if the result of the first operand determines the value of the whole expression. For example, if the result is false for \u0026\u0026 or true for ||.\n\nBinary % computes the modulus of two numbers. Given two operands of type int, a and b: If b is positive, then a % b is a minus the largest multiple of b that is not greater than a. If b is negative, then a % b is a minus the smallest multiple of b that is not less than a (in this case the result will be less than or equal to zero).\n\nOperators and Precedence\nUnary operators have the highest precedence. For the other binary operators the precedence is defined as follows. All operators at the same precedence level get equal precedence. All operators with equal precedence associate left. The UnaryMinus operator associates to the right.\n\nPrecedence\tOperator\n6\tUnaryNot\n5\t* / % \u003c\u003c \u003e\u003e\n4\t+ -\n3\t== != \u003c \u003c= \u003e \u003e=\n2\t\u0026\u0026\n1\t||\nPrimary expressions\nPrimary expressions build larger expressions from operands, operators and parentheses. The parentheses are used to group expressions to obtain different orders of evaluation. Parentheses can be omitted if the desired evaluation is consistent with the precedence rules (see the Operators and Precedence section). 
The type of the Expr on the left hand side is determined by the Expr on the right hand side and the operator used.\n\nExpr = Expr BinaryOperator Expr .\nExpr = \"(\" Expr \")\" .\nIndex expression\nIn this expression, the identifier must be an Array Type (see section on Array Types). The Expr is evaluated to give an array index and the result of the evaluation must be of type int. The integer value is then used to find the element of the array which is of type int or bool depending on the Array Type.\n\nExpr = identifier \"[\" Expr \"]\" .\nDecaf grammar\nThe entire set of rules that describe the Decaf grammar specification is collected in one place below. For explanation of each of the rules read the descriptions provided in the above sections.\n\nProgram = Externs package identifier \"{\" FieldDecls MethodDecls \"}\" .\nExterns    = { ExternDefn } .\nExternDefn = extern func identifier \"(\" [ { ExternType }+, ] \")\" MethodType \";\" .\nFieldDecls = { FieldDecl } .\nFieldDecl  = var identifier  Type \";\" .\nFieldDecl  = var  identifier  ArrayType \";\" .\nFieldDecl  = var identifier Type \"=\" Constant \";\" .\nMethodDecls = { MethodDecl } .\nMethodDecl  = func identifier \"(\" [ { identifier Type }+, ] \")\" MethodType Block .\nBlock = \"{\" VarDecls Statements \"}\" .\nVarDecls = { VarDecl } .\nVarDecl  = var identifier  Type \";\" .\nVarDecl =  var  identifier  ArrayType \";\" \nStatements = { Statement } .\nStatement = Block .\nStatement = Assign \";\" .\nAssign    = Lvalue \"=\" Expr .\nLvalue    = identifier | identifier \"[\" Expr \"]\" .\nStatement  = MethodCall \";\" .\nMethodCall = identifier \"(\" [ { MethodArg }+, ] \")\" .\nMethodArg  = Expr .\nStatement = if \"(\" Expr \")\" Block [ else Block ] .\nStatement =  while \"(\" Expr \")\" Block .\nStatement = return [ \"(\" [ Expr ] \")\" ] \";\" .\nStatement = break \";\" .\nStatement = continue\";\" .\n\n\nExpr : Simpleexpression\nSimpleexpression : Additiveexpression\n                  | 
Simpleexpression Relop Additiveexpression\n                  ;\nRelop : T_LEQ | '\u003c' | '\u003e' | T_GEQ | T_EQ |  T_NEQ\n      ;\nAdditiveexpression : Term\n                    | Additiveexpression Addop Term\n                    ;\nAddop : '+' | '-'\n      ;\nTerm : Factor\n     | Term Multop Factor\n     ;\nMultop : '*' | '/'  | T_AND | T_OR | T_LEFTSHIFT | T_RIGHTSHIFT\n       ;\nFactor : T_IDENTIFIER\n     |  MethodCall\n     | T_IDENTIFIER '[' Expr ']'\n     | Constant\n     | '(' Expr ')'\n     | '!' Factor\n     | '-' Factor\n       ;\n\n\n\nExternType = ( string | Type ) .\nType = ( int | bool ) .\nMethodType = ( void | Type ) .\nBoolConstant = ( true | false ) .\nArrayType = \"[\" int_lit \"]\" Type .\nConstant = ( int_lit | BoolConstant ) .\nDecaf Semantics\nProgram Structure\nA method called main has to exist in the Decaf program.\nType Checking\nMake sure the following type checks are implemented in the compiler.\n\nBinary + - * / % \u003e\u003e \u003c\u003c \u003c \u003e \u003c= \u003e= only work on integer expressions.\nBinary \u0026\u0026 || and unary ! only work on boolean expressions.\nBinary == != work on any type, but both operands have to have the same type.\nAssignment to a function parameter is valid and should change the value as for a local variable\nThe \u0026\u0026 and || operators are short-circuiting (this is already specified in the spec)\nIf you have multiple return statements in one block then only the first is used, but the others should still be type checked.\nIndexing a scalar is a semantic error. { var x int; x[0] = 1; } is a semantic error, and { var x,y int; y = x[0]; } is a semantic error, and the same if x is a field variable.\nIndexing with a bool is a semantic error. { var xs[10] int; func main() int { var x int; x = xs[true]; } is a semantic error.\nUsing a non-bool expression for a loop condition is a semantic error. 
{ while (1) {} } and { var x int; for (x = 0; 1; x = x + 1) {} } are semantic errors.\nUsing a non-bool expression in an if statement condition is a semantic error. { if (0) {} } is a semantic error.\nA return statement with an expression is not allowed in a function with a void return type. { func foo() void { return (1); } and { func bar() void {} func foo() void { return (bar()); } are both semantic errors.\nA return statement with no expression in a non-void function produces a default return value (see “Default values” section below).\nCannot use a void function in an expression. func foo() void {} func main() int { if (foo()) {} } is invalid.\nCannot call a method with the wrong number of arguments.\nFind all cases where there is a type mismatch between the definition of the type of a variable and a value assigned to that variable. e.g. var x bool; x = 10; is an example of a type mismatch.\nFind all cases where an expression is well-formed, where binary and unary operators are distinguished from relational and equality operators. e.g. true + false is an example of a mismatch but true != true is not a mismatch.\nCheck that all variables are defined in the proper scope before they are used as an lvalue or rvalue in a Decaf program.\nCheck that the return statement in a method matches the return type in the method definition. e.g. 
func foo() bool { return(10); } is an example of a mismatch.\nDefault values\nInteger variables default to zero when initialized.\nBoolean variables default to False when initialized.\nFor function return type: integer return type defaults to zero.\nFor function return type: boolean return type defaults to True.\nFor function return values, here is an example:\n\nfunc panama() bool {\n  print_string(\"Panama\");\n}\nif (flag \u0026\u0026 panama()) { // panama() will return True\n ...\n}\nScoping Rules\nThis section clarifies scoping behaviour.\n\nHaving two fields with the same name is a semantic error.\nHaving two methods with the same name is a semantic error.\nHaving a field and a method with the same name is a semantic error.\nExterns count as methods for scoping.\nHowever, having an extern function with the same name as a function inside a package is allowed (the package should define a new scope in which the local function is defined). See the example below.\nHaving two local variables with the same name declared in the same block is a semantic error. { var x int; var x int; } is an error, but { var x int; { var x int; } } is ok.\nHaving a local variable in the outer block of a method that has a parameter with the same name is a semantic error. func foo(x int) void { var x int; } is an error, but func foo(x int) void { { var x int; } } is ok.\nA function can be referred to anywhere in the program, including before its definition. package C { func foo() void { bar() }; func bar() void {}; } is ok.\nFunctions, fields, arguments, and local variables all share the same namespace (symbol table) and can shadow each other except for the above rules. e.g. in package C { func foo() void {}; func bar() void { var foo int; foo(); } } the foo in foo(); refers to the local int variable, not the function, resulting in an error.\nbreak and continue only apply to the innermost containing loop. 
Using break or continue outside of a loop results in a semantic error.\nThe following code is acceptable because of the scope defined by the package for the functions in the package. The foo call in main uses the locally scoped foo (defined inside the package).\n\nextern func foo() int;\n\npackage Scoping {\n    func foo(x int) void { { var x int; } }\n    func main() int { foo(1);  }\n}\nStatements\nThese are semantic errors that can occur when using statements in Decaf.\n\nThere are no restrictions on the type of main but a return statement inside main must match the return type of main. For a main function missing an explicit return statement, void return types return void, int returns zero, and bool returns true.\nAssigning a scalar to an array is considered a type mismatch.\nThe following produce undefined behaviour, but must not produce compile time semantic errors:\nUsing the value of any uninitialized scalar variable or array element (this is allowed in the reference implementation)\nA function with no return statement is equivalent to ending the function with a return statement that has no expression: return; (this is allowed in the reference implementation)\nAssigning to an array cell at an invalid index can produce either a compile-time or a runtime error.\nAny bool argument to an integer parameter must be converted while keeping its value, not just for print_int.\nPassing an argument to a function parameter with a different type is a semantic error except for the special case of passing a bool as an int.\nDeclaring an array of size less than or equal to zero is a semantic error.\nAssignment to a function parameter is valid and should change the value as for a local 
variable.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frustyraptor%2Fcompilers","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frustyraptor%2Fcompilers","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frustyraptor%2Fcompilers/lists"}