https://github.com/spaghettifunk/ayec-compiler

# AyeC-Compiler
Compiler Project at Uppsala University

This project was made by Anton Weber and Davide Berdin for the Compiler Project course.
The AyeC compiler accepts a subset of the C language called uC.

- The Parser uses JavaCC (https://javacc.java.net)
- The rest of the code has been written in Java

Credits for the page go to Alexandra Jimborean.

The uC language



uC is a small subset of C, corresponding to a typical imperative,
procedural language. The following sections describe in more detail
which language elements are present.

Every uC program is also a valid C program. The syntax and semantics
of uC are the same as those of full C, within the restrictions
described here.



Lexical elements


  • Decimal integer constants and character literals.
    A character literal contains either a single printable
    character, or the \n escape sequence (line break).
    A character literal denotes an integer constant whose
    value is the representation code of the character.
  • Alphanumeric identifiers: non-empty sequences of
    letters or digits starting with a letter.
    An underscore is treated as a letter.
  • Keywords: char, else, if, int, return, void, and while.
  • Special symbols:
    !=, !, &&, (, ), *, +, , (comma), -, /, ;,
    <=, <, ==, =, >=, >, [, ], {, }.
  • White space characters: blank (32), newline (10),
    carriage return (13), form feed (12), and tab (9).
    The numbers are the ASCII representation codes for the
    characters.

  • Comments: /* followed by anything and terminated
    by */, and // followed by anything
    and terminated by end of line.
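
The identifier, keyword, and character-literal rules above can be checked with a few small C helpers. This is a minimal sketch and not code from AyeC, whose parser is generated by JavaCC. The function names `uc_is_identifier`, `uc_is_keyword`, and `uc_char_literal_value` are hypothetical, and rejecting `'` and `\` as the single character of a literal is an assumption based on the requirement that uC programs are valid C.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* uC treats an underscore as a letter. */
static int uc_is_letter(int c)
{
    return isalpha((unsigned char)c) || c == '_';
}

/* Identifier: non-empty sequence of letters or digits starting with a letter. */
int uc_is_identifier(const char *s)
{
    assert(s != NULL);
    if (!uc_is_letter(s[0]))
        return 0;                       /* also rejects the empty string */
    for (size_t i = 1; s[i] != '\0'; i++)
        if (!uc_is_letter(s[i]) && !isdigit((unsigned char)s[i]))
            return 0;
    return 1;
}

/* The seven uC keywords. */
int uc_is_keyword(const char *s)
{
    static const char *const keywords[] = {
        "char", "else", "if", "int", "return", "void", "while"
    };
    assert(s != NULL);
    for (size_t i = 0; i < sizeof keywords / sizeof keywords[0]; i++)
        if (strcmp(s, keywords[i]) == 0)
            return 1;
    return 0;
}

/* Value of a character literal given as source text, e.g. "'a'" or "'\\n'".
   Returns -1 if the text is not a uC character literal.
   Assumption: a bare quote or backslash is not accepted as the character. */
int uc_char_literal_value(const char *lit)
{
    assert(lit != NULL);
    size_t len = strlen(lit);
    if (len == 3 && lit[0] == '\'' && lit[2] == '\'' &&
        isprint((unsigned char)lit[1]) && lit[1] != '\'' && lit[1] != '\\')
        return (unsigned char)lit[1];
    if (strcmp(lit, "'\\n'") == 0)
        return '\n';
    return -1;
}
```

Every keyword also matches the identifier pattern, so a lexer built on these helpers would test `uc_is_keyword` before classifying a word as an identifier.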





Syntax


  • Primary expressions: constants, identifiers,
    function calls, array indexing, and expressions
    within parentheses.

  • Unary expressions with the ! and -
    unary operators.

  • Binary expressions with the +, -,
    *, /, <, >, <=, >=, ==, !=,
    &&, and = operators.

  • Statements: expression statements, the empty statement,
    if statements with or without else,
    while statements, return statements,
    and compound statements (blocks), i.e., statements
    enclosed in { }.

    Local variable declarations are only permitted at the
    top-level function body block, not in nested blocks.


  • Variable declarations: base type followed by
    variable name, and for arrays followed by the
    array size (an integer constant) in square brackets.

    Multi-dimensional arrays, pointers, and structures
    are not included.

    Initializers in variable declarations are not included.


  • Function definitions: return type or void,
    function name, parameter list, and body (compound
    statement) in that order.

    The parameter list is either void, meaning no parameters, or a
    sequence of variable declarations separated by , (comma).
    An array parameter in a function head is written without array
    size, i.e., with only the brackets.

    An external (library) function can be declared by writing
    a function definition without body, terminated with a
    ; (semi-colon).

    Variable-arity functions are not included.
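
The declaration forms described above can be seen together in a short fragment. It is only an illustrative sketch: the names `counter`, `buffer`, `count_nonzero`, and `next` are made up for this example and do not come from the AyeC sources.

```c
/* External (library) function: a definition without body, ended by ';'. */
void putint(int i);

/* Global variable declarations: a scalar, and an array whose size is an
   integer constant. Initializers are not allowed in uC. */
int counter;
char buffer[16];

/* An array parameter is written with empty brackets. Local declarations
   appear only at the top of the function body, before any statement. */
int count_nonzero(int n, int a[])
{
    int i;
    int c;

    i = 0;
    c = 0;
    while (i < n) {
        if (a[i] != 0)
            c = c + 1;
        i = i + 1;
    }
    return c;
}

/* The parameter list "void" means the function takes no parameters. */
int next(void)
{
    counter = counter + 1;
    return counter;
}
```

Because uC is a subset of C, the fragment also compiles unchanged with an ordinary C compiler.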





Program execution


  • Execution starts at the user-defined function main
    which takes no parameters and returns int.
    Execution ends when main returns.
  • The standard library is uC-specific: because uC excludes
    variable-arity functions, printf- and scanf-like functions
    cannot be declared.
    To use the library, include the following declarations
    at the start of your uC source file:

        void putint(int i);          // prints to stdout
        void putstring(char s[]);    // prints to stdout
        int getint(void);            // reads from stdin
        void getstring(char s[]);    // reads from stdin
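
To try a uC program with an ordinary C compiler, the four library functions can be given simple definitions on top of C's stdio. The following is a sketch under assumptions, not the library shipped with AyeC: the description above only says where each function reads or writes, so the exact output format of `putint` (no trailing newline here), the value `getint` returns on bad input (0 here), and whether `getstring` reads a word or a whole line (a line here) are all guesses.

```c
#include <stdio.h>

/* Prints i in decimal to stdout. Assumption: no newline is appended. */
void putint(int i)
{
    printf("%d", i);
}

/* Prints the NUL-terminated string s to stdout. */
void putstring(char s[])
{
    fputs(s, stdout);
}

/* Reads a decimal integer from stdin. Assumption: returns 0 on failure. */
int getint(void)
{
    int v = 0;
    if (scanf("%d", &v) != 1)
        return 0;
    return v;
}

/* Reads characters from stdin up to end of line or end of input.
   Assumption: line-oriented. The uC signature carries no length, so no
   bounds check is possible; the caller must supply a large enough array. */
void getstring(char s[])
{
    int c;
    int n = 0;
    while ((c = getchar()) != EOF && c != '\n')
        s[n++] = (char)c;
    s[n] = '\0';
}
```

Linking a uC source file that contains the four declarations against this file gives a native executable that can be compared with the compiler's output.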



Notes


  • Array parameters to functions are passed by reference,
    as in full C, but the formal parameter still behaves
    like an array variable and not as a pointer variable.
    For example:

        void f(int a[], int b[])
        {
            a[3] = 27;  // legal
            a = b;      // illegal in uC, legal in full C
        }
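
The by-reference behaviour means that a callee's writes to an array parameter are visible to the caller. A minimal uC-style sketch, with a hypothetical function name `fill`:

```c
/* Writes v into the first n elements of the caller's array. */
void fill(int n, int a[], int v)
{
    int i;

    i = 0;
    while (i < n) {
        a[i] = v;       /* changes the caller's array, not a copy */
        i = i + 1;
    }
}
```

After a call such as `fill(3, x, 27)`, the first three elements of the caller's `x` hold 27.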




Example

    /* This is an example uC program. */

    void putint(int i);

    int fac(int n)
    {
        if (n < 2)
            return n;
        return n * fac(n - 1);
    }

    int sum(int n, int a[])
    {
        int i;
        int s;

        i = 0;
        s = 0;
        while (i < n) {
            s = s + a[i];
            i = i + 1;
        }
        return s;
    }

    int main(void)
    {
        int a[2];

        a[0] = fac(5);
        a[1] = 27;
        putint(sum(2, a)); // prints 147
        return 0;
    }



Informal grammar for uC

This is an informal context-free grammar for uC:


  • The start symbol is program.
  • Keywords and special symbols are written within double-quotes.

  • /empty/ denotes the empty string.

  • intconst and ident denote classes of lexical elements.
  • Associativity and precedence for expression operators are not expressed.
  • The grammar has not been adjusted to fit any particular parsing method.

    program      ::= topdec_list
    topdec_list  ::= /empty/ | topdec topdec_list
    topdec       ::= vardec ";"
                   | funtype ident "(" formals ")" funbody
    vardec       ::= scalardec | arraydec
    scalardec    ::= typename ident
    arraydec     ::= typename ident "[" intconst "]"
    typename     ::= "int" | "char"
    funtype      ::= typename | "void"
    funbody      ::= "{" locals stmts "}" | ";"
    formals      ::= "void" | formal_list
    formal_list  ::= formaldec | formaldec "," formal_list
    formaldec    ::= scalardec | typename ident "[" "]"
    locals       ::= /empty/ | vardec ";" locals
    stmts        ::= /empty/ | stmt stmts
    stmt         ::= expr ";"
                   | "return" expr ";" | "return" ";"
                   | "while" condition stmt
                   | "if" condition stmt else_part
                   | "{" stmts "}"
                   | ";"
    else_part    ::= /empty/ | "else" stmt
    condition    ::= "(" expr ")"
    expr         ::= intconst
                   | ident | ident "[" expr "]"
                   | unop expr
                   | expr binop expr
                   | ident "(" actuals ")"
                   | "(" expr ")"
    unop         ::= "-" | "!"
    binop        ::= "+" | "-" | "*" | "/"
                   | "<" | ">" | "<=" | ">=" | "!=" | "=="
                   | "&&"
                   | "="
    actuals      ::= /empty/ | expr_list
    expr_list    ::= expr | expr "," expr_list

Expression operator precedence table

Prefix unary operators

    14:  - !

Infix operators

    13L: * /
    12L: + -
    10L: < > <= >=
     9L: == !=
     5L: &&
     2R: =

The numbers to the left indicate precedence; larger numbers indicate
higher precedence. L indicates left-associative operators and R
indicates right-associative operators. The table only contains the C
operators that are included in uC.
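
One common way to turn such a table into a parser is precedence climbing, sketched below as a small expression evaluator in C. This is not how AyeC parses (its parser is generated by JavaCC, and the approach used there is not described here). Everything in the sketch is hypothetical: the names `Parser`, `Operand`, `uc_eval`, the restriction to integer constants and single-letter variables `a` to `z`, and the simplifications that `&&` evaluates both operands and that a parenthesized variable is rejected as an assignment target.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

typedef struct { const char *p; int *vars; int error; } Parser;
typedef struct { int value; int var; } Operand;  /* var: 0..25, or -1 if not a variable */
typedef struct { const char *text; int prec; } BinOp;

/* Infix operators with the precedences from the table; two-character
   operators come first so that "<=" is not read as "<" followed by "=". */
static const BinOp OPS[] = {
    {"<=", 10}, {">=", 10}, {"==", 9}, {"!=", 9}, {"&&", 5},
    {"*", 13}, {"/", 13}, {"+", 12}, {"-", 12}, {"<", 10}, {">", 10}, {"=", 2},
};

static void skip_space(Parser *ps)
{
    while (isspace((unsigned char)*ps->p))
        ps->p++;
}

/* Index into OPS of the operator at the current position, or -1. */
static int peek_binop(Parser *ps)
{
    skip_space(ps);
    for (size_t i = 0; i < sizeof OPS / sizeof OPS[0]; i++)
        if (strncmp(ps->p, OPS[i].text, strlen(OPS[i].text)) == 0)
            return (int)i;
    return -1;
}

static int apply(const char *op, int a, int b, Parser *ps)
{
    switch (op[0]) {
    case '*': return a * b;
    case '/': if (b == 0) { ps->error = 1; return 0; } return a / b;
    case '+': return a + b;
    case '-': return a - b;
    case '<': return op[1] ? a <= b : a < b;
    case '>': return op[1] ? a >= b : a > b;
    case '=': return a == b;        /* "==": plain "=" is handled by the caller */
    case '!': return a != b;
    default:  return a && b;        /* "&&", without short-circuit evaluation */
    }
}

static Operand parse_expr(Parser *ps, int min_prec);

/* Prefix operators (precedence 14) bind tighter than every infix operator. */
static Operand parse_unary(Parser *ps)
{
    Operand r = {0, -1};
    skip_space(ps);
    char c = *ps->p;
    if (c == '-' || c == '!') {
        ps->p++;
        r.value = parse_unary(ps).value;
        r.value = (c == '-') ? -r.value : !r.value;
    } else if (c == '(') {
        ps->p++;
        r.value = parse_expr(ps, 0).value;
        skip_space(ps);
        if (*ps->p == ')') ps->p++; else ps->error = 1;
    } else if (isdigit((unsigned char)c)) {
        while (isdigit((unsigned char)*ps->p))
            r.value = r.value * 10 + (*ps->p++ - '0');
    } else if (c >= 'a' && c <= 'z') {
        r.var = c - 'a';
        r.value = ps->vars[r.var];
        ps->p++;
    } else {
        ps->error = 1;
    }
    return r;
}

/* Parses operators whose precedence is at least min_prec. */
static Operand parse_expr(Parser *ps, int min_prec)
{
    Operand lhs = parse_unary(ps);
    for (;;) {
        int i = peek_binop(ps);
        if (i < 0 || OPS[i].prec < min_prec || ps->error)
            return lhs;
        ps->p += strlen(OPS[i].text);
        if (OPS[i].prec == 2) {     /* "=": right-associative, same level on the right */
            Operand rhs = parse_expr(ps, OPS[i].prec);
            if (lhs.var < 0) { ps->error = 1; return lhs; }
            ps->vars[lhs.var] = rhs.value;
            lhs.value = rhs.value;
        } else {                    /* left-associative: right side must bind tighter */
            Operand rhs = parse_expr(ps, OPS[i].prec + 1);
            lhs.value = apply(OPS[i].text, lhs.value, rhs.value, ps);
        }
        lhs.var = -1;
    }
}

/* Evaluates src using vars[0..25] for a..z. Returns 1 and stores the
   result in *out on success, 0 on a syntax error. */
int uc_eval(const char *src, int vars[26], int *out)
{
    assert(src != NULL && vars != NULL && out != NULL);
    Parser ps = { src, vars, 0 };
    Operand r = parse_expr(&ps, 0);
    skip_space(&ps);
    if (ps.error || *ps.p != '\0')
        return 0;
    *out = r.value;
    return 1;
}
```

The only difference between the two associativities is the minimum precedence passed for the right operand: `prec + 1` makes `10 - 4 - 3` group as `(10 - 4) - 3`, while passing `prec` unchanged makes `a = b = 5` group as `a = (b = 5)`.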