Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/zellyn/gocool

Go implementation of the Cool programming language
https://github.com/zellyn/gocool

compiler cool coursera go

Last synced: about 2 months ago
JSON representation

Go implementation of the Cool programming language

Awesome Lists containing this project

README

        

* gocool

#+ATTR_HTML: title="No Maintenance Intended"
[[http://unmaintained.tech/][file:http://unmaintained.tech/badge.svg]]

Go implementation of the [[http://theory.stanford.edu/~aiken/software/cool/cool.html][Cool programming language]].

This repository contains a Golang implementation of the homeworks for
the [[https://www.coursera.org/course/compilers][Coursera compilers class]].

* Parts

The compiler is divided into four parts, corresponding to the four
programming homework assignments: lexing, scanning, typechecking, and
compilation. The four shell scripts testlexer.sh, testparser.sh,
testtypes.sh, and testcgen.sh test each of these four parts. The test
cases are obtained by running the "grading.pl" script for each
homework, and copying the test input and output. Minor changes have
been made, as advised in the homework instructions, typically with
regard to exact line numbers, although an effort has been made to
match the examples.

** Lexer
The lexer is a simple hand-written lexer, defined in parser/lex.go. It
is a modified version of the [[http://golang.org/src/pkg/text/template/parse/lex.go][go template lexer]], copied and
modified. For more background, read or watch Rob Pike's excellent
talk, [[https://google.com/search?q=lexical%20scanning%20in%20go][Lexical Scanning in Go]].

** Parser
The parser is a [[http://dinosaur.compilertools.net/yacc/][yacc]]-generated LALR(1) parser. Similarities to [[https://code.google.com/p/rsc/source/browse/cc/][Russ
Cox's C parser]] are due to the fact that it was frequently studied
while wrestling with yacc. The AST classes in ast.go were defined by
hand, and parser/dump.go contains functions that print them in the
format used by the C++/Java Cool scaffolding code provided in the
course materials.

** Typechecking
There's nothing interesting about the typechecking code: it was simply
implemented bit by bit until all the tests passed. It does a bit of
work (eg. symbol table generation) that could be considered part of
code generation.

** Compilation
The compiler (main code in cgen/cgen.go and cgen/expr.go) is about as
simple as possible: a stack machine with an accumulator ($a0), and
"self" saved in $s0 for easy reference. There are just a couple of
things not suggested in the homework assignment: the "isa"
tag-checking routine, and the embedding of prototype object addresses
and initialization functions into the dispatch tables.

No effort was made to do any optimizations or to use more registers.

testcgen.sh runs all the tests three times: once with no garbage
collection, once with garbage collection turned on, and once with both
garbage collection and collect-after-every-allocation debugging turned
on. If you set the environment variable NOSLOW before running it, it
will skip the third phase.

* Supplementary notes
** MIPS registers
|----------------+--------------+------------------------------------+-------------------|
| Number | Name | Function | Callee preserves? |
|----------------+--------------+------------------------------------+-------------------|
| $0 | zero | always has value zero | |
| $1 | at | assembler temporary | |
| $2-$3 | v0,v1 | function results. Can be temporary | |
| $4-$7 | a0,a1,a2,a3 | function arguments | |
| $8-$15,$24,$25 | t0,...,t7,t9 | temporary registers | |
| $16-$23 | s0,...,s7 | saved registers | Yes |
| $26,$27 | k0,k1 | Reserved for OS kernel | |
| $28 | gp | global pointer | Yes |
| $29 | sp | stack pointer | Yes |
| $30 | fp | frame pointer | Yes |
| $31 | ra | return address | |
|----------------+--------------+------------------------------------+-------------------|

** Code generation
Stack machine with accumulator.

Accumulator: $a0: holds result of last expression evaluation.
$s0 holds a stored copy of "self".
$sp points to next location in stack (so $sp+4 is the top of the stack)
$t0, $t1, $t2 are temporary registers.

** Activation Record
- Return address
- Old frame pointer
- n arguments
- NT(e) temporary locations

| arg_1 | |
| ⋮ | |
| arg_n | |
|----------------+-----------------|
| Old FP | <-- SP on entry |
| Return address | |
| Temp NT(e) | |
| ⋮ | |
| Temp 1 | <-- New FP |
| | <-- New SP |

** Object layout

| -4 | FFFF - garbage collector tag |
| 0 | class tag |
| 4 | object size (in 32-bit words) |
| 8 | dispatch pointer |
| 12 | attributes |
| ⋮ | ⋮ |

Strings:

| 12 | Length: pointer to an Int |
| 16 | Characters, nul-terminated, zero-padded to next word |
| ⋮ | ⋮ |