An open API service indexing awesome lists of open source software.

https://github.com/8dcc/sl

Simple Lisp interpreter
https://github.com/8dcc/sl

evaluator interpreter lisp lisp-interpreter parser simple

Last synced: 24 days ago
JSON representation

Simple Lisp interpreter

Awesome Lists containing this project

README

        

#+title: Simple Lisp
#+options: toc:nil
#+startup: showeverything
#+author: 8dcc

#+TOC: headlines 2

Simple Lisp interpreter made in C with no dependencies.

Articles I wrote about Lisp, and this interpreter in particular:

- [[https://8dcc.github.io/programming/cons-of-cons.html][The pros and cons of Cons]], specifically about this implementation.
- [[https://8dcc.github.io/programming/conditional-lisp-macros.html][Replacing conditional Lisp primitives with macros]], more of a proof of concept
than anything practical.
- [[https://8dcc.github.io/programming/understanding-y-combinator.html][Understanding the Y combinator]], about lambda calculus and Lisp in general.
- [[https://8dcc.github.io/programming/pool-allocator.html][Writing a simple pool allocator in C]], for my [[https://github.com/8dcc/libpool][libpool]] project, but also used in
this interpreter.

Some good resources for learning about Lisp:

- [[https://mitp-content-server.mit.edu/books/content/sectbyfn/books_pres_0/6515/sicp.zip/index.html][Structure and Interpretation of Computer Programs]] by Hal Abelson and
Gerald Jay Sussman.
- [[https://paulgraham.com/acl.html][ANSI Common Lisp]] by Paul Graham.
- The [[https://www.scheme.org/][Official Scheme website]].
- The [[https://groups.csail.mit.edu/mac/ftpdir/scheme-7.4/doc-html/scheme_toc.html][MIT Scheme Reference]].
- The [[https://www.gnu.org/software/guile/manual/][GNU Guile manual]].
- The [[https://conservatory.scheme.org/schemers/Documents/Standards/R5RS/HTML/][Scheme R5RS Specification]].
- [[https://www.gnu.org/software/emacs/manual/html_mono/eintr.html][An Introduction to Programming in Emacs Lisp]] by Robert J. Chassell.
- [[https://www.buildyourownlisp.com/][Build Your Own Lisp]] by Daniel Holden, although this project doesn't
follow it.

* Building

#+begin_src console
$ git clone https://github.com/8dcc/sl
$ cd sl
$ make
...
#+end_src

* Usage

The full manual can be found in the [[file:doc/sl-manual.org][doc]] directory. When running =make= from that
directory, the initial Org file is exported to =.texi= using Emacs, and then from
=.texi= to =.pdf= and =.html= using =makeinfo=. Note that both Emacs and =makeinfo= can
export the original Org file to more formats, if needed.

The [[file:test/][test]] folder also contains the Lisp code used for testing the interpreter.

#+begin_src console
$ ./sl
sl> (+ 1 2 (- 5 4) (* 3 4) 10)
26

sl> (cons 'a 'b)
(a . b)

sl> (cons 'a '(b . (c . nil)))
(a b c)

sl> (define my-global 10.0)
10.000000

sl> (defun my-func (x y)
(* my-global (+ x y)))

sl> (my-func 3 4)
70.000000

sl> `(a b ,my-global c d ,@(list 1 2 3))
(a b 10.000000 c d 1 2 3)

sl> (defmacro inc (var-name)
`(define ,var-name (+ ,var-name 1)))

sl> (inc my-global)
11.000000

sl> my-global
11.000000
#+end_src

* Overview of the code

This Lisp has a few components that are responsible for their own data. This is
the basic [[https://en.wikipedia.org/wiki/Read%E2%80%93eval%E2%80%93print_loop][REPL]] process:

1. The user input is read using =read_expr()=, defined in [[file:src/read.c][read.c]]. This function
will read a single Lisp expression across lines, and save the data in an
allocated string.
2. The raw user input is converted into an array of =Token= structures using the
=tokenize()= function, which calls the static function =get_token()=. This step
might be redundant for a simple language as Lisp, but I decided to do it
anyway. After this step the =Token= array could be printed using =token_print()=,
if needed. All these functions are defined in [[file:src/lexer.c][lexer.c]].
3. The =Token= array is converted into a linked list of =Expr= structures using the
=parse()= function from [[file:src/parser.c][parser.c]]. This linked list is essentially the
[[https://en.wikipedia.org/wiki/Abstract_syntax_tree][Abstract Syntax Tree]] (AST). At this point, nothing has been evaluated; it
should just be a different representation of the user input. All the values
allocated by =tokenize()= have been copied into the AST instead of reused, so
they can be freed safely.
4. We evaluate the expression using =eval()=, defined in [[file:src/eval.c][eval.c]]. This function
will return another linked list of =Expr= structures but, just like =parse()=, it
will not reuse any data in the heap, so the old =Expr*= can be freed
safely. This is the rough evaluation process:
1. Before evaluating the expression, it checks if the current expression is a
[[https://web.mit.edu/6.001/6.037/sicp.pdf#subsection.4.1.1][Special Form]], which are evaluated differently from procedure calls. For
example =quote=, =lambda= or =if=. The arguments of these special forms are
passed to the C primitives (defined in [[file:src/primitives.c][primitives.c]]) un-evaluated.
2. If it wasn't a special form, the expression type is checked. Some
expressions like numbers evaluate to themselves, while some others
don't.
3. If the expression was a symbol, the environment is searched to find the
expression bound to that symbol. This is done with the =env_get()= function,
defined in [[file:src/env.c][env.c]].
4. If the expression was a list, it is evaluated as a procedure or macro
call. This is the evaluation process of a call:
1. The first element of the list is evaluated. It should return either a
lambda, a macro or a C primitive.
2. If the function was a macro, the arguments are *not* evaluated and they
are passed to =macro_call()=, defined in [[file:src/lambda.c][lambda.c]]. From there, the macro
is expanded with =macro_expand()= and the expanded expression is
evaluated and returned. For more information on how macros behave in
this Lisp, see [[https://www.gnu.org/software/emacs/manual/html_node/elisp/Macros.html][Emacs Lisp manual]].
3. If the function was not a macro, each argument should be evaluated
before calling the function. This is done with the =eval_list()= static
function. Then the function is applied to the arguments with =apply()=,
also defined in [[file:src/eval.c][eval.c]].
4. The function type is checked inside =apply()=. If it's a C primitive, the
function pointer stored in the =Expr= is called with the arguments we got
from =eval()=. If it's a lambda, it is called using =lambda_call()=,
defined in [[file:src/lambda.c][lambda.c]].
5. The =lambda_call()= function operates on the =LambdaCtx= structure of the
=Expr=. It binds each formal argument to the lambda's environment; sets
the parent environment (so the body can access globals); and evaluates
each expression in the body in order, returning the last one.
5. After that, we print the evaluated expression using =expr_print()=, defined in
[[file:src/expr.c][expr.c]].

* Todo list

These are some things that need to be done. Feel free to make a PR if you want
to contribute.

** Tail-call optimization

The following code defines a /recursive procedure/ that performs an /iterative
process/.

#+begin_src lisp
(defun sum-iter (i end total)
(if (> i end)
total
(sum-iter (+ i 1)
end
(+ total i))))

(sum-iter 1 5 0) ; 15
#+end_src

Even though that /procedure/ is recursive, since it calls itself, the /process/ is
iterative, because it has all the necessary information for continuing the
computation in its parameters. The interpreter doesn't *need* to keep track of
where it was called from, it can just jump to the start of the function with the
new parameters and no information will be lost. This jump optimization is called
/tail-call optimization/, and an interpreter with this feature is called
/tail-recursive/. For more information, see [[https://web.mit.edu/6.001/6.037/sicp.pdf#subsection.1.2.1][section 1.2.1 of SICP]].