Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/radrow/radlang

A functional programming language intepreter with typeclasses, full type inference and lazy evaluation
https://github.com/radrow/radlang

haskell interpreter parsing programming-language

Last synced: 14 days ago
JSON representation

A functional programming language intepreter with typeclasses, full type inference and lazy evaluation

Awesome Lists containing this project

README

        

#+TITLE: Radlang programming language
#+AUTHOR: Radosław Rowicki

* About

Radlang is a pure functional programming language that targets to inherit most of Haskell98's features. It supports higher kinded polymorphic algebraic datatypes, full type inference, parametric and ad-hoc higher order polymorphism using typeclasses (known as intefaces) and rich syntax. The evaluation order is lazy by default, however the user is allowed to tweak it to some extent.

In the future it will probably support memory management via automated garbage collection and have read-eval-print loop.

* Build, Installation, Execution

** Build

This project uses stack, so you may want to install it as well. You can do it using your favourite package manager or by running

#+BEGIN_SRC bash
curl -sSL https://get.haskellstack.org/ | sh
#+END_SRC
(bad habit tho).

After that you can run ~stack build~ to download all dependencies and compile the interpreter.

** Install (optional)

If you want to install it locally you can run ~stack install~. The binaries will be copied to =~/.local/bin/= by default.

** Run

In case you didn't install Radlang in your ~PATH~, you can still execute the interpreter with ~stack exec rdl~. If you don't pass arguments to ~rdl~ the interpreter will read program from standard input. General usage is ~rdl filename.rdl~. Radlang will find the ~main~ value, evaluate it deeply (enforcing all lazy thunks) and print it to the stdout. ~main~ can have any type, but must be defined.

I provide two examples of programs in Radlang that can be found in ~examples~ directory:

- ~statet.rdl~ – this one shows off the support for typeclasses. The program computes the n-th fibonacci number in safe manner using monadic stack of ~StateT~ over ~OptionT~ over ~Id~. Implementation takes big advantage of interface inheritance and for-comprehensions (known in Haskell as do-notation) to be more readable and flexible. ~OptionT~ layer takes care of bad (negative) argument handling – technically this is an overkill, but it nicely presents how works the monadic action lifting.

- ~radlang-junior.rdl~ – Radlang Junior is a tiny imperative programming language interpreter written in Radlang. It is shipped with complete parser and CPS (continuation-passing-style) evaluator. Features static binding of variables, loops and if-statements. In the file there are multiple ~main~ definitions that test several programs.

* Syntax

#+BEGIN_SRC bnf

kind ::= "Type" | kind "->" kind | "(" kind ")"

general_typename ::= "~" upper_case_string

type ::= general_typename (* type var *)
| upper_case_string (* rigid type *)
| type "->" type (* function type *)
| type {type}+ (* type application *)
| "(" type ")"

predicate ::= type "is" upper_case_string

qualified_type ::= [predicate {"," predicate}* "|"] type

literal ::= signed number
| "'" character "'"
| "\"" {escaped_character}* "\""

pattern ::= lower_case_string (* var name *)
| "_" (* wildcard *)
| lower_case_string "@" pattern (* "as" pattern *)
| literal
| upper_case_string {pattern}*
| "(" pattern ")"

typedecl ::= lower_case_string ":" qualified_type

datadef ::= lower_case_string [pattern] ":=" expr

binding ::= datadef | typedecl

for_unit ::= pattern "<-" expr | expr

expr ::= lower_case_string (* variable *)
| upper_case_string (* constructor *)
| literal
| expr {expr}+ (* application *)
| "let" binding {"|" assignment}* "in" expr
| {"if" expr "then" expr}+ "else" expr (* multi way if *)
| "\ " pattern "->" expr (* lambda *)
| "match" expr "with" {"|" pattern "->" expr}+ (* pattern match *)
| "for" "{" [for_unit {"|" for_unit}*] "}" expr (* monadic for comprehension *)
| "[" [expr {"," expr}*] "]" (* list sugar *)
| "(" expr ")"

constructor_def ::= upper_case_string {type}*

newtype ::= "newtype" upper_case_string {"(" general_typename ":" kind ")"}* ":=" {constructor_def}*

interface ::=
"interface" upper_case_string "(" general_typename ":" kind ")"
["implies" upper_case_string {upper_case_string}*] (* superclasses *)
"{"
{typedecl ";;"}*
"}"

impl ::= (* interface implementation *)
"impl" qual_type "for" upper_case_string
"{"
{datadef ";;"}*
"}"

program ::= {(newtype | typedecl | datadef | interface | impl) ";;"}*

#+END_SRC

* Overview

** Basic definitions and examples

The program is a set of data definitions, type declarations, newtype definitions, interface declarations and implementations of the interfaces. Program must contain `main` value definition of any type – it will be the point where the evaluation starts. `main` will be deeply forced and will not contain any unevaluated thunk.

*** Hello world

#+BEGIN_SRC ocaml
main := "hello world";;
#+END_SRC

*** Use of toplevel function and ~if~ statement

#+BEGIN_SRC ocaml
identity x := x;;

main := if identity True then identity False else False;;
#+END_SRC

*** Type declaration and pattern matching

#+BEGIN_SRC ocaml
fun : Int -> Bool;;
fun 0 := True;;
fun x = match x with
| 4 -> False
| _ -> eqInt x 7
;;
#+END_SRC

Note that no variable may appear twice in a single set of patterns. Differing numbers of function arguments are not supported, so following code **won't** pass the syntax check:

#+BEGIN_SRC ocaml
f x := 1;;
f := const 2;;
#+END_SRC

*** For comprehension (monads <3)
This construction uses implicitly ~bind~ function like in Haskell's ~do~ notation:

#+BEGIN_SRC ocaml
main := for
{ x <- x_m
| y <- y_m
| guard (gtInt y x)
} unit (plusInt x y)
;;
#+END_SRC

*** Type declarations and data definitions

#+BEGIN_SRC ocaml
x : Int;;

notEq : ~A is Eq -> ~A -> ~A -> Bool;;

mplus : ~A is Semigroup, ~M is Monad | ~M ~A -> ~M ~A -> ~M ~A;;
#+END_SRC

In contrary to Haskell one may define variable without any value. To do so, the programmer must declare its type only:

#+BEGIN_SRC ocaml
bot : ~A;;

bot_int : Int;;
#+END_SRC

** New type definition

Types can be defined in a similar way to ADTs known from Haskell. However, the programmer must explicitly provide kind annotation for every type-argument:

#+BEGIN_SRC ocaml
newtype Bool := True | False;;

newtype List (~A : Type) := Nil | Cons ~A (List ~A);;

newtype StateT (~S : Type) (~M : Type -> Type) (~A : Type) :=
StateT (~S -> ~M (Pair ~S ~A))
#+END_SRC

Such data may be deeply pattern matched:

#+BEGIN_SRC ocaml
f l := match l with
| Nil -> 0
| Cons 3 (Cons x _) -> 1
| _ -> 2
#+END_SRC

** Laziness management

Every expression is evaluated lazily, that means following code

#+BEGIN_SRC ocaml
bot : ~A;;

main := (\a b -> a) 3 bot;;
#+END_SRC

will successfully return 3. However there are two built in functions that are able to interfere this behavior:

- ~force : ~A -> ~B -> ~B~ – forces its first argument to WHNF and returns the second (just like Haskell's ~seq~)
- ~deepForce : ~A -> ~B -> ~B~ – deeply forces all possible parts of the first argument

~main~ function will always implicitly call ~deepForce~ on its value. Examples:

#+BEGIN_SRC ocaml
test (Cons _ _) := True;;
test _ := False;;
bot : ~Any;;

main0 := test Nil;; -- False

main1 := test bot;; -- out of domain error

main2 := test (Cons bot bot);; -- True

main3 := force bot True;; -- out of domain error

main5 := let x := Cons bot bot in True;; -- True

main4 := force (Const bot bot) True;; -- True

main6 := deepForce (Cons bot bot) True;; -- out of domain error
#+END_SRC

** Interfaces

Interfaces are technically typeclasses. They may inherit each from other as long as they do not form cycles (that would be a true deviation).

One of the most fancy features of this system is that implementation may provide definitions for any upper interface in an inheriting interface:

#+BEGIN_SRC ocaml
interface Semigroup (~S : Type) {
plus : ~S -> ~S -> ~S;;
};;

interface Monoid (~S : Type) implies Semigroup {
null : ~S;;
};;

impl Int for Monoid {
plus := plusInt;;
null := 0;;
};;
#+END_SRC

** Stacktrace

Running program will keep two different stacktraces to ease debugging. The stacktraces will appear on every runtime error. For motivation of having two stacktraces, consider the following code:

#+BEGIN_SRC ocaml
main :=
let x := f 1
in g x;;

f x := divInt x 0;; -- division by zero!

g x := deepForce x x;;
#+END_SRC

This program will surely return an error, but what would its stacktrace be is a bit questional. In strict languages it would be something like – ~[main, f, divInt]~, because this is the place where runtime failed. However, because Radlang is lazy the error will be thrown at ~[deepForce, divInt]~. In order to solve this ambiguity Radlang provides both of them – first one is called "definition stacktrace", and the second one is "evaluation stacktrace".

One may wonder what happened to the ~g~ function in the evaluation stacktrace. The answer is, ~g~ was fully evaluated and returned thunk that contains ~deepForce x x~ with ~x~ assigned to ~f 1~. The error was actually thrown out of the main function so it is not mentioned either.

* What may be included, but doesn't have to

- infix operators
- explicitly typed GADTs
- garbage collection
- type aliasses
- REPL
- tensorflow bindings

* MIMUW course grade expectation

I expect to get maximum number of points if I finish all the features declared here (excluding "what may be included" section). Typeclasses are quite complicated in terms of typechecking and semantics, so in my opinion I would deserve it. I am going to take care over the syntax and in the end my target is to provide software that could be used for educational purposes.

* Special thanks

Special thanks to Wojciech Jabłoński, Krzysztof Pszeniczny and Marcin Benke who showed me necessary papers and algorithms without which writing this would be a total hell.