https://github.com/stevana/smarrow-lang

Experimental programming language where programs are state machines expressed in arrow notation
https://github.com/stevana/smarrow-lang

distributed-systems programming-language state-machines

Last synced: 2 months ago
JSON representation

Experimental programming language where programs are state machines expressed in arrow notation

Host: GitHub
URL: https://github.com/stevana/smarrow-lang
Owner: stevana
License: mit
Created: 2023-06-05T08:32:22.000Z (about 2 years ago)
Default Branch: main
Last Pushed: 2023-06-28T11:39:20.000Z (almost 2 years ago)
Last Synced: 2025-02-08T11:34:52.811Z (4 months ago)
Topics: distributed-systems, programming-language, state-machines
Language: Haskell
Homepage:
Size: 132 KB
Stars: 1
Watchers: 2
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE

Awesome Lists containing this project

README

# smarrow-lang

`smarrow-lang` is an experimental programming language where programs are state
machines expressed in [arrow
notation](https://www.haskell.org/arrows/syntax.html).

State machines communicate with each other by sending messages, a bit like
Erlang processes but with more restrictions. For example, a state machine may
not spawn new state machines, and its state, input and output types are
specified at compile time.

These restrictions, or extra structure, are the basis for (to be implemented)
features such as:

* Hot-code swapping;
* Type-safe state migrations;
* Input and output evolution.
* [Simulation testing](https://apple.github.io/foundationdb/testing.html) and
formal verification.

The goal of the language is to provide a testbed for experimenting with how to
program distributed systems.

## Why a new language?

Barbara Liskov gave two good reasons in her Turing award
[lecture](https://amturing.acm.org/vp/liskov_1108679.cfm) (2008):

> "Programmers write programs in programming languages, so programming language
> constructs are a very good way of communicating to programmers if you are
> trying to get them to understand how to build programs."

> "There's a funny disconnect in how we write distributed programs. You write
> your individual modules, but then when you want to connect them together
> you're out of the programming language and into this other world. Maybe we
> need languages that are a little bit more complete now, so that we can write
> the whole thing in the language."

## Getting started

```bash
$ cabal run smarrow-deploy &
$ cat example/Counter.smarr
```
```
machine Counter where

state : { count : Int = 0 }

language
: Incr -> {}
| Read -> Int

function i -> case i of
{ Incr -> do { s <- get -< {}; put -< { count = s.count + 1 } }
; Read -> get -< {}
}
```
```bash
$ cabal run smarrow -- deploy counter example/Counter.smarr
Deployed: counter
$ cabal run smarrow -- invoke counter Incr
{}
$ cabal run smarrow -- invoke counter Read
1
```
```diff
$ diff -u example/Counter.smarr example/Counter2.smarr
- put -< { count = s.count + 1 }
+ put -< { count = s.count + 2 }
$ cabal run smarrow -- deploy counter example/Counter2.smarr
Upgraded: counter
```
```bash
$ cabal run smarrow -- invoke counter Read
1
$ cabal run smarrow -- invoke counter Incr
{}
$ cabal run smarrow -- invoke counter Read
3
$ fg
cabal run smarrow-deploy
^C
```

## Features

### First-class state machines

In functional programming languages functions are first-class. In `smarrow` we
experiment with making state machines first-class instead.

### Arrow notation

State machines are to `smarrow` what functions are to functional programing
languages. A state machine has the type:

```haskell
input -> state -> (state, output)
```

It's a bit clunky to have to thread through the state explicitly though. We
could have used a state monad and thereby get do-notation, which would improve
the egonomics of writing state machines, but in `smarrow` we go down an other
path and use arrow notation instead for reasons we shall explain next.

### Hot-code swapping

Conal Elliott's work on [compiling to
categories](http://conal.net/papers/compiling-to-categories/) shows how `Arrow`s
that don't use `arr` can be compiled to first-order combinators (Cartesian
closed categories).

We use this fact to compile our state machines (which are `Arrow`s that don't
use `arr`) to a bytecode inspired by Cartesian closed categories. This bytecode
can be sent over the write to remote nodes and update state machines without
downtime, similar to hot-code swapping à la Erlang.

### Adapters

State machines are pure, if we want them to do something useful we must hook
them up to inputs from the "real world". Adapters allow us to turn, for example,
an HTTP server request into an input which we can then feed to a state machine
and reply to the request using the output that the state machine produced.

Similary console I/O or a REPL could be used as adapters.

### Simulation testing

Another useful adaptor is to deploy a bunch of communicating state machines on a
fake network where all messages are kept in an in-memory priorty queue sorted by
arrival time. We can then simulate the network, by popping the queue, advancing
the clock to the arrival time, feed it to the receiving state machine, collect
the outputs, generate arrival times for the outputs and feed them back into the
queue. The arrival times are generated using a pseudo number generater fed with
a seed, this allows us to determinstically get different interleaving of
messages. The fact that we advance the clock to the arrival time makes timeouts
happen without having to wait for them, which is useful when combined with fault
injection of network faults. Essentially this gives us a discrete-event
simulation, which is much faster than testing using Jepsen and is determinstic
too!

### Time-travelling debugger

If we record all inputs that we feed the state machines, then we can implement a
time-travelling debugger by merely feeding the inputs one-by-one to the state
machines and observing how the states change over time.

We can use snapshots and fixed sized ring-buffers of inputs in case there are
too many inputs to store then all.

### Deployments via supervision trees

XXX:

### No modules or packages

A state machine is a unit of deployment, unlike a (pure) function. A state
machine must be under version control before it can be deployed, it's therefore
globally uniquely identifiable and therefore there's no need to further package
it up.

State machines carry automatically checked semantic versioning and other
metadata making it possible to search for state machines, subscribe to
(security) updates for ones that you've deployed, etc.

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/stevana/smarrow-lang

Awesome Lists containing this project

README