Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

https://github.com/alex-ozdemir/rlfsc


https://github.com/alex-ozdemir/rlfsc

Last synced: about 1 month ago
JSON representation

Lists

README

        

# Rust implementation of LFSC

## Installation Instructions

1. Install the Rust toolchain using [Rustup](https://rustup.rs/).
* Be sure to add Cargo's bin directory to your path
* i.e. by adding `export PATH="$HOME/.cargo/bin/:$PATH"` to your rc
file.
2. Run `cargo install rlfsc`

## Possible improvements over the original LFSC

* Tokenizer and checker are separate.
* Tokenizer is **not** character-by-character
* It's implemented using the `logos` library.
* Automatically reference-counted expressions using `std::rc::Rc`.
* No gotos/crazy stuff

## State of Affairs

On our small test proof (`pf`), we're ~30% slower than LFSC (10.8ms vs 8.0ms)
on my machine.

On the CVC4 signatures (`sig`), we're ~30% faster than LFSC (3.0ms vs 4.4ms)
on my machine.

This suggests that our tokenization is indeed better, but our checking is
slower.

We do lots of recursion, without tail recursion, causing us to blow small
stacks.

## Type checking algorithm

It's based on to Aaron's 2008 manuscript, "Proof Checking Technology for
Satisfiability Modulo Theories", but, in general, has no optimizations.

Notably:

* We always construct an AST for checked terms.
* In-scope identifiers are kept in a global map
* When an abstraction expression is instantiated, we substitute the provide
value
* e.g. `(\x t) v` becomes `t[v / x]`, literally. I.e. we take `t` apart,
looking for all the `x`'s, replace them with `v`'s, and rebuild `t`.
* In code expressions, we do not do this.
* e.g., when we evaluate `(let x v t)`, we put `x -> v` in the environment,
and evaluate `t`, rather than substituting.

## Todos

* Tail Recursion
* C historically has had poor support for tail recursion, which is why the
original LFSC manually implemented it with gotos.
* We can't (and won't) do that in Rust.
* We should implement a trampoline-based approach
* Term destruction
* When destructing terms (auto-implemented), we go into deep recursion.
* We should manually implement some destructors to prevent this.
* Implement a streaming tokenizer.
* Right now we need to bring the whole input into memory before tokenizing.
* Option 1: Modify `logos` to allow this by implementing `logos::Source`
for `std::io::Read`.
* Doesn't quite work. We'll need to modify the requirements of `Source`
a bit.
* Option 2: Switch to a different tokenizing library.
* Perhaps `nom`? It's typically used for parsing, but it does support
streaming.
* Option 3: Roll our own.
* I think this is a bad idea. If [this blog
post](https://blog.burntsushi.net/ripgrep/) taught me anything, it's
that high-performance text processing is more complex than most
people think. I think it would be very hard to match the perf of
a dedicated library like `logos` or `nom`.