Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/egison/egison-haskell

Template Haskell Implementation of Egison Pattern Matching
https://github.com/egison/egison-haskell

backtracking egison haskell non-linear-pattern pattern-matching

Last synced: about 4 hours ago
JSON representation

Template Haskell Implementation of Egison Pattern Matching

Awesome Lists containing this project

README

        

# miniEgison: Template Haskell Implementation of Egison Pattern Matching
[![Build Status](https://travis-ci.org/egison/egison-haskell.svg?branch=master)](https://travis-ci.org/egison/egison-haskell)

This Haskell library provides the users with the pattern-matching facility against non-free data types.
Non-free data types are data types whose data have no standard forms.
For example, multisets are non-free data types because the multiset {a,b,b} has two other equivalent but literally different forms {b,a,b} and {b,b,a}.
This library provides the pattern-matching facility that fulfills the following three criteria for practical pattern matching for non-free data types: (i) non-linear pattern matching with backtracking; (ii) extensibility of pattern-matching algorithms; (iii) ad-hoc polymorphism of patterns.

The design of the pattern-matching facility is originally proposed in [this paper](https://arxiv.org/abs/1808.10603) and implemented in [the Egison programming language](http://github.com/egison/egison/).

## Grammar

This library provides two syntax constructs, `matchAll`, `match`, `matchAllDFS`, and `matchDFS` for advanced pattern matching for non-free data types.

```
e ::= hs-expr -- arbitrary Haskell expression
| matchAll e e [C, ...] -- match-all expression
| match e e [C, ...] -- match expression
| matchAllDFS e e [C, ...] -- match-all expression
| matchDFS e e [C, ...] -- match expression
| Something -- Something built-in matcher

C ::= [mc| p -> e |] -- match clause

p ::= _ -- wildcard pattern
| $v -- pattern variable
| #e -- value pattern
| ?e -- predicate pattern
| (p_1, p_2, ..., p_n) -- tuple pattern
| [p_1, p_2, ..., p_n] -- collection pattern
| p & p -- and-pattern
| p | p -- or-pattern
| !p -- not-pattern
| c p_1 p_2 ... p_n -- constructor pattern
```

## Usage

### The `matchAll` expression and matchers

The `matchAll` expression evaluates the body of the match clause for all the pattern-matching results.
The expression below pattern-matches a target `[1,2,3]` as a list of integers with a pattern `cons $x $xs`.
This expression returns a list of a single element because there is only one decomposition.

```hs
matchAll [1,2,3] (List Integer) [[mc| $x : $xs -> (x, xs)|]]
-- [(1,[2,3])]
```

The other characteristic of `matchAll` is its additional argument matcher.
A matcher is a special object that retains the pattern-matching algorithms for each data type.
`matchAll` takes a matcher as its second argument.
We can change a way to interpret a pattern by changing a matcher.

For example, by changing the matcher of the above `matchAll` from `List Integer` to `Multiset Integer`, the evaluation result changes as follows:

```hs
matchAll [1,2,3] (Multiset Integer) [[mc| $x : $xs -> (x, xs)|]]
-- [(1,[2,3]),(2,[1,3]),(3,[1,2])]
```

When the `Multiset` matcher is used, `:` (the cons pattern) decomposes a target list into an element and the rest elements.

The pattern-matching algorithms for each matcher can be defined by users.
For example, the matchers such as `List` and `Multiset` can be defined by users.
The `Something` matcher is the only built-in matcher.
`something` can be used for pattern-matching arbitrary objects but can handle only pattern variables and wildcards.
The definitions of `List` and `Multiset` are found [here](https://github.com/egison/egison-haskell/blob/master/src/Control/Egison/Matcher.hs).
We will write an explanation of this definition in future.

### Non-linear pattern

Non-linear pattern matching is another important feature of Egison pattern matching.
Non-linear patterns are patterns that allow multiple occurrences of the same pattern variables in a pattern.
For example, the program below pattern-matches a list `[1,2,5,9,4]` as a multiset and extracts pairs of sequential elements.
A non-linear pattern is effectively used for expressing the pattern.

```hs
matchAll [1,2,5,9,4] (Multiset Integer) [[mc| $x : #(x+1) : _ -> x|]]
-- [1,4]
```

### The `match` expression

The `match` expression takes a target, a matcher, and match-clauses as the `matchAll` expression.
The `match` expression returns only the evaluation result of the first pattern-matching result.

```hs
match [1,2,5,9,4] (Multiset Integer) [[mc| $x : #(x+1) : _ -> x|]]
-- 1
```

The `match` expression is simply implemented using `matchAll` as follows:

```hs
match tgt m cs = head $ matchAll tgt m cs
```

### `matchAllDFS` and `matchDFS`

The `matchAll` and `match` expressions traverse a search tree for pattern matching in breadth-first order.
The reason of the default breadth-first traversal is because to enumerate all the successful pattern-matching results even when they are infinitely many.
For example, all the pairs of natural numbers can be enumerated by the following `matchAll` expression:

```hs
take 10 (matchAll [1..] (Set Integer)
[[mc| $x : $y : _ -> (x, y) |]])
-- [(1,1),(1,2),(2,1),(1,3),(2,2),(3,1),(1,4),(2,3),(3,2),(4,1)]
```

If we change the above `matchAll` to `matchAllDFS`, the order of the pattern-matching results changes as follows:

```hs
take 10 (matchAllDFS [1..] (Set Integer)
[[mc| $x : $y : _ -> (x, y) |]])
-- [(1,1),(1,2),(1,3),(1,4),(1,5),(1,6),(1,7),(1,8),(1,9),(1,10)]
```

There are cases where depth-first traversal is suitable because the depth-first order of pattern-matching results is preferable.
Furthermore, `matchAllDFS` is more efficient than `matchAll`.
It would be better to use `matchAllDFS` instead of `matchAll` when the both expressions can be used.

### Matcher definitions

The users can define pattern-matching algorithms for each pattern by themselves.

preparing...

```hs
matchAll (1,2) UnorderedEqlPair [[mc| uepair $x $y -> (x,y) |]]
-- [(1,2),(2,1)]

matchAll (1,2) UnorderedEqlPair [[mc| uepair #2 $x -> x |]]
-- [1]
```

A matcher is represented as a data type whose name and constructor's name is identical.

preparing...

```hs
data UnorderedEqlPair = UnorderedEqlPair
instance (Eq a) -> Matcher UnorderedEqlPair (a, a)

uepair :: (Eq a)
=> Pattern a Eql ctx xs
-> Pattern a Eql (ctx :++: xs) ys
-> Pattern (a, a) UnorderedEqlPair ctx (xs :++: ys)
uepair p1 p2 = Pattern (\_ UnorderedEqlPair (t1, t2) ->
[twoMAtoms (MAtom p1 Eql t1) (MAtom p2 Eql t2)
,twoMAtoms (MAtom p1 Eql t2) (MAtom p2 Eql t1)])
```

```hs
matchAll (1,2) (UnorderedPair Eql) [[mc| uepair $x $y -> (x,y) |]]
-- [(1,2),(2,1)]

matchAll (1,2) (UnorderedPair Eql) [[mc| upair #2 $x -> x |]]
-- [1]
```

```hs
data UnorderedPair m = UnorderedPair m
instance Matcher m a => Matcher (UnorderedPair m) (a, a)

upair :: (Matcher m a , a ~ (b, b), m ~ (UnorderedPair m'), Matcher m' b)
=> Pattern b m' ctx xs
-> Pattern b m' (ctx :++: xs) ys
-> Pattern a m ctx (xs :++: ys)
upair p1 p2 = Pattern (\_ (UnorderedPair m') (t1, t2) ->
[twoMAtoms (MAtom p1 m' t1) (MAtom p2 m' t2)
,twoMAtoms (MAtom p1 m' t2) (MAtom p2 m' t1)])
```

## Samples

### Twin primes

We can extract all twin primes from the list of prime numbers by pattern matching:

```hs
take 10 (matchAll primes (List Integer)
[[mc| _ ++ $p : #(p+2) : _ -> (p, p+2) |]])
-- [(3,5),(5,7),(11,13),(17,19),(29,31),(41,43),(59,61),(71,73),(101,103),(107,109)]
```

It is also possible to enumerate all the pairs of prime numbers whose form is (p, p+6):

```hs
take 10 (matchAll primes (List Integer)
[[mc| _ ++ $p : _ ++ #(p+6) : _ -> (p, p+6) |]])
-- [(5,11),(7,13),(11,17),(13,19),(17,23),(23,29),(31,37),(37,43),(41,47),(47,53)]
```

### Poker hand

```hs
poker cs =
match cs (Multiset CardM)
[[mc| card $s $n :
card #s #(n-1) :
card #s #(n-2) :
card #s #(n-3) :
card #s #(n-4) :
[] -> "Straight flush" |],
[mc| card _ $n :
card _ #n :
card _ #n :
card _ #n :
_ :
[] -> "Four of a kind" |],
[mc| card _ $m :
card _ #m :
card _ #m :
card _ $n :
card _ #n :
[] -> "Full house" |],
[mc| card $s _ :
card #s _ :
card #s _ :
card #s _ :
card #s _ :
[] -> "Flush" |],
[mc| card _ $n :
card _ #(n-1) :
card _ #(n-2) :
card _ #(n-3) :
card _ #(n-4) :
[] -> "Straight" |],
[mc| card _ $n :
card _ #n :
card _ #n :
_ :
_ :
[] -> "Three of a kind" |],
[mc| card _ $m :
card _ #m :
card _ $n :
card _ #n :
_ :
[] -> "Two pair" |],
[mc| card _ $n :
card _ #n :
_ :
_ :
_ :
[] -> "One pair" |],
[mc| _ -> "Nothing" |]]
```

## Benchmark

We benchmarked this library using the program that enumerates the first 50 (p, p+6) primes.
This Haskell library is much faster than the original Egison interpreter!

```
$ cabal new-bench prime-pairs
...
benchmarking (p, p+6) pairs/50/egison
time 5.066 s (4.610 s .. 5.608 s)
0.999 R² (0.995 R² .. 1.000 R²)
mean 4.932 s (4.807 s .. 5.017 s)
std dev 120.2 ms (34.72 ms .. 161.7 ms)
variance introduced by outliers: 19% (moderately inflated)

benchmarking (p, p+6) pairs/50/miniEgison
time 2.415 ms (2.264 ms .. 2.527 ms)
0.984 R² (0.975 R² .. 0.991 R²)
mean 2.196 ms (2.106 ms .. 2.266 ms)
std dev 252.3 μs (219.0 μs .. 296.6 μs)
variance introduced by outliers: 73% (severely inflated)
...
```

## Sponsors

Egison is sponsored by [Rakuten, Inc.](http://global.rakuten.com/corp/) and [Rakuten Institute of Technology](http://rit.rakuten.co.jp/).