Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/spl/dlist
Difference lists in Haskell
https://github.com/spl/dlist
data-structures difference-lists haskell
Last synced: 6 days ago
JSON representation
Difference lists in Haskell
- Host: GitHub
- URL: https://github.com/spl/dlist
- Owner: spl
- License: bsd-3-clause
- Created: 2013-11-12T10:35:22.000Z (about 11 years ago)
- Default Branch: main
- Last Pushed: 2024-07-04T12:33:34.000Z (6 months ago)
- Last Synced: 2024-12-24T10:44:34.225Z (18 days ago)
- Topics: data-structures, difference-lists, haskell
- Language: Haskell
- Homepage: https://hackage.haskell.org/package/dlist
- Size: 229 KB
- Stars: 65
- Watchers: 6
- Forks: 15
- Open Issues: 7
-
Metadata Files:
- Readme: readme.md
- Changelog: changelog.md
- License: license.md
Awesome Lists containing this project
README
# Difference Lists
[![test-badge][]][test]
[![hackage-badge][]][hackage-dlist]
[![packdeps-badge][]][packdeps]_**List-like types supporting O(1) `append` and `snoc` operations.**_
## Installation
[`dlist`][hackage-dlist] is a Haskell package available from [Hackage][hackage].
It can be installed with [`cabal`][cabal] or [`stack`][stack].See the [change log][changelog] for the changes in each version.
## Usage
Here is an example of “flattening” a `Tree` into a list of the elements in its
`Leaf` constructors:```haskell
import qualified Data.DList as DListdata Tree a = Leaf a | Branch (Tree a) (Tree a)
flattenSlow :: Tree a -> [a]
flattenSlow = go
where
go (Leaf x) = [x]
go (Branch left right) = go left ++ go rightflattenFast :: Tree a -> [a]
flattenFast = DList.toList . go
where
go (Leaf x) = DList.singleton x
go (Branch left right) = go left `DList.append` go right
```(The above code can be found in the [benchmark][].)
`flattenSlow` is likely to be slower than `flattenFast`:
1. `flattenSlow` uses `++` to concatenate lists, each of which is recursively
constructed from the `left` and `right` `Tree` values in the `Branch`
constructor.2. `flattenFast` does not use `++` but constructs a composition of functions,
each of which is a “cons” introduced by `DList.singleton` (`(x :)`). The
function `DList.toList` applies the composed function to `[]`, constructing
a list in the end.To see the difference between `flattenSlow` and `flattenFast`, consider some
rough evaluations of the functions applied to a `Tree`:```haskell
flattenSlow (Branch (Branch (Leaf 'a') (Leaf 'b')) (Leaf 'c'))
= go (Branch (Branch (Leaf 'a') (Leaf 'b')) (Leaf 'c'))
= go (Branch (Leaf 'a') (Leaf 'b')) ++ go (Leaf 'c')
= (go (Leaf 'a') ++ go (Leaf 'b')) ++ "c"
= ("a" ++ "b") ++ "c"
= ('a' : [] ++ "b") ++ "c"
= ('a' : "b") ++ "c"
= 'a' : "b" ++ "c"
= 'a' : 'b' : [] ++ "c"
= 'a' : 'b' : "c"
``````haskell
flattenFast (Branch (Branch (Leaf 'a') (Leaf 'b')) (Leaf 'c'))
= toList $ go (Branch (Branch (Leaf 'a') (Leaf 'b')) (Leaf 'c'))
= toList $ go (Branch (Leaf 'a') (Leaf 'b')) `append` go (Leaf 'c')
= unsafeApplyDList (go (Branch (Leaf 'a') (Leaf 'b'))) . unsafeApplyDList (go (Leaf 'c')) $ []
= unsafeApplyDList (go (Branch (Leaf 'a') (Leaf 'b'))) (unsafeApplyDList (go (Leaf 'c')) [])
= unsafeApplyDList (go (Branch (Leaf 'a') (Leaf 'b'))) (unsafeApplyDList (singleton 'c') [])
= unsafeApplyDList (go (Branch (Leaf 'a') (Leaf 'b'))) (unsafeApplyDList (UnsafeDList ((:) 'c')) [])
= unsafeApplyDList (go (Branch (Leaf 'a') (Leaf 'b'))) "c"
= unsafeApplyDList (UnsafeDList (unsafeApplyDList (go (Leaf 'a')) . unsafeApplyDList (go (Leaf 'b')))) "c"
= unsafeApplyDList (go (Leaf 'a')) (unsafeApplyDList (go (Leaf 'b')) "c")
= unsafeApplyDList (go (Leaf 'a')) (unsafeApplyDList (singleton 'b') "c")
= unsafeApplyDList (go (Leaf 'a')) (unsafeApplyDList (UnsafeDList ((:) 'b')) "c")
= unsafeApplyDList (go (Leaf 'a')) ('b' : "c")
= unsafeApplyDList (singleton 'a') ('b' : "c")
= unsafeApplyDList (UnsafeDList ((:) 'a')) ('b' : "c")
= 'a' : 'b' : "c"
```The left-nested `++` in `flattenSlow` results in intermediate list constructions
that are immediately discarded in the evaluation of the outermost `++`. On the
other hand, the evaluation of `flattenFast` involves no intermediate list
construction but rather function applications and `newtype` constructor wrapping
and unwrapping. This is where the efficiency comes from._**Warning!**_ Note that there is truth in the above, but there is also a lot of
hand-waving and intrinsic complexity. For example, there may be GHC rewrite
rules that apply to `++`, which will change the actual evaluation. And, of
course, strictness, laziness, and sharing all play a significant role. Also, not
every function in the `dlist` package is the most efficient for every situation._**Moral of the story:**_ If you are using `dlist` to speed up your code, check
to be sure that it actually does. Benchmark!## Design Notes
These are some notes on design and development choices made for the `dlist`
package.### Avoid `++`
The original intent of Hughes' representation of lists as first-class functions
was to provide an abstraction such that the list `append` operation found in
functional programming languages (and now called `++` in Haskell) would not
appear in left-nested positions to avoid duplicated structure as lists are
constructed. The lesson learned by many people using list over the years is that
the `append` operation can appear, sometimes surprisingly, in places they don't
expect it.One of our goals is for the `dlist` package to avoid surprising its users with
unexpected insertions of `++`. Towards this end, there should be a minimal set
of functions in `dlist` in which `++` can be directly or indirectly found. The
list of known uses of `++` includes:* `DList`: `fromList`, `fromString`, `read`
* `DNonEmpty`: `fromList`, `fromNonEmpty`, `fromString`, `read`If any future requested functions involve `++` (e.g. via `fromList`), the burden
of inclusion is higher than it would be otherwise.### Abstraction
The `DList` representation and its supporting functions (e.g. `append`, `snoc`,
etc.) rely on an invariant to preserve its safe use. That is, without this
invariant, a user may encounter unexpected outcomes.(We use safety in the sense that the semantics are well-defined and expected,
not in the sense of side of referential transparency. The invariant does not
directly lead to side effects in the `dlist` package, but a program that uses an
unsafely generated `DList` may do something surprising.)The invariant is that, for any `xs :: DList a`:
```haskell
fromList (toList xs) = xs
```To see how this invariant can be broken, consider this example:
```haskell
xs :: DList a
xs = UnsafeDList (const [])fromList (toList (xs `snoc` 1))
= fromList (toList (UnsafeDList (const []) `snoc` 1))
= fromList (toList (UnsafeDList (unsafeApplyDList (UnsafeDList (const [])) . (x :))))
= fromList (toList (UnsafeDList (const [] . (x :))))
= fromList (($ []) . unsafeApplyDList $ UnsafeDList (const [] . (x :)))
= fromList (const [] . (x :) $ [])
= fromList (const [] [x])
= fromList []
= UnsafeDList (++ [])
```The invariant can also be stated as:
```haskell
toList (fromList (toList xs)) = toList xs
```And we can restate the example as:
```haskell
toList (fromList (toList (xs `snoc` 1)))
= toList (UnsafeDList (++ []))
= []
```It would be rather unhelpful and surprising to find ``(xs `snoc` 1)`` turned out
to be the empty list.To preserve the invariant on `DList`, we provide it as an abstract type in the
`Data.DList` module. The constructor, `UnsafeDList`, and record label,
`unsafeApplyDList`, are not exported because these can be used, as shown above,
to break the invariant.All of that said, there have been numerous requests to export the `DList`
constructor. We are not convinced that it is necessary, but we are convinced
that users should decide for themselves.To use the constructor and record label of `DList`, you import them as follows:
```haskell
import Data.DList.Unsafe (DList(UnsafeDList, unsafeApplyDList))
```If you are using Safe Haskell, you may need to add this at the top of your
module:```haskell
{-# LANGUAGE Trustworthy #-}
```Just be aware that the burden of proof for safety is on you.
## References
These are various references where you can learn more about difference lists.
### Research
* **A novel representation of lists and its application to the function
“reverse.”** John Hughes. Information Processing Letters. Volume 22, Issue 3.
1986-03. Pages 141-144. [PDF][hughes-pdf]This is the original published source for a representation of lists as
first-class functions.### Background
* [Wikipedia][wikipedia]
* [Haskell Wiki][wiki-haskell]
* [Stack Overflow][stack-overflow]### Blogs and Mailing Lists
* [Using Difference Lists][blog-auclair-1]. Douglas M. Auclair. 2008-08-13.
* [A Sort of Difference][blog-kmett]. Edward Kmett. 2008-09-18.
* [Reference for technique wanted][mail-okeefe]. Richard O'Keefe, et al.
2010-10-31.
* [24 Days of Hackage: dlist][blog-charles]. Oliver Charles. 2012-12-14.
* [Constructing a list in a Monad][blog-breitner]. Joachim Breitner. 2013-11-13.
* [Demystifying DList][blog-ellis] ([Reddit][blog-ellis-reddit]). Tom Ellis.
2014-01-24.
* [keepEquals with Difference Lists][blog-auclair-2]. Douglas M. Auclair.
2014-06-21.### Books
* [Chapter 13. Data Structures][book-real-world-haskell]. Real World Haskell.
2008-12-05.## License
[BSD 3-Clause “New” or “Revised” License][license] © Don Stewart, Sean Leather,
contributors[changelog]: https://github.com/spl/dlist/blob/main/changelog.md#change-log
[benchmark]: https://github.com/spl/dlist/blob/main/bench/Main.hs
[blog-auclair-1]: https://logicaltypes.blogspot.com/2008/08/using-difference-lists.html
[blog-auclair-2]: https://logicaltypes.blogspot.com/2014/06/keepequals-with-difference-lists.html
[blog-breitner]: https://www.joachim-breitner.de/blog/620-Constructing_a_list_in_a_Monad
[blog-charles]: https://ocharles.org.uk/blog/posts/2012-12-14-24-days-of-hackage-dlist.html
[blog-ellis-reddit]: https://www.reddit.com/r/haskell/comments/1w5duf/demystifying_dlist/
[blog-ellis]: http://h2.jaguarpaw.co.uk/posts/demystifying-dlist/
[blog-kmett]: https://web.archive.org/web/20080918101635/comonad.com/reader/2008/a-sort-of-difference/
[book-real-world-haskell]: http://book.realworldhaskell.org/read/data-structures.html
[cabal]: https://cabal.readthedocs.io/
[hackage-badge]: https://img.shields.io/hackage/v/dlist.svg?maxAge=3600
[hackage-dlist]: https://hackage.haskell.org/package/dlist
[hackage]: https://hackage.haskell.org/
[hughes-pdf]: https://www.cs.tufts.edu/~nr/cs257/archive/john-hughes/lists.pdf
[license]: https://github.com/spl/dlist/blob/main/license.md
[mail-okeefe]: https://www.mail-archive.com/[email protected]/msg83699.html
[packdeps-badge]: https://img.shields.io/hackage-deps/v/dlist.svg?maxAge=3600
[packdeps]: http://packdeps.haskellers.com/feed?needle=dlist
[stack-overflow]: https://stackoverflow.com/questions/3352418/what-is-a-dlist
[stack]: https://docs.haskellstack.org/
[test-badge]: https://github.com/spl/dlist/actions/workflows/haskell-ci.yml/badge.svg
[test]: https://github.com/spl/dlist/actions/workflows/haskell-ci.yml
[wiki-haskell]: https://wiki.haskell.org/Difference_list
[wikipedia]: https://en.wikipedia.org/wiki/Difference_list