https://github.com/mmzk1526/cea
Generic marshalling between C-pointer and Haskell data types
https://github.com/mmzk1526/cea
c haskell
Last synced: 2 months ago
JSON representation
Generic marshalling between C-pointer and Haskell data types
- Host: GitHub
- URL: https://github.com/mmzk1526/cea
- Owner: MMZK1526
- Created: 2023-05-29T21:52:15.000Z (almost 2 years ago)
- Default Branch: master
- Last Pushed: 2023-06-28T16:51:06.000Z (almost 2 years ago)
- Last Synced: 2024-12-29T08:43:21.744Z (4 months ago)
- Topics: c, haskell
- Language: Haskell
- Homepage:
- Size: 152 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
Awesome Lists containing this project
README
# Cea
One of the core features of Haskell is immutability. It has greatly simplified the reasoning of Haskell programs, and together with the type system it can capture many bugs before they happen. However, immutability introduces overhead, and while the overhead is sometimes neglibible thanks to the various means of optimisations, there are situations where we want to avoid it.
In Haskell, a common way of achieving mutability is to use `IORef`s. However, `IORef`s are notoriously slow. In fact, they are often slower than pure `StateT`s.
A second way relies on Haskell's Foreign Function Interface (FFI), where we can have access to C pointers. Many common Haskell libraries use these pointers under the hood. However, working with raw, untyped pointers are error-prone, and it is easy to introduce memory leaks and segfaults. While Haskell has a `Storable` type class that offers basic types of pointers, it is only implemented for the basic types, and one have to write much boilerplate if they want to use custom data types.
The library `cea` is an attempt to provide a safe, high-level interface to C pointers. It can derive `Pointable` instances for most custom data types, and provides a variety of accessor functions that allows one to modify a certain field of a data structure without reading the whole data structure. All the type guarantees are checked at compile time, introducing minimum overhead over raw pointers.
Currently, it is still a prototype that can only handle non-recursive product types and arrays, but in the future I will extend it to support sum types to make it more useful.
## Example
TODO## Basic Usage
To use `Cea`, you will need to import the following modules and enable the following extensions:
```Haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE DerivingVia #-}
{-# LANGUAGE TypeApplications #-}
{-# LANGUAGE UndecidableInstances #-}import Cea.Pointer
import Cea.Pointer.Accessor
```### Pointable
Use `make` to create a pointer, use `load` and `store` to read and write to the pointer. Use `delete` to free the pointer.
```Haskell
main :: IO ()
main = do
ptr <- make (0 :: Int)
val <- load ptr
print val -- 0
store ptr 1
val <- load ptr
print val -- 1
delete ptr -- free the pointer
```After calling `delete`, the pointer is no longer valid and should not be used.
`make` can be used on any type that is an instance of `Pointable`, which is a partial generalisation of `Storable`. It has been implemented on many common data types, for example, tuples:
```Haskell
main :: IO ()
main = do
ptr <- make (0 :: Int, 0 :: Int)
val <- load ptr
print val -- (0, 0)
store ptr (1, 2)
val <- load ptr
print val -- (1, 2)
delete ptr
```To make custom data types an instance of `Pointable`, we need to derive `Generic` and `Cea` before we can derive `Pointable`:
```Haskell
data Foo = Foo Int Int
deriving stock Generic
deriving anyclass (Cea, Pointable)data Bar = Bar Foo Foo
deriving stock Generic
deriving anyclass (Cea, Pointable)
```Here `Cea` is a dummy type class that enables the generic derivation of `Pointable`. In most of the cases, we just need to derive `Cea` and `Pointable` for our custom data types, and the compiler will do the rest for us. In the rare occasion where customised `Pointable` instances are needed, we **must not** derive `Cea` as it will automatically derive the generic implementation of `Accessible` for us, which may lead to errors when working with custom `Pointable`. More details can be found in [this section](#accessor).
```Haskell
main :: IO ()
main = do
ptr <- make (Bar (Foo 0 0) (Foo 0 0))
val <- load ptr
print val -- Bar (Foo 0 0) (Foo 0 0)
store ptr (Bar (Foo 1 2) (Foo 3 4))
val <- load ptr
print val -- Bar (Foo 1 2) (Foo 3 4)
delete ptr
```In the example above, the fields of `Bar`, namely the two `Foo`s, are stored as
pointers, and the fields of `Foo`, namely the two `Int`s, are stored as values.This is because custom types are considered as "indirect" types, and their
fields are stored as pointers.Finally, the `delete` function at the end of the `do`-block frees the structure
recursively, so that we do not (and shouldn't) access the pointer of each fields
and free them manually.Built-in tuples, on the other hand, are treated as direct types, therefore `((1, 2), (3, 4))` is stored in one contiguous block of memory:
```Haskell
main :: IO ()
main = do
ptr <- make ((0, 0), (0, 0))
val <- load ptr
print val -- ((0, 0), (0, 0))
store ptr ((1, 2), (3, 4))
val <- load ptr
print val -- ((1, 2), (3, 4))
let ptr' = castPtr ptr :: Ptr (Int, Int, Int, Int)
val' <- load ptr'
print val' -- (1, 2, 3, 4)
delete ptr
```Note that we does not support deriving `Pointable` for sum types yet.
### Accessor
Sometimes, we may only want to modify a certain field of a data structure. For example, we may want to increment the first field of a tuple, or the first field of a custom data type:
```Haskell
main :: IO ()
main = do
ptr <- make ((0, 0, 0, 0) :: (Int, Int, Int, Int))
(a, b, c, d) <- load ptr
store ptr (a + 1, b, c, d)
delete ptr
```This is not only tedious to write, but also introduces extra overhead since we need to read the whole tuple, modify it, and write it back.
Accessor functions provide ways to modify a certain field of a data structure without reading the whole data structure. For example, we can use `access` to acquire a pointer that points to the
first field of a tuple:```Haskell
main :: IO ()
main = do
ptr <- make ((0, 0, 0, 0) :: (Int, Int, Int, Int))
ptr0 <- access @0 ptr
store ptr0 1
val <- load ptr
print val -- (1, 0, 0, 0)
delete ptr
```The type application `@n` specifies the `n`-th field. Note that the index starts from 0.
The `access` function comes from the `Accessible` type class, which is automatically provided for `Cea` and `Pointable` instances. Therefore, if the `Pointable` instance is not derived but an explicit customised implementation, we **must not** derive `Cea` and **must** implement `Accessible` manually. In the vast majority of the cases, of course, the derived instances are sufficient and we don't need to worry about implementing `Accessible`.
```Haskell
`access` is often used directly with a `load` or a `store`. In this case, we can use the shorthand functions `loadAt` and `storeAt`, which reads and writes to the pointer returned by `access`:
```Haskell
main :: IO ()
main = do
ptr <- make ((0, 0, 0, 0) :: (Int, Int, Int, Int))
storeAt @0 ptr 1
val <- load ptr
print val -- (1, 0, 0, 0)
val0 <- loadAt @0 ptr
print val0 -- 1
delete ptr
```If the fields has nested fields, we can use `accesses` to acquire a pointer that points to the nested field:
```Haskell
main :: IO ()
main = do
ptr <- make (((0, 1), (2, 3)) :: ((Int, Int), (Int, Int)))
ptr0 <- accesses @[0, 0] ptr
store ptr0 114
val <- load ptr
print val -- ((114, 1), (2, 3))
delete ptr
```The type application `@[n1, n2, ...]` specifies the path to the nested field, namely the n1-th field's n2-th field's ... n-th field. Again the indices start from 0. In particular `accesses @[] ptr` is the same as the orginal poinrter `ptr`.
Similarly, we have the shorthand functions `loadsAt` and `storesAt`.
If we are using a custom data type with selector names, we can also use the selector names themselves to access the fields:
```Haskell
data Foo = Foo { foo1 :: Int, foo2 :: Int }
deriving stock (Generic, Show)
deriving anyclass (Cea, Pointable)main :: IO ()
main = do
ptr <- make (Foo 0 0)
storeAt @"foo1" ptr 1
val <- load ptr
print val -- Foo {foo1 = 1, foo2 = 0}
delete ptr
```## Array
There are two types of `Pointable` arrays, one has compile-time known size, and the other has runtime known size. To use arrays, import `Cea.Array`:
```Haskell
-- Previous imports...
import Cea.Array
```### Creation, Read & Write
To create an array with compile-time known size, we can use `makeArr`:
```Haskell
main :: IO ()
main = do
arr <- makeArr @4 (0 :: Int)
e0 <- readArr' @0 arr
print e0 -- 0
writeArr' @0 arr 1
e0 <- readArr' @0 arr
print e0 -- 1
len <- arrLen arr
print len -- 4
deleteArr arr
````makeArr` takes a type application that specifies the size of the array, and a value that specifies the initial value of the array (the same initial value is used for all elements, similar to `MArray`'s `newArray`).
To read and write to an array, we can use `readArr'` and `writeArr'`, which takes a type application that specifies the index of the element to read/write. If we feed an index that is out of bound, the program will not compile.
The size of the array can be acquired by `arrLen`. In this case it is already known at compile time, but this function works with any array in general.
Finally, we use `deleteArr` to free the array. Note that the array is assumed to take ownership of all its elements, so we do not need to free the elements manually, and we **should not** free the elements manually.
To create an array with runtime known size, we can use `makeArrFromList`:
```Haskell
main :: IO ()
main = do
arr <- makeArrFromList @Int [0, 1, 2, 3]
e0 <- readArr 0 arr
print e0 -- 0
writeArr 1 arr 1
e0 <- readArr 1 arr
print e0 -- 1
len <- arrLen arr
print len -- 4
deleteArr arr
```Here we create an array directly from a list (or any `Foldable`), and the size of the array is determined by the length of the list. In this case we know that the length is 4, but in general the length cannot be determined at compile time.
To read/write the array, we can use `readArr` and `writeArr`, which takes an `Int` (as opposed to the type application in `readArr'` and `writeArr'`). If we feed an index that is out of bound, the program will throw an error. Note that these functions also work with arrays with compile-time known size; in fact, all functions that work with arrays with runtime known size also work with arrays with compile-time known size, but apparently not *vice versa*.
### Take Snapshots
We can take a "snapshot" of the array an turn it into an immutable `Array` or list using `loadArr` or `loadArrToList`, similar to the `freeze` function for `MArray`s:
```Haskell
-- Assuming the relevant imports are already there
main :: IO ()
main = do
arr <- makeArrFromList @Int [0, 1, 2, 3]
arr' <- loadArr arr
print $ toList arr' -- [0, 1, 2, 3]
list <- loadArrToList arr
print list -- [0, 1, 2, 3]
deleteArr arr
```### Access Element Pointers
So far, when we talk about writing to an array, we are overwriting the entire element, but sometimes our elements could represent a more complicated data type, and we only want to modify a sub-field of them. In this case, we can use the `accessArr` function to acquire a pointer to the element, and then use `load` and `store` to read/write to the element (as discussed in [the previous section](#pointable)).
Assuming we have the following data type `Student`:
```Haskell
data Student = Student { name :: String
, age :: Int
, isDeanList :: Bool }
deriving stock (Generic, Show)
deriving anyclass (Cea, Pointable)
```Here's an example of creating an array of two students, then adding them to the Dean's List:
```Haskell
main :: IO ()
main = do
let bob = Student { name = "Bob", age = 20, isDeanList = False }
tom = Student { name = "Tom", age = 21, isDeanList = False }
arr <- makeArrFromList [bob, tom]
bobPtr <- accessArr 0 arr
tomPtr <- accessArr 1 arr
storeAt @"isDeanList" bobPtr True
storeAt @"isDeanList" tomPtr True
list <- loadArrToList arr
print $ all isDeanList list -- True
deleteArr arr
```For arrays with compile-time known size, we can also use `accessArr'` instead, which takes a type application that specifies the index of the element to access, similar to `readArr'` and `writeArr'`.