An open API service indexing awesome lists of open source software.

https://github.com/sam0x17/interned

A rust crate that provides generic, thread-local internment of arbitrary types as well as memoization
https://github.com/sam0x17/interned

Last synced: 3 months ago
JSON representation

A rust crate that provides generic, thread-local internment of arbitrary types as well as memoization

Awesome Lists containing this project

README

          

[![Crates.io](https://img.shields.io/crates/v/interned)](https://crates.io/crates/interned)
[![docs.rs](https://img.shields.io/docsrs/interned?label=docs)](https://docs.rs/interned/latest/interned/)
[![Build Status](https://img.shields.io/github/actions/workflow/status/sam0x17/interned/ci.yaml)](https://github.com/sam0x17/interned/actions/workflows/ci.yaml?query=branch%3Amain)
[![MIT License](https://img.shields.io/github/license/sam0x17/interned)](https://github.com/sam0x17/interned/blob/main/LICENSE)

Interned provides highly optimized, thread-local, generic
[interning](https://en.wikipedia.org/wiki/String_interning) via `Interned` and a
[memoization](https://en.wikipedia.org/wiki/Memoization) layer built on top of this interning
layer, provided by `Memoized`, which can cache the result of an arbitrary input `I: Hash`
and _intern_ this result in the underlying interning layer.

Blanket implementations supporting `T` are provided for all primitives, slices of `Sized` `T`
(including `&[u8]`), as well as `str` slices (`&str`). Support for additional arbitrary types
can be added by implementing `DataType`, `Staticize`, and `Hash`. `str` slices have a custom
implementation since they are the only built-in unsized type with slice support.

All values are heap-allocated `'static`s and benefit from `TypeId`-specific locality of
reference in the heap. Any two `Interned` instances that have the same value of `T` will be
guaranteed to point to the same memory address in the heap. Among other things, this allows for
`O(1)` (in the size of the data) equality comparisons since the heap addresses are compared
instead of having to compare the underlying data bit-by-bit. This makes interned types
especially suited for parsing and similar low-entropy data tasks.

A caveat of the `'static` lifetime and immutability of the underlying heap data is that unique
values of `Interned` and `Memoized` _leak_ in the sense that they can never be
de-allocated. This allows us to implement `Copy` on all interned types, because we can rely on
the heap pointer to continue existing for the life of the program once it has been created for
a particular value. For this reason, you should _not_ use this crate for long-running programs
that will encounter an unbounded number of unique values, such as those created by an unending
stream of user input.

Because the internal size of an `Interned` _on the stack_ is the size of a `usize` (pointer)
plus a `u64` (cached hash code), it would be silly to use `Interned` with integer types
directly, however it makes sense to do so for the purposes of memoizing an expensive
computation via `Memoized`.

An interned string type, `InStr`, is also provided as a convenient wrapper around
`Interned<&'static str>`. It has a number of extra impls and should be your go-to type if you
want to work with interned strings.

### Interned Example

### Memoized Examples

The following demonstrates how "scopes" work with `Memoized`: