https://github.com/sam0x17/interned
A rust crate that provides generic, thread-local internment of arbitrary types as well as memoization
- Host: GitHub
- URL: https://github.com/sam0x17/interned
- Owner: sam0x17
- License: mit
- Created: 2023-07-04T04:56:55.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2023-08-28T21:00:14.000Z (over 2 years ago)
- Last Synced: 2025-02-13T08:14:53.561Z (11 months ago)
- Language: Rust
- Size: 136 KB
- Stars: 1
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.docify.md
- License: LICENSE
README
Interned provides highly optimized, thread-local, generic
[interning](https://en.wikipedia.org/wiki/String_interning) via `Interned` and a
[memoization](https://en.wikipedia.org/wiki/Memoization) layer built on top of this interning
layer, provided by `Memoized`, which can cache the result of an arbitrary input `I: Hash`
and _intern_ this result in the underlying interning layer.
Blanket implementations supporting `T` are provided for all primitives, slices of `Sized` `T`
(including `&[u8]`), as well as `str` slices (`&str`). Support for additional arbitrary types
can be added by implementing `DataType`, `Staticize`, and `Hash`. `str` slices have a custom
implementation since they are the only built-in unsized type with slice support.
All values are heap-allocated `'static`s and benefit from `TypeId`-specific locality of
reference in the heap. Any two `Interned` instances that have the same value of `T` will be
guaranteed to point to the same memory address in the heap. Among other things, this allows for
`O(1)` (in the size of the data) equality comparisons since the heap addresses are compared
instead of having to compare the underlying data bit-by-bit. This makes interned types
especially suited for parsing and similar low-entropy data tasks.
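The address-equality idea can be illustrated with a minimal, standard-library-only sketch. This is not the crate's implementation or API: `POOL`, `intern`, and `same_interned` are hypothetical names chosen for illustration, and the sketch handles only strings rather than arbitrary `T`.

```rust
use std::cell::RefCell;
use std::collections::HashSet;

thread_local! {
    // Hypothetical per-thread pool. Entries are leaked, so they stay valid
    // for the rest of the program.
    static POOL: RefCell<HashSet<&'static str>> = RefCell::new(HashSet::new());
}

/// Returns a `'static` string slice that is shared by every equal string
/// interned on the current thread.
fn intern(s: &str) -> &'static str {
    POOL.with(|pool| {
        let mut pool = pool.borrow_mut();
        if let Some(&existing) = pool.get(s) {
            // Already interned: hand back the existing allocation.
            return existing;
        }
        // First occurrence: copy to the heap and leak it to obtain `'static`.
        let leaked: &'static str = Box::leak(s.to_owned().into_boxed_str());
        pool.insert(leaked);
        leaked
    })
}

/// Compares two interned strings by address and length only, without
/// reading their contents. Only meaningful for values returned by `intern`.
fn same_interned(a: &'static str, b: &'static str) -> bool {
    std::ptr::eq(a, b)
}
```

In this sketch, `intern("hello")` and `intern(&String::from("hello"))` return the same pointer, so `same_interned` can decide equality without looking at the bytes.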
A caveat of the `'static` lifetime and immutability of the underlying heap data is that unique
values of `Interned` and `Memoized` _leak_ in the sense that they can never be
de-allocated. This allows us to implement `Copy` on all interned types, because we can rely on
the heap pointer to continue existing for the life of the program once it has been created for
a particular value. For this reason, you should _not_ use this crate for long-running programs
that will encounter an unbounded number of unique values, such as those created by an unending
stream of user input.
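To see why leaking makes `Copy` possible, consider the following standard-library-only sketch. `Handle` is a hypothetical type used for illustration, not the crate's `Interned`; it assumes only that a leaked `Box` yields a `&'static T` that never dangles.

```rust
/// Hypothetical handle around a leaked heap value.
struct Handle<T: 'static> {
    value: &'static T,
}

// Implemented by hand so that `Handle<T>` is `Copy` even when `T` is not:
// only the `'static` reference is copied, never the data behind it.
impl<T: 'static> Clone for Handle<T> {
    fn clone(&self) -> Self {
        *self
    }
}

impl<T: 'static> Copy for Handle<T> {}

impl<T: 'static> Handle<T> {
    /// Moves `value` to the heap and leaks it. The allocation is never freed.
    fn leak(value: T) -> Self {
        Handle {
            value: Box::leak(Box::new(value)),
        }
    }

    /// Returns the underlying `'static` reference.
    fn get(self) -> &'static T {
        self.value
    }
}
```

Here a `Handle<String>` can be copied freely even though `String` itself is not `Copy`. The cost is the one described above: each call to `Handle::leak` allocates memory that is never reclaimed.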
Because the internal size of an `Interned` _on the stack_ is the size of a `usize` (pointer)
plus a `u64` (cached hash code), it would be silly to use `Interned` with integer types
directly. It does make sense to do so, however, for the purposes of memoizing an expensive
computation via `Memoized`.
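As a rough illustration of memoizing an integer computation, here is a standard-library-only sketch. It does not use or reproduce the `Memoized` API: `CACHE`, `CALLS`, `expensive_square`, `memoized_square`, and `calls` are hypothetical names, and the cache is keyed directly by the input rather than by its hash.

```rust
use std::cell::{Cell, RefCell};
use std::collections::HashMap;

thread_local! {
    // Hypothetical per-thread cache from input to a leaked, `'static` result.
    static CACHE: RefCell<HashMap<u64, &'static u64>> = RefCell::new(HashMap::new());
    // Counts how often the underlying computation actually runs.
    static CALLS: Cell<u32> = Cell::new(0);
}

/// Stand-in for an expensive computation.
fn expensive_square(n: u64) -> u64 {
    CALLS.with(|c| c.set(c.get() + 1));
    n * n
}

/// Runs `expensive_square` at most once per distinct input on this thread
/// and returns a reference to the cached result.
fn memoized_square(n: u64) -> &'static u64 {
    CACHE.with(|cache| {
        let mut map = cache.borrow_mut();
        let entry = map.entry(n).or_insert_with(|| {
            // Cache miss: compute, move to the heap, and leak the result.
            let leaked: &'static u64 = Box::leak(Box::new(expensive_square(n)));
            leaked
        });
        *entry
    })
}

/// Number of times `expensive_square` has run on this thread.
fn calls() -> u32 {
    CALLS.with(|c| c.get())
}
```

Calling `memoized_square` twice with the same argument runs the computation once and returns the same address both times, which is the property that makes wrapping even a small integer result worthwhile.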
An interned string type, `InStr`, is also provided as a convenient wrapper around
`Interned<&'static str>`. It has a number of extra impls and should be your go-to type if you
want to work with interned strings.
### Interned Example
### Memoized Examples
The following demonstrates how "scopes" work with `Memoized`: