https://github.com/artichoke/intaglio
🗃 UTF-8 string, byte string, and C string interner
- Host: GitHub
- URL: https://github.com/artichoke/intaglio
- Owner: artichoke
- License: mit
- Created: 2020-06-13T19:58:01.000Z (about 5 years ago)
- Default Branch: trunk
- Last Pushed: 2025-03-25T00:11:06.000Z (3 months ago)
- Last Synced: 2025-03-30T19:08:50.662Z (3 months ago)
- Topics: artichoke, bytes, interner, rust, rust-crate, string-interning, symbol, symbol-table, utf-8
- Language: Rust
- Homepage: https://crates.io/crates/intaglio
- Size: 2.29 MB
- Stars: 27
- Watchers: 4
- Forks: 1
- Open Issues: 2
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
README
# intaglio
UTF-8 string and byte string interner and symbol table. Used to implement
storage for the [Ruby `Symbol`][symbol] table and the constant name table in
[Artichoke Ruby][artichoke].

> Symbol objects represent names and some strings inside the Ruby interpreter.
> They are generated using the `:name` and `:"string"` literals syntax, and by
> the various `to_sym` methods. The same `Symbol` object will be created for a
> given name or string for the duration of a program's execution, regardless of
> the context or meaning of that name.

Intaglio is a UTF-8 and byte string interner, which means it stores a single
copy of an immutable `&str` or `&[u8]` that can be referred to by a stable `u32`
token.

Interned strings and byte strings are cheap to compare and copy because they are
represented as a `u32` integer.

_Intaglio_ is an alternate name for an _engraved gem_, a gemstone that has been
carved with an image. The Intaglio crate is used to implement an immutable
Symbol store in Artichoke Ruby.

## Usage
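Before the crate-specific steps, the interning idea from the introduction (one stored copy per distinct string, addressed by a stable `u32`) can be sketched with the standard library alone. `ToyInterner` and its methods are hypothetical names used for illustration; this is not how Intaglio is implemented, only a minimal model of the behavior.

```rust
use std::collections::HashMap;
use std::rc::Rc;

/// Hypothetical toy interner (not Intaglio's implementation). Each distinct
/// string is stored once and shared between the two indexes via `Rc<str>`.
#[derive(Default)]
pub struct ToyInterner {
    /// String contents -> previously assigned id.
    ids: HashMap<Rc<str>, u32>,
    /// Id -> string contents; the id is the index into this vector.
    strings: Vec<Rc<str>>,
}

impl ToyInterner {
    pub fn new() -> Self {
        Self::default()
    }

    /// Return the id for `s`, assigning the next free id the first time
    /// these contents are seen. Returns `None` once ids are exhausted.
    pub fn intern(&mut self, s: &str) -> Option<u32> {
        if let Some(&id) = self.ids.get(s) {
            return Some(id);
        }
        if self.strings.len() >= u32::MAX as usize {
            return None;
        }
        let id = self.strings.len() as u32;
        let shared: Rc<str> = Rc::from(s);
        self.ids.insert(Rc::clone(&shared), id);
        self.strings.push(shared);
        Some(id)
    }

    /// Look up the contents for a previously returned id.
    pub fn get(&self, id: u32) -> Option<&str> {
        self.strings.get(id as usize).map(|s| &**s)
    }
}
```

Interning `"abc"` twice through this sketch yields the same id, so comparing two interned strings reduces to comparing two `u32` values.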
Add this to your `Cargo.toml`:
```toml
[dependencies]
intaglio = "1.10.0"
```

Then intern UTF-8 strings like:

```rust
fn intern_and_get() -> Result<(), Box<dyn std::error::Error>> {
    let mut table = intaglio::SymbolTable::new();
    let name: &'static str = "abc";
    // Intern a borrowed string and read it back through its symbol.
    let sym = table.intern(name)?;
    let retrieved = table.get(sym);
    assert_eq!(Some(name), retrieved);
    // Interning the same contents as an owned `String` yields the same symbol.
    assert_eq!(sym, table.intern("abc".to_string())?);
    Ok(())
}
```

Or intern byte strings like:
```rust
fn intern_and_get() -> Result<(), Box<dyn std::error::Error>> {
    let mut table = intaglio::bytes::SymbolTable::new();
    let name: &'static [u8] = b"abc";
    let sym = table.intern(name)?;
    let retrieved = table.get(sym);
    assert_eq!(Some(name), retrieved);
    // An owned `Vec<u8>` with the same contents maps to the same symbol.
    assert_eq!(sym, table.intern(b"abc".to_vec())?);
    Ok(())
}
```

Or intern C strings like:
```rust
use std::ffi::{CStr, CString};

fn intern_and_get() -> Result<(), Box<dyn std::error::Error>> {
    let mut table = intaglio::cstr::SymbolTable::new();
    let name: &'static CStr = CStr::from_bytes_with_nul(b"abc\0")?;
    let sym = table.intern(name)?;
    let retrieved = table.get(sym);
    assert_eq!(Some(name), retrieved);
    // An owned `CString` with the same contents maps to the same symbol.
    assert_eq!(sym, table.intern(CString::new(*b"abc")?)?);
    Ok(())
}
```

Or intern platform strings like:
```rust
use std::ffi::{OsStr, OsString};

fn intern_and_get() -> Result<(), Box<dyn std::error::Error>> {
    let mut table = intaglio::osstr::SymbolTable::new();
    let name: &'static OsStr = OsStr::new("abc");
    let sym = table.intern(name)?;
    let retrieved = table.get(sym);
    assert_eq!(Some(name), retrieved);
    // An owned `OsString` with the same contents maps to the same symbol.
    assert_eq!(sym, table.intern(OsString::from("abc"))?);
    Ok(())
}
```

Or intern path strings like:
```rust
use std::path::{Path, PathBuf};

fn intern_and_get() -> Result<(), Box<dyn std::error::Error>> {
    let mut table = intaglio::path::SymbolTable::new();
    let name: &'static Path = Path::new("abc");
    let sym = table.intern(name)?;
    let retrieved = table.get(sym);
    assert_eq!(Some(name), retrieved);
    // An owned `PathBuf` with the same contents maps to the same symbol.
    assert_eq!(sym, table.intern(PathBuf::from("abc"))?);
    Ok(())
}
```

## Implementation
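As a rough illustration of the `Cow`-based approach this section describes, the sketch below shows how a single generic parameter can accept both `&'static str` and `String` without copying either. `into_stored` and `describe` are hypothetical helpers, not part of Intaglio's API, and the sketch leaves out the unsafe code the crate relies on.

```rust
use std::borrow::Cow;

/// Hypothetical helper (not part of Intaglio): accept either a borrowed
/// `&'static str` or an owned `String` through one generic parameter.
pub fn into_stored<T>(contents: T) -> Cow<'static, str>
where
    T: Into<Cow<'static, str>>,
{
    // `&'static str` becomes `Cow::Borrowed`; `String` becomes `Cow::Owned`.
    contents.into()
}

/// Report which `Cow` variant ended up being stored.
pub fn describe(stored: &Cow<'static, str>) -> &'static str {
    match stored {
        Cow::Borrowed(_) => "borrowed",
        Cow::Owned(_) => "owned",
    }
}
```

A string literal stays borrowed, while an owned `String` is moved into the `Cow`, so its existing heap buffer is reused rather than reallocated.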
Intaglio interns owned and borrowed strings with no additional copying by
leveraging `Cow` and a bit of unsafe code. CI runs `drop` tests under Miri and
LeakSanitizer.

## Crate features
All features are enabled by default.
- **bytes** - Enables an additional symbol table implementation for interning
byte strings (`Vec<u8>` and `&'static [u8]`).
- **cstr** - Enables an additional symbol table implementation for interning C
strings (`CString` and `&'static CStr`).
- **osstr** - Enables an additional symbol table implementation for interning
platform strings (`OsString` and `&'static OsStr`).
- **path** - Enables an additional symbol table implementation for interning
path strings (`PathBuf` and `&'static Path`).

### Minimum Supported Rust Version

This crate requires at least Rust 1.76.0. This version can be bumped in minor
releases.

## License
`intaglio` is licensed under the [MIT License](LICENSE) (c) Ryan Lopopolo.

[symbol]: https://ruby-doc.org/core-3.1.2/Symbol.html
[artichoke]: https://github.com/artichoke/artichoke