{"id":19541421,"url":"https://github.com/ickk/sinter","last_synced_at":"2025-04-26T17:30:48.805Z","repository":{"id":246846342,"uuid":"824200879","full_name":"ickk/sinter","owner":"ickk","description":"An easy to use \u0026 fast global interning pool","archived":false,"fork":false,"pushed_at":"2024-07-11T20:29:45.000Z","size":37,"stargazers_count":3,"open_issues_count":4,"forks_count":1,"subscribers_count":2,"default_branch":"dev","last_synced_at":"2025-04-03T03:35:10.765Z","etag":null,"topics":["string-interning"],"latest_commit_sha":null,"homepage":"https://crates.io/crates/sinter","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ickk.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE-APACHE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-07-04T15:11:23.000Z","updated_at":"2025-01-21T09:10:13.000Z","dependencies_parsed_at":null,"dependency_job_id":"57146134-5db3-470c-9a39-adab89597479","html_url":"https://github.com/ickk/sinter","commit_stats":null,"previous_names":["ickk/sinter"],"tags_count":2,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ickk%2Fsinter","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ickk%2Fsinter/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ickk%2Fsinter/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ickk%2Fsinter/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ickk","download_url":"https://codeload.github.com/ickk/sinter/tar.gz/refs/heads
/dev","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":251025574,"owners_count":21524826,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["string-interning"],"created_at":"2024-11-11T03:10:26.939Z","updated_at":"2025-04-26T17:30:48.556Z","avatar_url":"https://github.com/ickk.png","language":"Rust","funding_links":[],"categories":[],"sub_categories":[],"readme":"`sinter`\n==========\n[crates.io](https://crates.io/crates/sinter) |\n[docs.rs](https://docs.rs/sinter) |\n[github](https://github.com/ickk/sinter)\n\nAn easy to use \u0026 fast global interning pool.\n\nInterned strings are stored contiguously in memory, which may help with memory\nlocality or fragmentation. 
Additional pages of memory for the interner are\nallocated as required, doubling in size with each successive page - amortising\nthe cost of the underlying allocations.\n\nCalling [`intern`] on a string that has already previously been interned is\nfast \u0026 lockless, though still potentially more expensive than holding onto an\n[`IStr`] you already have.\n\nIn the worst case a call to [`intern`] can be relatively expensive, since if\nthe string doesn't already exist then some synchronisation with other threads\nis required, and the operation may also require allocating a new memory page\nfor the pool.\n\n`IStr`\n------\n\nZero-cost conversion to `\u0026'static str` or `\u0026'static CStr`:\n```rust\n# use sinter::IStr;\n# use ::core::ffi::CStr;\nlet istr = IStr::new(\"hello, sinter!\");\nlet s: \u0026'static str = istr.as_str();\nlet cstr: \u0026'static CStr = istr.as_c_str();\n```\n\n[`IStr`] Derefs to `\u0026str`:\n```rust\n# use sinter::IStr;\n# use ::core::ffi::CStr;\nlet istr = IStr::new(\"hello, sinter!\");\nlet s: \u0026str = \u0026*istr;\n```\n\nAn [`IStr`] can be compared to another `IStr` extremely cheaply; under the hood\n[`Eq`] is implemented by a single pointer comparison:\n```rust\n# use sinter::intern;\n# use ::core::ffi::CStr;\nlet a = intern(\"aaa\");\nlet a2 = intern(\"aaa\");\nlet b = intern(\"bbb\");\n\nassert!(a == a2);\nassert!(a != b);\n```\n\nOr you can compare to a regular `\u0026str`:\n```rust\n# use sinter::IStr;\nassert!(IStr::new(\"sinter\") == \"sinter\");\n```\n\nFlexible to construct:\n```rust\n# use sinter::{intern, IStr};\n# use ::std::ffi::{CStr, CString};\nlet a = intern(\"a\");\nlet b = IStr::new(\"b\");\nlet c = IStr::from(\"c\");\nlet d = IStr::from(String::from(\"d\"));\nlet e: IStr = \"e\".into();\nlet f = IStr::try_from(CString::new(\"f\").unwrap()).unwrap();\nlet g = IStr::try_from(CString::new(\"g\").unwrap().as_c_str()).unwrap();\n# assert_eq!(\n#   [a, b, c, d, e, f, g],\n#   [\n#     intern(\"a\"),\n#     
intern(\"b\"),\n#     intern(\"c\"),\n#     intern(\"d\"),\n#     intern(\"e\"),\n#     intern(\"f\"),\n#     intern(\"g\"),\n#   ],\n# );\n```\n\nFind out if a given string has already been interned with [`get_interned`].\nThis will always be fast/lockless and returns the [`IStr`] if found:\n```rust\n# use sinter::{get_interned, intern};\nintern(\"exists\");\nassert!(get_interned(\"exists\").is_some());\nassert!(get_interned(\"doesn't exist\").is_none());\n```\n\nThe [`::core::ops::Deref`] implementation gives you all the\nuseful `\u0026str` methods \u0026 operations, such as subslicing:\n```rust\n# use sinter::IStr;\nlet hello_world = IStr::new(\"hello, world!\");\nlet world: \u0026str = \u0026hello_world[7..];\nassert_eq!(world, \"world!\");\n```\n\nThe [`::core::borrow::Borrow\u003cstr\u003e`] implementation lets you create `HashMap`s\nwith `IStr` keys, and then ergonomically look up values with `\u0026str`:\n```rust\n# use sinter::IStr;\n# use ::std::collections::HashMap;\nlet mut map: HashMap\u003cIStr, f32\u003e = HashMap::new();\nmap.insert(IStr::new(\"e\"), 2.718);\nlet val = map.get(\"e\");\n# assert_eq!(val, Some(\u00262.718));\n```\n\nArchitecture\n------------\n\nInternally, an `Interner` data structure manages the pool of interned strings.\n\nWhen adding a new string to the pool, the Interner acquires a lock on one half\nof the pool. This could be a somewhat slow operation if there is a lot of\ncontention with other threads (although this should normally be very unlikely).\n\nOn the other hand, the Interner uses lockless concurrency primitives to enable\nreaders (callers to `intern` that do not require allocating a new string, and\ninstead can fetch an existing `IStr` instance) to avoid locking entirely,\nallowing superfluous calls to `intern` to still be very fast.\n\nThe concurrency scheme is as follows:\n\n1. We maintain a linked list of memory pages where the strings themselves are\n   stored. 
New strings are appended strictly to the tail of the last memory\n   page, and new pages are allocated as needed. This means all existing `IStr`s\n   have stable static memory locations and data.\n\n2. We maintain a pair of redundant hash tables mapping a string's hash to the\n   `IStr` (the pointer to the string data in the memory page), facilitating\n   fast lookup for already interned strings. The tables are atomically swapped\n   by the writer, allowing readers to safely get new updates without locking.\n\n   When a thread wants to inspect the \"readable table\" it increments its\n   atomic counter. The counter is incremented again when the read is finished.\n   This allows a writer to reliably wait on lingering reads after the atomic\n   table swap.\n\n   If a thread's counter is even, the writer knows that thread is not reading\n   at all. If a thread's counter is odd, the writer waits for that counter to\n   increment at least once. Waiting for a single increment is sufficient and\n   should be fairly quick (as reads are quick). After any increment the writer\n   can be sure the reader will fetch the new table's pointer before starting a\n   new read.\n\n3. When a thread terminates it calls the destructor for the `LocalKey` which\n   contains a pointer to our epoch atomic-counter. In this destructor we set\n   the value of the epoch to a special value to mark this thread as dead. Later\n   when some other code is holding the write_lock on the interner, it checks\n   the list of epochs to see if any threads are dead, and then frees the memory\n   holding the atomic and removes that epoch from the Interner's list.\n\n   This solves the small memory leak that might occur if the user keeps\n   spawning lots of temporary threads. 
It doesn't require that the LocalKey\n   destructor waits until it can get the write_lock, and it avoids accumulating\n   dangling pointers in the Interner data structure.\n\nLicense\n-------\n\nThis crate is licensed under any of the\n[Apache license, Version 2.0](./LICENSE-APACHE),\nor the\n[MIT license](./LICENSE-MIT),\nor the\n[Zlib license](./LICENSE-ZLIB)\nat your option.\n\nUnless explicitly stated otherwise, any contributions you intentionally submit\nfor inclusion in this work shall be licensed accordingly.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fickk%2Fsinter","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fickk%2Fsinter","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fickk%2Fsinter/lists"}