{"id":22115744,"url":"https://github.com/yeslogic/unicode-case-mapping","last_synced_at":"2025-07-14T15:33:26.365Z","repository":{"id":36580890,"uuid":"228728820","full_name":"yeslogic/unicode-case-mapping","owner":"yeslogic","description":"Fast mapping of char to lowercase, uppercase, or titlecase in Rust.","archived":false,"fork":false,"pushed_at":"2024-10-08T01:11:29.000Z","size":143,"stargazers_count":7,"open_issues_count":0,"forks_count":5,"subscribers_count":14,"default_branch":"master","last_synced_at":"2025-04-02T04:54:21.451Z","etag":null,"topics":["lowercase","rust","titlecase","ucd","unicode","uppercase"],"latest_commit_sha":null,"homepage":"","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/yeslogic.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-12-18T01:00:17.000Z","updated_at":"2024-12-30T22:24:43.000Z","dependencies_parsed_at":"2022-07-22T11:18:06.077Z","dependency_job_id":null,"html_url":"https://github.com/yeslogic/unicode-case-mapping","commit_stats":null,"previous_names":[],"tags_count":5,"template":false,"template_full_name":null,"purl":"pkg:github/yeslogic/unicode-case-mapping","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yeslogic%2Funicode-case-mapping","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yeslogic%2Funicode-case-mapping/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yeslogic%2Funicode-case-mapping/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yeslogic%2Funicode-case-mapping/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hos
ts/GitHub/owners/yeslogic","download_url":"https://codeload.github.com/yeslogic/unicode-case-mapping/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yeslogic%2Funicode-case-mapping/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":265312075,"owners_count":23745177,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["lowercase","rust","titlecase","ucd","unicode","uppercase"],"created_at":"2024-12-01T12:17:43.052Z","updated_at":"2025-07-14T15:33:26.100Z","avatar_url":"https://github.com/yeslogic.png","language":"Rust","funding_links":[],"categories":[],"sub_categories":[],"readme":"unicode-case-mapping\n====================\n\n\u003cdiv align=\"center\"\u003e\n  \u003ca href=\"https://github.com/yeslogic/unicode-case-mapping/actions/workflows/ci.yml\"\u003e\n    \u003cimg src=\"https://github.com/yeslogic/unicode-case-mapping/actions/workflows/ci.yml/badge.svg\" alt=\"Build Status\"\u003e\u003c/a\u003e\n  \u003ca href=\"https://docs.rs/unicode-case-mapping\"\u003e\n    \u003cimg src=\"https://docs.rs/unicode-case-mapping/badge.svg\" alt=\"Documentation\"\u003e\n  \u003c/a\u003e\n  \u003ca href=\"https://crates.io/crates/unicode-case-mapping\"\u003e\n    \u003cimg src=\"https://img.shields.io/crates/v/unicode-case-mapping.svg\" alt=\"Version\"\u003e\n  \u003c/a\u003e\n  \u003cimg src=\"https://img.shields.io/badge/unicode-16.0-informational\" alt=\"Unicode Version\"\u003e\n  \u003ca href=\"https://github.com/yeslogic/unicode-case-mapping/blob/master/LICENSE\"\u003e\n    \u003cimg 
src=\"https://img.shields.io/crates/l/unicode-case-mapping.svg\" alt=\"License\"\u003e\n  \u003c/a\u003e\n\u003c/div\u003e\n\n\u003cbr\u003e\n\nFast mapping of a `char` to lowercase, uppercase, titlecase, or its simple case folding\nin Rust using Unicode 16.0 data.\n\nUsage\n-----\n\n```rust\nuse std::num::NonZeroU32;\n\nfn main() {\n    assert_eq!(unicode_case_mapping::to_lowercase('İ'), ['i' as u32, 0x0307]);\n    assert_eq!(unicode_case_mapping::to_lowercase('ß'), ['ß' as u32, 0]);\n    assert_eq!(unicode_case_mapping::to_uppercase('ß'), ['S' as u32, 'S' as u32, 0]);\n    assert_eq!(unicode_case_mapping::to_titlecase('ß'), ['S' as u32, 's' as u32, 0]);\n    assert_eq!(unicode_case_mapping::to_titlecase('-'), [0; 3]);\n    assert_eq!(unicode_case_mapping::case_folded('I'), NonZeroU32::new('i' as u32));\n    assert_eq!(unicode_case_mapping::case_folded('ß'), None);\n    assert_eq!(unicode_case_mapping::case_folded('ẞ'), NonZeroU32::new('ß' as u32));\n}\n```\n\nMotivation / When to Use\n------------------------\n\nThe Rust standard library supplies [to_uppercase] and [to_lowercase] methods on\n`char`, so you might be wondering why this crate was created or when to use it.\nYou should almost certainly use the standard library, unless:\n\n* You need support for titlecase conversion or case folding according to the\n  Unicode character database (UCD).\n* You need lower-level access to the mapping table data, compared to the iterator\n  interface supplied by the standard library.\n* You _need_ faster performance than the standard library.\n\nAn additional motivation for creating this crate was to be able to version the\nUCD data independently of the Rust version. This allows us to ensure that all\nour Unicode-related crates use the same UCD version.\n\nPerformance \u0026 Implementation Notes\n----------------------------------\n\n[ucd-generate] is used to generate `tables.rs`. A build script (`build.rs`)\ncompiles this into a three-level lookup table. 
The lookup time is constant, as\nit is just indexing into the arrays.\n\nThe multi-level approach maps a code point to a block, then to a position\nwithin a block, which is then the index of a record describing how to map that\ncode point to lower, upper, and title case. This allows the data to be\ndeduplicated, saving space, whilst also providing fast lookup. The code is\nparameterised over the block size, which must be a power of 2. The value in the\nbuild script is optimal for the data set.\n\nThis approach trades off some space for faster lookups. The tables take up\nabout 101 KiB. Benchmarks (run with `cargo bench`) show this approach to be\n~5–10× faster than the binary search approach used in the Rust standard\nlibrary.\n\nIt's possible there are further optimisations that could be made to eliminate\nsome runs of repeated values in the first level array.\n\nRegenerating `tables.rs`\n------------------------\n\n1. Regenerate with `yeslogic-ucd-generate` (run `make`).\n2. Add/restore `#[allow(dead_code)]` to each table to prevent warnings.\n\n[ucd-generate]: https://github.com/yeslogic/ucd-generate\n[to_uppercase]: https://doc.rust-lang.org/std/primitive.char.html#method.to_uppercase\n[to_lowercase]: https://doc.rust-lang.org/std/primitive.char.html#method.to_lowercase\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fyeslogic%2Funicode-case-mapping","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fyeslogic%2Funicode-case-mapping","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fyeslogic%2Funicode-case-mapping/lists"}