{"id":15634932,"url":"https://github.com/ww898/utf-cpp","last_synced_at":"2025-04-14T03:13:12.716Z","repository":{"id":16613257,"uuid":"80338398","full_name":"ww898/utf-cpp","owner":"ww898","description":"UTF-8/16/32 C++11 header only library for Windows / Linux / macOS","archived":false,"fork":false,"pushed_at":"2024-01-30T12:52:12.000Z","size":92,"stargazers_count":125,"open_issues_count":0,"forks_count":19,"subscribers_count":8,"default_branch":"master","last_synced_at":"2024-01-30T13:54:02.532Z","etag":null,"topics":["arm64","armv7","clang","cpp11","gcc","linux","macos","msvc","unicode","utf","utf-16","utf-32","utf-8","utf16","utf32","utf8","wchar-support","windows","x64","x86"],"latest_commit_sha":null,"homepage":"","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ww898.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2017-01-29T09:39:06.000Z","updated_at":"2024-01-24T07:13:08.000Z","dependencies_parsed_at":"2024-01-27T23:13:06.279Z","dependency_job_id":"87fdf2e6-19b3-414a-8655-d7ddb684bb48","html_url":"https://github.com/ww898/utf-cpp","commit_stats":null,"previous_names":[],"tags_count":10,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ww898%2Futf-cpp","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ww898%2Futf-cpp/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ww898%2Futf-cpp/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ww898%2Futf-cpp/ma
nifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ww898","download_url":"https://codeload.github.com/ww898/utf-cpp/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248813802,"owners_count":21165634,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["arm64","armv7","clang","cpp11","gcc","linux","macos","msvc","unicode","utf","utf-16","utf-32","utf-8","utf16","utf32","utf8","wchar-support","windows","x64","x86"],"created_at":"2024-10-03T10:59:13.679Z","updated_at":"2025-04-14T03:13:12.697Z","avatar_url":"https://github.com/ww898.png","language":"C++","readme":"# UTF-8/16/32 C++ library\r\nThis is a C++11 template-based, header-only library for Windows/Linux/macOS that converts UTF-8/16/32 symbols and strings. The library transparently supports `wchar_t` as UTF-16 on Windows and UTF-32 on Linux and macOS.\r\n\r\nUTF-8 and UTF-32 (UCS-32) both support 31-bit-wide code points `[0‥0x7FFFFFFF]` with no restrictions. UTF-16 supports only Unicode code points `[0‥0x10FFFF]`, where the high `[0xD800‥0xDBFF]` and low `[0xDC00‥0xDFFF]` surrogate regions are prohibited.\r\n\r\nThe maximum UTF-16 symbol size is 2 words (4 bytes, both words should be in the surrogate region). UTF-32 (UCS-32) is always 1 word (4 bytes). 
UTF-8 has the maximum symbol size (see [conversion table](#utf-8-conversion-table) for details):\r\n- 4 bytes for unicode code points\r\n- 6 bytes for 31bit code points\r\n\r\n###### UTF-16 surrogate decoder:\r\n|High\\Low|DC00|DC01|…|DFFF|\r\n|:-:|:-:|:-:|:-:|:-:|\r\n|**D800**|010000|010001|…|0103FF|\r\n|**D801**|010400|010401|…|0107FF|\r\n|**⋮**|⋮|⋮|⋱|⋮|\r\n|**DBFF**|10FC00|10FC01|…|10FFFF|\r\n\r\n![UTF-16 Surrogates](https://upload.wikimedia.org/wikipedia/commons/thumb/b/b8/Utf-16.svg/512px-Utf-16.svg.png)\r\n\r\n## Supported compilers\r\n\r\nTested on following compilers:\r\n- [Visual Studio 2013 v12.0.40629.00 Update 5](perf/vc120_win.md)\r\n- [Visual Studio 2015 v14.0.25431.01 Update 3](perf/vc140_win.md)\r\n- [Visual Studio 2017 v15.6.7](perf/vc141_win.md)\r\n- [Visual Studio 2019 v16.0.3](perf/vc142_win.md)\r\n- [GNU v5.4.0](perf/gnu_linux.md)\r\n- [Clang v6.0.1](perf/clang_linux.md)\r\n- [Apple Clang v10.0.1](perf/clang_mac.md)\r\n\r\n## Usage example\r\n\r\n```cpp\r\n    // यूनिकोड\r\n    static char const u8s[] = \"\\xE0\\xA4\\xAF\\xE0\\xA5\\x82\\xE0\\xA4\\xA8\\xE0\\xA4\\xBF\\xE0\\xA4\\x95\\xE0\\xA5\\x8B\\xE0\\xA4\\xA1\";\r\n    using namespace ww898::utf;\r\n    std::u16string u16;\r\n    convz\u003cutf_selector_t\u003cdecltype(*u8s)\u003e, utf16\u003e(u8s, std::back_inserter(u16));\r\n    std::u32string u32;\r\n    conv\u003cutf16, utf_selector_t\u003cdecltype(u32)::value_type\u003e\u003e(u16.begin(), u16.end(), std::back_inserter(u32));\r\n    std::vector\u003cchar\u003e u8;\r\n    convz\u003cutf32, utf8\u003e(u32.data(), std::back_inserter(u8));\r\n    std::wstring uw;\r\n    conv\u003cutf8, utfw\u003e(u8s, u8s + sizeof(u8s), std::back_inserter(uw));\r\n    auto u8r = conv\u003cchar\u003e(uw);\r\n    auto u16r = conv\u003cchar16_t\u003e(u16);\r\n    auto uwr = convz\u003cwchar_t\u003e(u8s);\r\n\r\n    auto u32r = conv\u003cchar32_t\u003e(std::string_view(u8r.data(), u8r.size())); // C++17 only\r\n\r\n    
static_assert(std::is_same\u003cutf_selector\u003cdecltype(*u8s)\u003e, utf_selector\u003cdecltype(u8)::value_type\u003e\u003e::value, \"Fail\");\r\n    static_assert(\r\n        std::is_same\u003cutf_selector_t\u003cdecltype(u16)::value_type\u003e, utf_selector_t\u003cdecltype(uw)::value_type\u003e\u003e::value !=\r\n        std::is_same\u003cutf_selector_t\u003cdecltype(u32)::value_type\u003e, utf_selector_t\u003cdecltype(uw)::value_type\u003e\u003e::value, \"Fail\");\r\n```\r\n\r\n## UTF-8 Conversion table\r\n![UTF-8/32 table](https://upload.wikimedia.org/wikipedia/commons/3/38/UTF-8_Encoding_Scheme.png)\r\n","funding_links":[],"categories":["String Utilities"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fww898%2Futf-cpp","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fww898%2Futf-cpp","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fww898%2Futf-cpp/lists"}