https://github.com/stdbug/unicpp
Just another C++ Unicode library
- Host: GitHub
- URL: https://github.com/stdbug/unicpp
- Owner: stdbug
- License: 0bsd
- Created: 2021-02-21T15:42:41.000Z (over 5 years ago)
- Default Branch: master
- Last Pushed: 2022-04-04T15:07:40.000Z (about 4 years ago)
- Last Synced: 2025-07-20T00:36:05.203Z (11 months ago)
- Topics: c-plus-plus, cpp, cpp17, decoding, encoding, unicode, utf-16, utf-8
- Language: C++
- Homepage:
- Size: 236 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# unicpp: Just another C++ Unicode library
## Character manipulation and category detection functions (`unicpp/char_type.h`)
```cpp
bool isalpha(char32_t);
bool isdigit(char32_t);
bool isspace(char32_t);
// one-to-one case mappings
char32_t toupper(char32_t);
char32_t tolower(char32_t);
```
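"One-to-one" here means each input code point maps to exactly one output code point. The standalone sketch below illustrates that idea for a tiny range only (ASCII and Latin-1 lowercase letters). `SimpleToUpperSketch` is a hypothetical name and is not part of unicpp; how the library itself treats characters such as U+00DF (ß), whose full uppercase form is the two-character "SS", is not stated in this README.

```cpp
// Hypothetical illustration of a one-to-one uppercase mapping.
// Covers only ASCII a-z and Latin-1 U+00E0..U+00FE; everything else is
// returned unchanged. Not the unicpp implementation.
char32_t SimpleToUpperSketch(char32_t c) {
  if (c >= U'a' && c <= U'z') {
    return c - 0x20;  // ASCII: 'a'..'z' -> 'A'..'Z'
  }
  // U+00F7 (division sign) sits inside the letter range and has no case.
  if (c >= 0xE0 && c <= 0xFE && c != 0xF7) {
    return c - 0x20;  // U+00E0..U+00FE -> U+00C0..U+00DE
  }
  // A one-to-one mapping cannot expand U+00DF into two characters,
  // so in this sketch it is left as is.
  return c;
}
```

Because the result is always a single `char32_t`, such a mapping can be applied in place to a buffer of code points without changing its length.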
## UTF-8 and UTF-16 encode/decode functions (`unicpp/utf8.h`, `unicpp/utf16.h`)
� (U+FFFD) is used as the replacement character whenever an invalid character or byte sequence is encountered.
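As a rough illustration of what replacement means in practice, here is a minimal standalone UTF-8 decoder that substitutes U+FFFD for malformed input. `DecodeUtf8Lossy` is a hypothetical name, not a unicpp function, and the policy used here (one U+FFFD per byte that fails to start a well-formed sequence) is an assumption; the library may group invalid bytes differently.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>

// Hypothetical helper, not part of unicpp. Decodes UTF-8 into code points.
// When no well-formed sequence starts at the current byte, it emits one
// U+FFFD and skips exactly that byte.
std::u32string DecodeUtf8Lossy(std::string_view in) {
  constexpr char32_t kReplacement = 0xFFFD;
  std::u32string out;
  std::size_t i = 0;
  while (i < in.size()) {
    const auto b0 = static_cast<std::uint8_t>(in[i]);
    std::size_t len = 0;  // stays 0 for bytes that cannot lead a sequence
    char32_t cp = 0;
    char32_t min = 0;     // smallest code point allowed for this length
    if (b0 < 0x80) { len = 1; cp = b0; }
    else if ((b0 & 0xE0) == 0xC0) { len = 2; cp = b0 & 0x1F; min = 0x80; }
    else if ((b0 & 0xF0) == 0xE0) { len = 3; cp = b0 & 0x0F; min = 0x800; }
    else if ((b0 & 0xF8) == 0xF0) { len = 4; cp = b0 & 0x07; min = 0x10000; }

    bool ok = len != 0 && i + len <= in.size();
    for (std::size_t k = 1; ok && k < len; ++k) {
      const auto b = static_cast<std::uint8_t>(in[i + k]);
      ok = (b & 0xC0) == 0x80;  // continuation bytes look like 10xxxxxx
      cp = (cp << 6) | (b & 0x3F);
    }
    // Reject overlong forms, surrogates, and values above U+10FFFF.
    ok = ok && cp >= min && cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF);

    if (ok) {
      out.push_back(cp);
      i += len;
    } else {
      out.push_back(kReplacement);
      i += 1;
    }
  }
  return out;
}
```

With this policy, a stray `0xFF` byte between two ASCII letters decodes to three code points: the first letter, U+FFFD, and the second letter.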
### String validation/stats functions
```cpp
size_t Utf8ValidPrefixLength(std::string_view);
size_t Utf8NumValidChars(std::string_view);
size_t Utf8NumCharsWithReplacement(std::string_view);
```
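The README lists only the signatures. One plausible reading is that the prefix length is the number of leading bytes that form well-formed UTF-8, and the valid-character count is the number of code points in that prefix. The sketch below implements that reading from scratch; `SequenceLengthAt`, `PrefixStats`, and `ValidPrefixStats` are hypothetical names, and the real functions may define their results differently.

```cpp
#include <cstddef>
#include <cstdint>
#include <string_view>

// Hypothetical helper: byte length of the well-formed UTF-8 sequence that
// starts at s[pos], or 0 if none starts there. Requires pos < s.size().
std::size_t SequenceLengthAt(std::string_view s, std::size_t pos) {
  const auto b0 = static_cast<std::uint8_t>(s[pos]);
  if (b0 < 0x80) return 1;
  std::size_t len = 0;
  std::uint32_t cp = 0;
  std::uint32_t min = 0;
  if ((b0 & 0xE0) == 0xC0) { len = 2; cp = b0 & 0x1F; min = 0x80; }
  else if ((b0 & 0xF0) == 0xE0) { len = 3; cp = b0 & 0x0F; min = 0x800; }
  else if ((b0 & 0xF8) == 0xF0) { len = 4; cp = b0 & 0x07; min = 0x10000; }
  else return 0;  // stray continuation byte or invalid lead byte
  if (pos + len > s.size()) return 0;  // truncated sequence
  for (std::size_t k = 1; k < len; ++k) {
    const auto b = static_cast<std::uint8_t>(s[pos + k]);
    if ((b & 0xC0) != 0x80) return 0;
    cp = (cp << 6) | (b & 0x3F);
  }
  const bool surrogate = cp >= 0xD800 && cp <= 0xDFFF;
  return (cp >= min && cp <= 0x10FFFF && !surrogate) ? len : 0;
}

struct PrefixStats {
  std::size_t bytes;  // length of the longest well-formed prefix, in bytes
  std::size_t chars;  // number of code points in that prefix
};

// Walks from the start and stops at the first ill-formed sequence.
PrefixStats ValidPrefixStats(std::string_view s) {
  PrefixStats stats{0, 0};
  while (stats.bytes < s.size()) {
    const std::size_t len = SequenceLengthAt(s, stats.bytes);
    if (len == 0) break;
    stats.bytes += len;
    ++stats.chars;
  }
  return stats;
}
```

A caller could compare `stats.bytes` with the input size to tell whether the whole string is well-formed.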
### Encoding/decoding functions
```cpp
// NOTE: the template arguments were lost when this README was extracted;
// the std::vector<uint8_t> container type below is an assumption.
// UTF-8
const std::wstring wide_string = L"Some string";
std::vector<uint8_t> encoded_utf8 = Utf8Bytes<std::vector<uint8_t>>(wide_string);
std::wstring decoded_utf8 = Utf8Wstring(encoded_utf8);
assert(wide_string == decoded_utf8);
// UTF-16LE
std::vector<uint8_t> encoded_utf16le = Utf16LeBytes<std::vector<uint8_t>>(wide_string);
std::wstring decoded_utf16le = Utf16LeWstring(encoded_utf16le);
assert(wide_string == decoded_utf16le);
// UTF-16BE
std::vector<uint8_t> encoded_utf16be = Utf16BeBytes<std::vector<uint8_t>>(wide_string);
std::wstring decoded_utf16be = Utf16BeWstring(encoded_utf16be);
assert(wide_string == decoded_utf16be);
```
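The LE and BE variants differ only in the byte order of each 16-bit code unit. The following standalone sketch shows that difference, including surrogate pairs for code points above U+FFFF. `ByteOrder`, `AppendUnit`, and `EncodeUtf16Sketch` are hypothetical names and this is not the unicpp implementation; substituting U+FFFD for invalid input code points is an assumption modelled on the replacement rule stated earlier.

```cpp
#include <cstdint>
#include <string_view>
#include <vector>

enum class ByteOrder { kLittle, kBig };

// Appends one 16-bit code unit as two bytes in the requested order.
void AppendUnit(std::vector<std::uint8_t>& out, std::uint16_t unit,
                ByteOrder order) {
  const auto hi = static_cast<std::uint8_t>(unit >> 8);
  const auto lo = static_cast<std::uint8_t>(unit & 0xFF);
  if (order == ByteOrder::kBig) {
    out.push_back(hi);
    out.push_back(lo);
  } else {
    out.push_back(lo);
    out.push_back(hi);
  }
}

// Hypothetical encoder: code points in, UTF-16 bytes out.
std::vector<std::uint8_t> EncodeUtf16Sketch(std::u32string_view text,
                                            ByteOrder order) {
  std::vector<std::uint8_t> out;
  for (char32_t cp : text) {
    const bool surrogate = cp >= 0xD800 && cp <= 0xDFFF;
    if (surrogate || cp > 0x10FFFF) cp = 0xFFFD;  // assumed replacement policy
    if (cp < 0x10000) {
      AppendUnit(out, static_cast<std::uint16_t>(cp), order);
    } else {
      // Split the 20-bit offset into a high and a low surrogate.
      const std::uint32_t v = cp - 0x10000;
      AppendUnit(out, static_cast<std::uint16_t>(0xD800 + (v >> 10)), order);
      AppendUnit(out, static_cast<std::uint16_t>(0xDC00 + (v & 0x3FF)), order);
    }
  }
  return out;
}
```

For example, U+0041 becomes the bytes `41 00` in little-endian order and `00 41` in big-endian order.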