Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/maxdz-gmbh/mdz_unicode

Very lightweight, versatile and portable C library for handling Unicode strings. Source code of library conforms to ANSI C 89/90 Standard.
https://github.com/maxdz-gmbh/mdz_unicode

android c freebsd library linux macos portable strings unicode-characters unicode-support unicode-symbols utf-16 utf-32 utf-8 utf16 utf32 utf8 wchar-support windows

Last synced: about 1 month ago
JSON representation

Very lightweight, versatile and portable C library for handling Unicode strings. Source code of library conforms to ANSI C 89/90 Standard.

Awesome Lists containing this project

README

        

**February 2024 NOTE:** This repo is obsolete. Please use [mdz_string] project/repo instead.

**NOTE:** All 0.x releases are kind of "alpha-versions" without expectations of API backward-compatibility.

## Table of Contents
[mdz_unicode Overview](#mdz_unicode-Overview)

[mdz_unicode Advantages](#mdz_unicode-Advantages)

[mdz_unicode Usage](#mdz_unicode-Usage)

[mdz_wchar Overview](#mdz_wchar-Overview)

[mdz_utf8 Overview](#mdz_utf8-Overview)

[mdz_utf16 Overview](#mdz_utf16-Overview)

[mdz_utf32 Overview](#mdz_utf32-Overview)

[Licensing info](#Licensing-info)

[Credits](#Credits)

## mdz_unicode Overview

Wiki: [mdz_unicode Wiki]

file: *"mdz_unicode.h"*

Please take a look at *"mdz_unicode.h"* file or [mdz_unicode Wiki] site for detailed functions descriptions.

[mdz_unicode Wiki]: https://github.com/maxdz-gmbh/mdz_unicode/wiki/mdz_unicode-overview

[mdz_unicode] - is a very lightweight, versatile and speedy C library for handling Unicode strings, developed by [maxdz Software GmbH].The library supports UTF8, UTF16, UTF32, wchar strings. Source code of library is highly-portable, conforms to ANSI C 89/90 Standard. Builds for Win32/Win64, Linux, FreeBSD, Android, macOS are available.

Only shared/dynamically loaded libraries (*.so* and *.dll* files with import libraries) are available for evaluation testing purposes. Static libraries are covered by our commercial licenses.

**Linux** binaries are built against Linux Kernel 2.6.18 - and thus should be compatible with Debian (from ver. 4.0), Ubuntu (from ver. 8.04), Fedora (from ver. 9)

**FreeBSD** binaries - may be used from FreeBSD ver. 7.0

**Win32** binaries are built using Visual Studio Platform Toolset "v90", thus are compatible with Windows versions from Windows 2000.

**Win64** binaries are built using Visual Studio Platform Toolset "v100", thus are compatible with Windows versions from Windows XP.

**Android** x86/armeabi-v7a binaries - may be used from Android API level 16 ("Jelly Bean" ver. 4.1.x)

**Android** x86_64/arm64-v8a binaries - may be used from Android API level 21 ("Lollipop" ver. 5.0)

**macOS** binaries - x86_64, from *MacOS X v10.6.0*

[mdz_unicode]: https://github.com/maxdz-gmbh/mdz_unicode
[maxdz Software GmbH]: https://maxdz.com/
[mdz_string]: https://github.com/maxdz-gmbh/mdz_string

## mdz_unicode Advantages

**1. High portability:** the whole code conforms to ANSI C 89/90 Standard. Multithreading/asynchronous part is POSIX compatible (under UNIX/Linux).

**2. Little dependencies:** basically *mdz_unicode* functions are only dependend on standard C-library memory-management/access functions. Multithreading part is dependend on POSIX *pthreads* API (under UNIX/Linux) and old process control/synchronization API (from Windows 2000). It means you can use library in your code withouth any further dependencies except standard platform libraries/APIs.

**3. Extended error-checking:** all functions preserve internal error-code pointing the problem. It is possible to use strict error-checking (when all preserved error-codes should be *MDZ_ERROR_NONE*) or "relaxed"-checking - when only returned *mdz_false* will indicate error.

**4. Extended control:** containers do only explicit operations. It means for example, when "insert" function is called with auto-reservation flag set in *mdz_false* - it will return error if there is not enough capacity in container. No implicit reservations will be made.

**5. Attached usage:** strings should not necessarily use dynamically-allocated memory - which may be not available on your embedded system (or if malloc()/free() are forbidden to use in your safety-critical software). Just attach container/data to your statically-allocated memory and use all strings functionality.

**6. Cache-friendly:** it is possible to keep controlling and data parts together in memory using "embedded part".

**7. Unicode support:** UTF-8, UTF-16, UTF-32 are supported.

**8. wchar_t support:** also wchar_t strings are supported, with 2 and 4 bytes-large *wchar_t* characters.

**9. Endianness-aware containers:** utf16 and utf32 containers are endiannes-aware thus may be used to produce and manipulate strings with pre-defined endianness even if endianness of host differs.

**10. Unicode "surrogate-pairs" awareness:** 2-byte Unicode strings correctly process/distinguish "surrogate-pairs" as 1 Unicode symbol.

**11. Asynchronous execution:** *insert* functions can be executed asynchronously

## mdz_unicode Usage

**unicode** is implemented with strict input parameters checking. It means *mdz_false* or some other error indication will be returned if one or several input parameters are invalid - even if such an invalidity doesn't lead to inconsistence (for example adding or removing 0 items).

**Test license generation:** - in order to get free test-license, please proceed to our Shop page [maxdz Shop] and register an account. After registration you will be able to obtain free 14-days test-licenses for our products using "Obtain for free" button.
Test license data should be used in *mdz_unicode_init()* call for library initialization.

**NOTE:** All 0.x releases are kind of "beta-versions" and can be used 1) only with test-license (during test period of 14 days, with necessity to re-generate license for the next 14 days test period) and 2) without expectations of interface backward-compatibility.

Several usage-scenarios are possible:
- low-level - raw C interface, using *mdz_unicode.h*, *mdz_utf8.h*, *mdz_utf16.h*, etc C-header files
- higher-level - using *MdzUnicode*, *MdzUtf8*, *MdzUtf16*, etc C++ "wrappers" around C-header files functions

[mdz_unicode Wiki]: https://github.com/maxdz-gmbh/mdz_unicode/wiki/mdz_unicode-overview
[maxdz Shop]: https://maxdz.com/shop.php

#### Code Example (low-level use)

*mdz_unicode_init()* with license information should be called for library initialization before any subsequent calls:

```
#include

int main(int argc, char* argv[])
{
/* mdz_unicode library initialization using test info retrieved after license generation (see "Test license generation" above) */

mdz_bool bRet = mdz_unicode_init("", "", "", "");
...

mdz_ansi_uninit(); /* call for un-initialization of library */

return 0;

}
```

[mdz_utf8_create]: https://github.com/maxdz-gmbh/mdz_unicode/wiki/mdz_utf8_create
[mdz_utf8_destroy]: https://github.com/maxdz-gmbh/mdz_unicode/wiki/mdz_utf8_destroy

After library initialization call *[mdz_utf8_create]*() for **utf8** string creation. There should be also symmetric *[mdz_utf8_destroy]*() call for every create, otherwise allocated for string memory remains occupied:

```
#include
#include

int main(int argc, char* argv[])
{
mdz_bool bRet = mdz_unicode_init("", "", "", "");

// initialize pAnsi

mdz_Utf8* pUtf8 = mdz_utf8_create(0); // create utf8-string
...
...
// use pUtf8
...
...
// destroy pUtf8

mdz_utf8_destroy(&pUtf8); // after this pUtf8 should be NULL

mdz_ansi_uninit();
...
}
```

Use *mdz_Utf8** pointer for subsequent library calls:

```
#include
#include

int main(int argc, char* argv[])
{
mdz_bool bRet = mdz_unicode_init("", "", "", "");

mdz_Utf8* pUtf8 = mdz_utf8_create(0); // create utf8-string

// reserve memory for 5 elements

bRet = mdz_utf8_reserve(pUtf8, 5);

// insert 'b' in front position with auto-reservation if necessary

bRet = mdz_utf8_insertAnsi(pUtf8, 0, "b", 1, mdz_true); // "b" after this call

// append string with "cde" with auto-reservation if necessary

bRet = mdz_utf8_insert(pUtf8, (size_t) -1, "cde", 3, mdz_true); // "bcde" after this call

...

mdz_utf8_destroy(&pUtf8);

mdz_ansi_uninit();
...
}
```

## mdz_wchar Overview
Wiki: [mdz_wchar Wiki]

file: *"mdz_wchar.h"*

Please take a look at *"mdz_wchar.h"* file or [mdz_wchar Wiki] site for detailed functions descriptions.

[mdz_wchar Wiki]: https://github.com/maxdz-gmbh/mdz_containers/wiki/mdz_wchar-overview

## mdz_utf8 Overview
Wiki: [mdz_utf8 Wiki]

file: *"mdz_utf8.h"*

Please take a look at *"mdz_utf8.h"* file or [mdz_utf8 Wiki] site for detailed functions descriptions.

[mdz_utf8 Wiki]: https://github.com/maxdz-gmbh/mdz_containers/wiki/mdz_utf8-overview

## mdz_utf16 Overview
Wiki: [mdz_utf16 Wiki]

file: *"mdz_utf16.h"*

Please take a look at *"mdz_utf16.h"* file or [mdz_utf16 Wiki] site for detailed functions descriptions.

[mdz_utf16 Wiki]: https://github.com/maxdz-gmbh/mdz_containers/wiki/mdz_utf16-overview

## mdz_utf32 Overview
Wiki: [mdz_utf32 Wiki]

file: *"mdz_utf32.h"*

Please take a look at *"mdz_utf32.h"* file or [mdz_utf32 Wiki] site for detailed functions descriptions.

[mdz_utf32 Wiki]: https://github.com/maxdz-gmbh/mdz_containers/wiki/mdz_utf32-overview

## Licensing info

Use of **mdz_unicode** library is regulated by license agreement in *LICENSE.txt*

Basically private non-commercial "test" usage is unrestricted. Commercial usage of library (incl. its source code) will be regulated by according license agreement.