{"id":22893386,"url":"https://github.com/railgunlabs/unicorn","last_synced_at":"2025-04-15T01:36:54.298Z","repository":{"id":267014452,"uuid":"897660689","full_name":"railgunlabs/unicorn","owner":"railgunlabs","description":"Unicode® algorithms on a chip. Compliant with MISRA C:2012.","archived":false,"fork":false,"pushed_at":"2025-02-19T16:05:23.000Z","size":1489,"stargazers_count":60,"open_issues_count":0,"forks_count":3,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-03-28T13:37:29.528Z","etag":null,"topics":["c99","grapheme-clusters","misra-c","unicode","unicode-collation","unicode-normalization"],"latest_commit_sha":null,"homepage":"https://RailgunLabs.com/unicorn","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/railgunlabs.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-12-03T02:34:55.000Z","updated_at":"2025-03-09T03:57:31.000Z","dependencies_parsed_at":"2024-12-07T18:19:16.875Z","dependency_job_id":"9488f76f-5f9b-4900-a862-052a54488019","html_url":"https://github.com/railgunlabs/unicorn","commit_stats":null,"previous_names":["railgunlabs/unicorn"],"tags_count":5,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/railgunlabs%2Funicorn","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/railgunlabs%2Funicorn/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/railgunlabs%2Funicorn/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/reposito
ries/railgunlabs%2Funicorn/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/railgunlabs","download_url":"https://codeload.github.com/railgunlabs/unicorn/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248990326,"owners_count":21194738,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["c99","grapheme-clusters","misra-c","unicode","unicode-collation","unicode-normalization"],"created_at":"2024-12-13T23:14:16.037Z","updated_at":"2025-04-15T01:36:54.285Z","avatar_url":"https://github.com/railgunlabs.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cpicture\u003e\n  \u003csource media=\"(prefers-color-scheme: dark)\" srcset=\".github/unicorn-dark.svg\"\u003e\n  \u003csource media=\"(prefers-color-scheme: light)\" srcset=\".github/unicorn.svg\"\u003e\n  \u003cimg alt=\"Unicorn\" src=\".github/unicorn.svg\" width=\"408px\"\u003e\n\u003c/picture\u003e\n\nUnicorn is a lightweight, embeddable implementation of essential Unicode® algorithms written in C99.\n\nUnicorn is compliant with the **MISRA C:2012** coding standard.\nIt's perfect for resource-constrained devices like microcontrollers and IoT devices.\n\nThis repository contains the scripts to generate the Unicorn header/source amalgamation.\nThe unamalgamated source code is available exclusively to commercial licensees.\n\n[![Build 
Status](https://github.com/RailgunLabs/unicorn/actions/workflows/build.yml/badge.svg)](https://github.com/RailgunLabs/unicorn/actions/workflows/build.yml)\n![Unicode Version](https://img.shields.io/badge/Unicode-v16.0.0-blue)\n\n## Features\n\n* Normalization ([docs](https://RailgunLabs.com/unicorn/manual/api/normalization/))\n* Case mapping ([docs](https://RailgunLabs.com/unicorn/manual/api/case-mapping/))\n* Collation ([docs](https://RailgunLabs.com/unicorn/manual/api/collation/))\n* Segmentation ([docs](https://RailgunLabs.com/unicorn/manual/api/segmentation/))\n* Short string compression ([docs](https://RailgunLabs.com/unicorn/manual/api/compression/))\n* UTF-8, 16, and 32 iterators and convertors ([docs](https://RailgunLabs.com/unicorn/manual/api/text-encodings/))\n* Various character properties ([docs](https://RailgunLabs.com/unicorn/manual/api/character-properties/))\n* MISRA C:2012 compliance ([learn more](https://RailgunLabs.com/unicorn/manual/misra-compliance/))\n* Distributed as a single header/source amalgamation\n* Written in C99 with no external dependencies\n\n## Fully Customizable\n\nUnicorn is fully customizable.\nYou can choose which Unicode algorithms and character properties to include.\nYou can even exclude character blocks for scripts your application does not support.\n\nTo customize Unicorn, modify `features.json` and run the `generate.pyz` script.\nThis script will generate the `unicorn.c` and `unicorn.h` source files which you can compile with your C project.\nWhen Unicorn is built with a provided build system (e.g. 
CMake), the script is executed automatically as part of the build process.\n\n\u003cp align=\"center\"\u003e\n  \u003cpicture\u003e\n    \u003csource media=\"(prefers-color-scheme: dark)\" srcset=\".github/customization-dark.svg\"\u003e\n    \u003csource media=\"(prefers-color-scheme: light)\" srcset=\".github/customization.svg\"\u003e\n    \u003cimg alt=\"Customization\" src=\".github/customization.svg\" width=\"500px\"\u003e\n  \u003c/picture\u003e\n\u003c/p\u003e\n\nThe schema for `features.json` is [documented here](https://RailgunLabs.com/unicorn/manual/feature-customization/).\n\n## Ultra Portable\n\nUnicorn is _ultra portable_.\nIt does **not** require an FPU or 64-bit integers.\nIt's written in C99 and only requires a few features from libc, which are listed in the following table.\n\n| Header | Types | Macros | Functions |\n| --- | --- | --- | --- |\n| **stdint.h** |  `int8_t`, `int16_t`, `int32_t` \u003cbr/\u003e `uint8_t`, `uint16_t`, `uint32_t` | | |\n| **string.h** | | | `memcpy`, `memset`, `memcmp` |\n| **stddef.h** | `size_t` | `NULL` | |\n| **stdbool.h** | |  `bool`, `true`, `false` | |\n| **assert.h** | |  `assert` | |\n\n## MISRA C:2012 Compliance\n\nUnicorn honors all Mandatory, most Required, and most Advisory rules defined by MISRA C:2012 and its amendments.\nDeviations are [documented here](https://RailgunLabs.com/unicorn/manual/misra-compliance/).\nYou are encouraged to audit Unicorn and verify its level of conformance is acceptable.\n\n## Supported Unicode Encodings\n\nAll functions that operate on text can accept UTF-8, UTF-16, UTF-32, or Unicode scalar values.\nUTF-16 and UTF-32 are accepted as big endian, little endian, and native byte order.\n\nBy default, the implementation performs runtime safety checks to guard against malformed or maliciously encoded text.\nIf you _know_ your text isn't malformed, you can opt in to skip these checks to improve processing performance.\n\n## Thread Safety\n\nUnicorn **is** thread-safe except for the 
following caveats:\n\n* Functions that allocate memory are only as thread-safe as the allocator itself.\n* The [configuration API](https://RailgunLabs.com/unicorn/manual/api/library-configuration/) is **not** thread-safe; however, in typical usage it's only invoked at application startup and only if the default configuration is unsatisfactory.\n\n## Atomic Operations\n\nAll operations in Unicorn are _atomic_.\nThat means either an operation occurs or nothing occurs at all.\nThis guarantees errors, such as out-of-memory errors, never corrupt internal state.\nThis also means if an error occurs, like an out-of-memory error, then you can recover (free up memory) and try the same operation again.\n\n## Extensively Tested\n\n* 100% branch test coverage\n* Official Unicode conformance tests\n* Manually written tests\n* Out-of-memory tests\n* Fuzz tests\n* Static analysis\n* Valgrind analysis\n* Code sanitizers (UBSAN, ASAN, and MSAN)\n* Extensive use of assert() and run-time checks\n\n## Installation\n\nDownload a prebuilt header/source amalgamation from the [releases page](https://github.com/RailgunLabs/unicorn/releases) or generate one yourself by running `./generate.pyz` (requires Python 3.10 or newer).\nThe prebuilt amalgamation includes _all_ features whereas the one you generate yourself only includes the features you specify in [features.json](features.json).\n\nAlternatively, build a linkable library with\n\n```\n$ ./configure\n$ make\n$ make install\n```\n\nor [CMake](https://cmake.org/).\n\n## Support\n\n* [Documentation](https://RailgunLabs.com/unicorn/manual/)\n* [Premium Support](https://RailgunLabs.com/services)\n\nSubmit patches and bug reports to [RailgunLabs.com/contribute](https://RailgunLabs.com/contribute).\nDo **not** open a pull request.\nThe pull request tab is enabled because GitHub does not provide a mechanism to disable it.\n\n## License\n\nUnicorn is free of charge for non-commercial use.\nYou can purchase a commercial license from [Railgun 
Labs](https://RailgunLabs.com/unicorn/license/).\n\nThe unamalgamated C source code, the programs for generating the Unicode data, and the unit tests are **not** open source.\nAccess to them is granted exclusively to commercial licensees.\n\n_Unicode® is a registered trademark of Unicode, Inc. in the United States and other countries.\nThis project is not in any way associated with or endorsed or sponsored by Unicode, Inc. (aka The Unicode Consortium)._\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frailgunlabs%2Funicorn","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frailgunlabs%2Funicorn","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frailgunlabs%2Funicorn/lists"}