https://github.com/nitely/nim-normalize
Unicode normalization forms (tr15) in linear time
- Host: GitHub
- URL: https://github.com/nitely/nim-normalize
- Owner: nitely
- License: mit
- Created: 2017-11-27T06:23:41.000Z (about 7 years ago)
- Default Branch: master
- Last Pushed: 2024-09-19T00:58:31.000Z (4 months ago)
- Last Synced: 2024-11-10T06:42:25.633Z (3 months ago)
- Topics: nim, nim-lang, unicode, unicode-normalization
- Language: Nim
- Homepage: https://nitely.github.io/nim-normalize/
- Size: 1.08 MB
- Stars: 20
- Watchers: 5
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
README
# Normalize
[![Build Status](https://img.shields.io/travis/nitely/nim-normalize.svg?style=flat-square)](https://travis-ci.org/nitely/nim-normalize)
[![licence](https://img.shields.io/github/license/nitely/nim-normalize.svg?style=flat-square)](https://raw.githubusercontent.com/nitely/nim-normalize/master/LICENSE)

A library for normalizing unicode text. Implements all the
Unicode Normalization Form algorithms. Normalization is
buffered and takes O(n) time and O(1) space.

> Note: the ``iterator`` version takes O(1)
> space, but the ``proc`` takes O(n) space.

## Install
```
nimble install normalize
```

## Compatibility
Nim 1.0.0 or later
## Usage
```nim
import normalize

# Normalization
assert toNfc("E◌̀") == "È"
assert toNfc("\u0045\u0300") == "\u00C8"
assert toNfd("È") == "E◌̀"
assert toNfd("\u00C8") == "\u0045\u0300"

# toNfkc and toNfkd are also available

# Canonical comparison
assert cmpNfd(
  "Voulez-vous un caf\u00E9?",
  "Voulez-vous un caf\u0065\u0301?")

# Normalization check (not always reliable, see docs)
assert isNfd(toNfd("\u1E0A"))

# isNfc, isNfkc and isNfkd are also available
```

> Note: when printing to a terminal,
> the output may visually trick you.
> Printing the `len` or the runes is more reliable.

[docs](https://nitely.github.io/nim-normalize/)
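
The note about terminal output can be illustrated with the standard library alone. The following is a minimal sketch, not part of `normalize`: `describe` is a hypothetical helper that reports the byte length, rune count and code points of a string using `std/unicode` and `std/strutils`, so that two strings that render identically can be told apart.

```nim
import std/[unicode, strutils]

proc describe(s: string): string =
  ## Hypothetical helper (not part of `normalize`): summarizes a string
  ## by its byte length, rune count and code points.
  var points: seq[string]
  for r in s.runes:
    # Rune is a distinct int32; format its value as U+XXXX
    points.add("U+" & toHex(int32(r), 4))
  "len=" & $s.len & " runes=" & $s.runeLen & " [" & points.join(" ") & "]"

let composed = "\u00C8"          # precomposed "È"
let decomposed = "\u0045\u0300"  # "E" followed by a combining grave accent

# Both usually render the same in a terminal, yet the bytes differ
doAssert composed != decomposed

echo describe(composed)    # len=2 runes=1 [U+00C8]
echo describe(decomposed)  # len=3 runes=2 [U+0045 U+0300]
```

Comparing such summaries, rather than the rendered text, makes it clear whether a value is in composed or decomposed form.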
## Optimizations
The best optimization is to avoid normalizing when the text
is already normalized. The `isNf` family of procs can be
used for this purpose.

```nim
import normalize

template fastNfc(s: var string) =
  # Only pay for normalization when the quick check does not pass
  if not isNfc(s):
    s = toNfc(s)
```

> Beware: `isNf` may return `false` even for text that has just been
> normalized. The internal check has three possible results,
> "Yes", "No" and "Maybe", and for certain texts the result is
> always "Maybe".

## Tests
```
nimble test
```

## LICENSE
MIT