https://github.com/elixir-unicode/unicode_transform
Implements the Unicode transformation rules
- Host: GitHub
- URL: https://github.com/elixir-unicode/unicode_transform
- Owner: elixir-unicode
- License: other
- Created: 2019-11-15T00:57:27.000Z (about 6 years ago)
- Default Branch: master
- Last Pushed: 2022-04-17T23:46:37.000Z (almost 4 years ago)
- Last Synced: 2024-12-13T20:58:20.053Z (about 1 year ago)
- Language: Elixir
- Size: 755 KB
- Stars: 9
- Watchers: 4
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE.md
README
# Unicode Transform
Implements the [Unicode transform rules](https://unicode.org/reports/tr35/tr35-general.html#Transforms). This is particularly useful for transliterating text from one script to another.
## Installation
```elixir
def deps do
  [
    {:unicode_transform, "~> 0.1.0"}
  ]
end
```
The docs are found at [https://hexdocs.pm/unicode_transform](https://hexdocs.pm/unicode_transform).
### Usage
[CLDR](https://cldr.unicode.org) defines a [transform specification](https://unicode.org/reports/tr35/tr35-general.html#Transforms) for converting text from one script to another. It also defines a number of transforms that implement the specification, and this library aims to implement those transforms in Elixir.
The strategy is to generate an Elixir module for each of the CLDR transforms. This happens in two parts:
1. The transform defined by CLDR is used to generate an Elixir module that contains macro calls modelled on the transform specification. For example, see the generated [Unicode.Transform.LatinAscii.ex](https://github.com/elixir-unicode/unicode_transform/blob/master/lib/transforms/latin_ascii.ex). Generation is performed with `Unicode.Transform.Generator.generate/2`.
2. At compilation, the macros in the generated module are compiled to Elixir code, resulting in a module with a single public function, `transform/1`.
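The second step can be illustrated with a minimal, self-contained sketch. It does not use this library's macros: `TransformSketch.Rules`, its `replace/2` macro, and the three sample rules are hypothetical names invented for illustration. It only shows the general technique of collecting rule macros at compile time and expanding them into function clauses behind a single `transform/1`.

```elixir
defmodule TransformSketch.Rules do
  # Hypothetical macro DSL for illustration only; these are not the
  # macros used by the generated Unicode.Transform modules.
  defmacro __using__(_opts) do
    quote do
      import TransformSketch.Rules, only: [replace: 2]
      Module.register_attribute(__MODULE__, :rules, accumulate: true)
      @before_compile TransformSketch.Rules
    end
  end

  # Records one "from -> to" rule in a module attribute at compile time.
  defmacro replace(from, to) do
    quote do
      @rules {unquote(from), unquote(to)}
    end
  end

  # Expands the accumulated rules into one function clause per rule.
  defmacro __before_compile__(env) do
    # Accumulated attributes are stored newest first, so restore source order.
    rules = env.module |> Module.get_attribute(:rules) |> Enum.reverse()

    clauses =
      for {from, to} <- rules do
        quote do
          defp do_transform(<<unquote(from), rest::binary>>, acc) do
            do_transform(rest, [acc, unquote(to)])
          end
        end
      end

    quote do
      # The single public function of the compiled module.
      def transform(string) when is_binary(string) do
        string |> do_transform([]) |> IO.iodata_to_binary()
      end

      unquote_splicing(clauses)

      # Characters without a rule are copied through unchanged.
      defp do_transform(<<char::utf8, rest::binary>>, acc) do
        do_transform(rest, [acc, <<char::utf8>>])
      end

      defp do_transform(<<>>, acc), do: acc
    end
  end
end

defmodule TransformSketch.LatinAscii do
  use TransformSketch.Rules

  # Three sample rules; a real transform defines far more.
  replace "é", "e"
  replace "ß", "ss"
  replace "Æ", "AE"
end

IO.puts(TransformSketch.LatinAscii.transform("Æther café straße"))
# prints "AEther cafe strasse"
```

Because the rules become pattern-matching clauses at compile time, no rule table has to be consulted at runtime. The sketch handles only literal replacements; the CLDR transform specification covers considerably more than that.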
### Current state
The library currently supports only one transform, `Unicode.Transform.LatinAscii`, which converts Latin-script text to ASCII. This is commonly referred to as "removing accents", although the scope is much broader. The file [latin_ascii.ex](https://github.com/elixir-unicode/unicode_transform/blob/master/lib/transforms/latin_ascii.ex) is largely self-explanatory.