https://github.com/simon987/pg_asciifold

asciifold C-Language function based on Lucene's ASCIIFoldingFilter

# PostgreSQL ASCII folding

Reasonably fast ASCII folding functions for PostgreSQL, based on [Lucene's ASCIIFoldingFilter](https://lucene.apache.org/core/4_0_0/analyzers-common/org/apache/lucene/analysis/miscellaneous/ASCIIFoldingFilter.html).
Tested on the MusicBrainz dataset, folding is 40% faster than a simple `UPPER()`.

*Example:*
```
postgres=# SELECT asciifold('Hello, ⒩ᴐⱤú⒴⁈~!');
asciifold
----------------------
Hello, (n)ORu(y)?!~!
(1 row)

postgres=# SELECT asciifold_lower('Hello, ⒩ᴐⱤú⒴⁈~!');
asciifold_lower
----------------------
hello, (n)oru(y)?!~!
(1 row)
```
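The folding shown in the example can be pictured as a lookup from a Unicode code point to an ASCII replacement string, where one input character may expand to several output characters (for instance `⒩` becomes `(n)`). The sketch below is only an illustration of that idea: `fold_entry`, `FOLD_TABLE` and `fold_codepoint` are hypothetical names, the table covers just the characters from the example, and none of it is taken from the extension's source. See `asciifolding.c` for the real implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy mapping that covers only the non-ASCII characters in the example
 * above. The real mapping is far larger. */
struct fold_entry {
    uint32_t    cp;     /* Unicode code point */
    const char *ascii;  /* ASCII replacement, possibly several characters */
};

static const struct fold_entry FOLD_TABLE[] = {
    { 0x00FA, "u"   },  /* ú */
    { 0x1D10, "O"   },  /* ᴐ */
    { 0x2048, "?!"  },  /* ⁈ */
    { 0x24A9, "(n)" },  /* ⒩ */
    { 0x24B4, "(y)" },  /* ⒴ */
    { 0x2C64, "R"   },  /* Ɽ */
};

/* Hypothetical helper: returns the ASCII replacement for `cp`,
 * or NULL when the toy table has no entry for it. */
static const char *fold_codepoint(uint32_t cp)
{
    size_t n = sizeof FOLD_TABLE / sizeof FOLD_TABLE[0];

    for (size_t i = 0; i < n; i++) {
        if (FOLD_TABLE[i].cp == cp)
            return FOLD_TABLE[i].ascii;
    }
    return NULL;
}
```

A caller would copy code points below `0x80` through unchanged and consult the table only for the rest. A linear scan is enough for six entries; a full-size table would more likely use a `switch` or a sorted array.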

The UTF-8 input string is not validated: invalid UTF-8 may lead to undefined behavior.
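If the input can come from an untrusted source, one option is to check that the bytes are well-formed UTF-8 before they reach the folding function. The following is a minimal sketch of such a check. `is_valid_utf8` is a hypothetical helper and not part of this extension; it rejects truncated sequences, stray continuation bytes, overlong encodings, surrogates, and code points above U+10FFFF.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not part of pg_asciifold: returns true when the
 * first `len` bytes of `s` are well-formed UTF-8. */
static bool is_valid_utf8(const unsigned char *s, size_t len)
{
    assert(s != NULL || len == 0);

    size_t i = 0;
    while (i < len) {
        unsigned char c = s[i];
        size_t extra;        /* continuation bytes expected after the lead byte */
        unsigned int cp;     /* code point being decoded */
        unsigned int min;    /* smallest code point allowed for this length */

        if (c < 0x80) {                     /* plain ASCII */
            i++;
            continue;
        } else if ((c & 0xE0) == 0xC0) {    /* 2-byte sequence */
            extra = 1; cp = c & 0x1F; min = 0x80;
        } else if ((c & 0xF0) == 0xE0) {    /* 3-byte sequence */
            extra = 2; cp = c & 0x0F; min = 0x800;
        } else if ((c & 0xF8) == 0xF0) {    /* 4-byte sequence */
            extra = 3; cp = c & 0x07; min = 0x10000;
        } else {
            return false;                   /* stray continuation or invalid lead byte */
        }

        if (len - i - 1 < extra)
            return false;                   /* sequence is cut off by end of input */

        for (size_t k = 1; k <= extra; k++) {
            unsigned char cc = s[i + k];
            if ((cc & 0xC0) != 0x80)
                return false;               /* not a continuation byte */
            cp = (cp << 6) | (cc & 0x3F);
        }

        if (cp < min)
            return false;                   /* overlong encoding */
        if (cp > 0x10FFFF)
            return false;                   /* outside the Unicode range */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return false;                   /* UTF-16 surrogate */

        i += extra + 1;
    }
    return true;
}
```

The check makes one pass over the input and allocates nothing, so it can run just before the folding call.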

### Compiling from source (CMake)

```bash
apt install postgresql-server-dev-11
cmake .
make
```

See [asciifolding.c](asciifolding.c) and [build.sh](build.sh) for more information.