https://github.com/begriffs/flexicode
Tools scanning Unicode in Flex
- Host: GitHub
- URL: https://github.com/begriffs/flexicode
- Owner: begriffs
- Created: 2021-01-17T18:44:01.000Z (about 5 years ago)
- Default Branch: master
- Last Pushed: 2021-01-18T18:08:55.000Z (about 5 years ago)
- Last Synced: 2025-06-14T19:38:44.132Z (9 months ago)
- Language: C
- Size: 4.88 KB
- Stars: 1
- Watchers: 2
- Forks: 0
- Open Issues: 3
Metadata Files:
- Readme: README.md
README
## Utilities for Unicode in Flex
### charclass
Outputs a regex that matches the UTF-8 byte sequences of every codepoint
matched by an [ICU Unicode
regex](https://unicode-org.github.io/icu/userguide/strings/regexp.html#regular-expression-metacharacters).
```sh
# all Chinese characters
./charclass '\p{Han}'
# horizontal whitespace
./charclass '\h'
```
The `\p{...}` escape is especially powerful because it matches by [Unicode
property](https://en.wikipedia.org/wiki/Unicode_character_property#General_Category),
such as a general category or a script (as in `\p{Han}` above).
To use the regexes, give them aliases in your Flex file:
```lex
/* from charclass '\h' */
whitespace \x09|\x20|\xc2\xa0|\xe1\x9a\x80|\xe2\x80[\x80-\x8a]|\xe2\x80\xaf|\xe2\x81\x9f
%%
{whitespace} { /* ... */ }
```
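Each alternative in the alias above is the UTF-8 encoding of one matched
codepoint, written as `\xNN` byte escapes (for example, U+00A0 becomes
`\xc2\xa0`). The following is a minimal sketch of that encoding step, not
the code `charclass` itself uses; the function names `utf8_encode` and
`flex_escape` are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Encode one Unicode scalar value as UTF-8.
 * Writes up to 4 bytes into out and returns the byte count,
 * or 0 if cp is a surrogate or lies above U+10FFFF.
 * (Hypothetical helper, not part of flexicode.) */
static size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0; /* surrogates have no UTF-8 encoding */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}

/* Format the UTF-8 bytes of cp as \xNN escapes, NUL-terminated.
 * Returns the number of characters written (excluding the NUL),
 * or 0 if cp is not encodable or buf is too small.
 * (Hypothetical helper, not part of flexicode.) */
static size_t flex_escape(uint32_t cp, char *buf, size_t cap)
{
    unsigned char bytes[4];
    size_t n;

    assert(buf != NULL);
    n = utf8_encode(cp, bytes);
    if (n == 0 || cap < n * 4 + 1)
        return 0;
    for (size_t i = 0; i < n; i++)
        snprintf(buf + i * 4, 5, "\\x%02x", (unsigned)bytes[i]);
    return n * 4;
}
```

Calling `flex_escape(0x1680, buf, sizeof buf)` with a 17-byte buffer fills
`buf` with `\xe1\x9a\x80`, the same escape sequence that appears in the
`whitespace` alias. A run of codepoints that share leading bytes, such as
U+2000 through U+200A, can then be collapsed into a bracket expression on
the final byte, as in `\xe2\x80[\x80-\x8a]`.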
### Installation
Requires a C99 compiler, [ICU](http://site.icu-project.org/download/), and
[pkg-config](https://www.freedesktop.org/wiki/Software/pkg-config/).
```sh
./configure
make
```