Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/node-unicode/node-unicode-data
JavaScript-compatible Unicode data generator. Arrays of code points, arrays of symbols, and regular expressions for every Unicode version’s categories, scripts, blocks, and properties — neatly packaged into a separate npm package per Unicode version.
- Host: GitHub
- URL: https://github.com/node-unicode/node-unicode-data
- Owner: node-unicode
- License: mit
- Created: 2013-08-25T13:31:40.000Z (over 11 years ago)
- Default Branch: main
- Last Pushed: 2023-08-31T15:38:05.000Z (over 1 year ago)
- Last Synced: 2024-05-02T00:05:48.482Z (7 months ago)
- Language: JavaScript
- Homepage: https://mths.be/node-unicode-data
- Size: 5.04 MB
- Stars: 136
- Watchers: 10
- Forks: 15
- Open Issues: 17
Metadata Files:
- Readme: README.md
- License: LICENSE-MIT.txt
Awesome Lists containing this project
- awesome-typography - node-unicode-data - JavaScript-compatible Unicode data generator. (JavaScript)
README
# node-unicode-data
JavaScript-compatible Unicode data generator. Arrays of code points, arrays of symbols, and regular expressions for every Unicode version’s categories, scripts, script extensions, blocks, bidi data, and other properties — neatly packaged into a separate npm package per Unicode version.
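The three shapes mentioned above (code points, symbols, and regular expressions) are closely related. The following is a minimal sketch of how the first two relate, assuming a code-points export is a plain array of numbers. To keep it runnable without installing anything, it uses a small hand-written stand-in array (a partial subset of `White_Space`, not the real data), and `countMatches` is a hypothetical helper, not part of any generated package.

```js
// Hand-written stand-in for a `code-points` export (assumption: a partial
// subset of White_Space, not the full data set from a real package).
const codePoints = [0x09, 0x0A, 0x0B, 0x0C, 0x0D, 0x20, 0x85, 0xA0, 0x1680, 0x3000];

// The matching "symbols" shape: one string per code point.
const symbols = codePoints.map((codePoint) => String.fromCodePoint(codePoint));

// A Set gives constant-time membership tests when scanning text.
const whiteSpace = new Set(codePoints);

// Hypothetical helper: count the code points in `text` that are in `set`.
// `for…of` iterates by code point rather than by UTF-16 code unit.
function countMatches(text, set) {
  let count = 0;
  for (const symbol of text) {
    if (set.has(symbol.codePointAt(0))) {
      count++;
    }
  }
  return count;
}

console.log(symbols.length); // 10
console.log(countMatches('a b\tc\u3000d', whiteSpace)); // 3
```

With a real package installed, the stand-in array would be replaced by the array that the corresponding `code-points` module exports.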
## Using the data in your scripts
To use the generated data, install one of [the npm packages generated by this script](https://www.npmjs.com/org/unicode). A separate package is available for each Unicode version, so you can write, for example:
```js
// Get an array of all code points with the `White_Space` property:
const codePoints = require('@unicode/unicode-6.3.0/Binary_Property/White_Space/code-points');
// Get an array of strings (containing one symbol each) in the `Lu` category:
const symbols = require('@unicode/unicode-6.3.0/General_Category/Uppercase_Letter/symbols');
// Get a regular expression that matches any symbol in the `Aegean Numbers` block:
const regex = require('@unicode/unicode-6.3.0/Block/Aegean_Numbers/regex');
// Get an array of all code points in the `Egyptian_Hieroglyphs` script:
const hieroglyphs = require('@unicode/unicode-6.3.0/Script/Egyptian_Hieroglyphs/code-points');
// Get the canonical category a given code point belongs to:
// (Note: U+0041 is LATIN CAPITAL LETTER A)
const category = require('@unicode/unicode-6.3.0/General_Category').get(0x41);
// Get an array of all code points with a given bidi class:
const lre = require('@unicode/unicode-6.3.0/Bidi_Class/Left_To_Right_Embedding/code-points');
// Get the directionality of a given code point:
const directionality = require('@unicode/unicode-6.3.0/Bidi_Class').get(0x41);
// What glyph is the mirror image of `«` (U+00AB)?
const mirrored = require('@unicode/unicode-6.3.0/Bidi_Mirroring_Glyph').get(0xAB);
// Get a regular expression that matches all opening brackets:
const openingBrackets = require('@unicode/unicode-6.3.0/Bidi_Paired_Bracket_Type/Open/regex');
// …you get the idea.
```
For more information, see the README for the package you’re interested in. [Here’s the full list of npm packages generated by this script](https://www.npmjs.com/org/unicode):
* [_@unicode/1.1.5_](https://npmjs.org/package/@unicode/unicode-1.1.5#readme) ([repository](https://github.com/node-unicode/unicode-1.1.5#readme))
* [_@unicode/2.0.14_](https://npmjs.org/package/@unicode/unicode-2.0.14#readme) ([repository](https://github.com/node-unicode/unicode-2.0.14#readme))
* [_@unicode/2.1.2_](https://npmjs.org/package/@unicode/unicode-2.1.2#readme) ([repository](https://github.com/node-unicode/unicode-2.1.2#readme))
* [_@unicode/2.1.5_](https://npmjs.org/package/@unicode/unicode-2.1.5#readme) ([repository](https://github.com/node-unicode/unicode-2.1.5#readme))
* [_@unicode/2.1.8_](https://npmjs.org/package/@unicode/unicode-2.1.8#readme) ([repository](https://github.com/node-unicode/unicode-2.1.8#readme))
* [_@unicode/2.1.9_](https://npmjs.org/package/@unicode/unicode-2.1.9#readme) ([repository](https://github.com/node-unicode/unicode-2.1.9#readme))
* [_@unicode/3.0.0_](https://npmjs.org/package/@unicode/unicode-3.0.0#readme) ([repository](https://github.com/node-unicode/unicode-3.0.0#readme))
* [_@unicode/3.0.1_](https://npmjs.org/package/@unicode/unicode-3.0.1#readme) ([repository](https://github.com/node-unicode/unicode-3.0.1#readme))
* [_@unicode/3.1.0_](https://npmjs.org/package/@unicode/unicode-3.1.0#readme) ([repository](https://github.com/node-unicode/unicode-3.1.0#readme))
* [_@unicode/3.1.1_](https://npmjs.org/package/@unicode/unicode-3.1.1#readme) ([repository](https://github.com/node-unicode/unicode-3.1.1#readme))
* [_@unicode/3.2.0_](https://npmjs.org/package/@unicode/unicode-3.2.0#readme) ([repository](https://github.com/node-unicode/unicode-3.2.0#readme))
* [_@unicode/4.0.0_](https://npmjs.org/package/@unicode/unicode-4.0.0#readme) ([repository](https://github.com/node-unicode/unicode-4.0.0#readme))
* [_@unicode/4.0.1_](https://npmjs.org/package/@unicode/unicode-4.0.1#readme) ([repository](https://github.com/node-unicode/unicode-4.0.1#readme))
* [_@unicode/4.1.0_](https://npmjs.org/package/@unicode/unicode-4.1.0#readme) ([repository](https://github.com/node-unicode/unicode-4.1.0#readme))
* [_@unicode/5.0.0_](https://npmjs.org/package/@unicode/unicode-5.0.0#readme) ([repository](https://github.com/node-unicode/unicode-5.0.0#readme))
* [_@unicode/5.1.0_](https://npmjs.org/package/@unicode/unicode-5.1.0#readme) ([repository](https://github.com/node-unicode/unicode-5.1.0#readme))
* [_@unicode/5.2.0_](https://npmjs.org/package/@unicode/unicode-5.2.0#readme) ([repository](https://github.com/node-unicode/unicode-5.2.0#readme))
* [_@unicode/6.0.0_](https://npmjs.org/package/@unicode/unicode-6.0.0#readme) ([repository](https://github.com/node-unicode/unicode-6.0.0#readme))
* [_@unicode/6.1.0_](https://npmjs.org/package/@unicode/unicode-6.1.0#readme) ([repository](https://github.com/node-unicode/unicode-6.1.0#readme))
* [_@unicode/6.2.0_](https://npmjs.org/package/@unicode/unicode-6.2.0#readme) ([repository](https://github.com/node-unicode/unicode-6.2.0#readme))
* [_@unicode/6.3.0_](https://npmjs.org/package/@unicode/unicode-6.3.0#readme) ([repository](https://github.com/node-unicode/unicode-6.3.0#readme))
* [_@unicode/7.0.0_](https://npmjs.org/package/@unicode/unicode-7.0.0#readme) ([repository](https://github.com/node-unicode/unicode-7.0.0#readme))
* [_@unicode/8.0.0_](https://npmjs.org/package/@unicode/unicode-8.0.0#readme) ([repository](https://github.com/node-unicode/unicode-8.0.0#readme))
* [_@unicode/9.0.0_](https://npmjs.org/package/@unicode/unicode-9.0.0#readme) ([repository](https://github.com/node-unicode/unicode-9.0.0#readme))
* [_@unicode/10.0.0_](https://npmjs.org/package/@unicode/unicode-10.0.0#readme) ([repository](https://github.com/node-unicode/unicode-10.0.0#readme))
* [_@unicode/11.0.0_](https://npmjs.org/package/@unicode/unicode-11.0.0#readme) ([repository](https://github.com/node-unicode/unicode-11.0.0#readme))
* [_@unicode/12.0.0_](https://npmjs.org/package/@unicode/unicode-12.0.0#readme) ([repository](https://github.com/node-unicode/unicode-12.0.0#readme))
* [_@unicode/12.1.0_](https://npmjs.org/package/@unicode/unicode-12.1.0#readme) ([repository](https://github.com/node-unicode/unicode-12.1.0#readme))
* [_@unicode/13.0.0_](https://npmjs.org/package/@unicode/unicode-13.0.0#readme) ([repository](https://github.com/node-unicode/unicode-13.0.0#readme))
* [_@unicode/14.0.0_](https://npmjs.org/package/@unicode/unicode-14.0.0#readme) ([repository](https://github.com/node-unicode/unicode-14.0.0#readme))
* [_@unicode/15.0.0_](https://npmjs.org/package/@unicode/unicode-15.0.0#readme) ([repository](https://github.com/node-unicode/unicode-15.0.0#readme))
* [_@unicode/15.1.0_](https://npmjs.org/package/@unicode/unicode-15.1.0#readme) ([repository](https://github.com/node-unicode/unicode-15.1.0#readme))
* [_@unicode/16.0.0_](https://npmjs.org/package/@unicode/unicode-16.0.0#readme) ([repository](https://github.com/node-unicode/unicode-16.0.0#readme))

Note that these READMEs are auto-generated by this script, too – they describe all the data that is available for that particular Unicode version. To programmatically get this list of available categories, scripts, script extensions, blocks, and properties for a given Unicode version, just `require` the main module for that version:
```js
> require('@unicode/unicode-6.3.0');
{
  'Binary_Property': [
    'Alphabetic', 'Any', 'ASCII', 'ASCII_Hex_Digit', 'Assigned', …
  ],
  'General_Category': [
    'Cased_Letter', 'Close_Punctuation', 'Connector_Punctuation', …
  ],
  'Script': [
    'Arabic', 'Armenian', 'Avestan', …
  ],
  'Script_Extensions': [
    'Arabic', 'Armenian', 'Avestan', …
  ],
  'Block': [
    'Aegean Numbers', 'Alchemical Symbols', …
  ],
  'Case_Folding': [
    'C', 'F', 'S', 'T'
  ],
  'Simple_Case_Mapping': [
    'Uppercase', 'Lowercase', 'Titlecase'
  ],
  'Special_Casing': [
    'Uppercase', 'Lowercase', 'Titlecase', …
  ],
  'Bidi_Class': [
    'Arabic_Letter', 'Arabic_Number', 'Boundary_Neutral', …
  ],
  'Bidi_Mirroring_Glyph': [],
  'Bidi_Paired_Bracket_Type': [
    'Close', 'None', 'Open'
  ]
}
```
## For project maintainers
After cloning this repository, before doing anything else, run:
```sh
./clone-repos.sh
```
This clones all the generated repositories to your local `output` folder. You can then make changes to node-unicode-data, and use `./bootstrap.sh` to commit and push changes to each of these repositories.
## Generating the data
`npm run download` (re-)downloads the Unicode source files for all the Unicode versions defined in `data/resources.js`, saving them in the `data` folder.
`npm run build` generates data for all the Unicode versions defined in `data/resources.js`. This may take a few minutes… The regular expressions are generated using [Regenerate](https://mths.be/regenerate).
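To illustrate the general idea of turning a list of code points into a regular expression, here is a minimal, self-contained sketch. It is not how Regenerate works internally, and the patterns Regenerate emits may look different (for example, patterns that do not rely on the `u` flag). The helpers `toRanges`, `escapeCodePoint`, and `toRegex`, as well as the sample data, are hypothetical and exist only for this example.

```js
// Collapse a sorted array of code points into [start, end] ranges.
function toRanges(codePoints) {
  const ranges = [];
  for (const codePoint of codePoints) {
    const last = ranges[ranges.length - 1];
    if (last && codePoint === last[1] + 1) {
      last[1] = codePoint; // extend the current run
    } else {
      ranges.push([codePoint, codePoint]); // start a new run
    }
  }
  return ranges;
}

// Format a code point as a `\u{…}` escape, valid in regexes with the `u` flag.
function escapeCodePoint(codePoint) {
  return `\\u{${codePoint.toString(16).toUpperCase()}}`;
}

// Build a character-class regex that matches any of the given code points.
function toRegex(codePoints) {
  const body = toRanges(codePoints)
    .map(([start, end]) =>
      start === end
        ? escapeCodePoint(start)
        : `${escapeCodePoint(start)}-${escapeCodePoint(end)}`
    )
    .join('');
  return new RegExp(`[${body}]`, 'u');
}

// Stand-in data: three Latin capitals, the first three code points of the
// Aegean Numbers block (U+10100..U+10102), and one emoji.
const sample = [0x41, 0x42, 0x43, 0x10100, 0x10101, 0x10102, 0x1F600];
const regex = toRegex(sample);

console.log(regex.source); // [\u{41}-\u{43}\u{10100}-\u{10102}\u{1F600}]
console.log(regex.test('\u{10101}')); // true
console.log(regex.test('D')); // false
```

Collapsing consecutive code points into ranges keeps the pattern short, which matters for properties that cover tens of thousands of code points.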
## Testing
`npm test` generates the data for the oldest and the latest available Unicode versions. This is a good way to test changes to the generator scripts before running `npm run build`.
`npm run-script cover` generates [the code coverage report](http://rawgithub.com/node-unicode/node-unicode-data/master/coverage/index.html).
## Author
| [![twitter/mathias](https://gravatar.com/avatar/24e08a9ea84deb17ae121074d0f17125?s=70)](https://twitter.com/mathias "Follow @mathias on Twitter") |
|---|
| [Mathias Bynens](https://mathiasbynens.be/) |

## License
This module is available under the [MIT](https://mths.be/mit) license.