https://github.com/janlelis/unicode-category.js
Get the General Category of a Unicode character
cased-characters control-characters general-category javascript letters strings unicode unicode-characters
- Host: GitHub
- URL: https://github.com/janlelis/unicode-category.js
- Owner: janlelis
- License: mit
- Created: 2024-10-16T14:14:40.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-10-16T14:20:36.000Z (over 1 year ago)
- Last Synced: 2025-08-25T23:29:31.002Z (10 months ago)
- Topics: cased-characters, control-characters, general-category, javascript, letters, strings, unicode, unicode-characters
- Language: JavaScript
- Homepage: https://character.construction
- Size: 13.7 KB
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: MIT-LICENSE.txt
- Code of conduct: CODE_OF_CONDUCT.md
# unicode-category.js [![[ci]](https://github.com/janlelis/unicode-category.js/workflows/Test/badge.svg)](https://github.com/janlelis/unicode-category.js/actions?query=workflow%3ATest)
Returns the [Major or General Category](https://en.wikipedia.org/wiki/Unicode_character_property#General_Category) a Unicode character belongs to.
Unicode version: **16.0.0** (September 2024)
## Install
Use npm or your favorite package manager to install this module:
```sh
npm install unicode-category
```
Or use the ESM module directly in the browser.
## Usage - General Category
### `unicodeGeneralCategory(char)` / `unicodeGeneralCategoryLong(char)`
```js
// Get General Category of a single character
import { unicodeGeneralCategory, unicodeGeneralCategoryLong } from "unicode-category";
unicodeGeneralCategory("ᴦ") // 'Ll'
unicodeGeneralCategoryLong("ᴦ") // 'Lowercase_Letter'
unicodeGeneralCategory("☼") // 'So'
unicodeGeneralCategoryLong("☼") // 'Other_Symbol'
unicodeGeneralCategory("") // 'Cn'
unicodeGeneralCategoryLong("") // 'Unassigned'
```
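For intuition about what a General Category lookup involves, the sketch below approximates it with JavaScript's built-in Unicode property escapes (`\p{…}` with the `u` flag) instead of this package. `sketchGeneralCategory` is a hypothetical helper, not part of `unicode-category`, and it says nothing about how the library is implemented. Its results depend on the Unicode version bundled with the JavaScript engine, which may differ from 16.0.0.

```js
// Hypothetical helper for illustration only (not part of unicode-category).
// The 30 General Category codes that can be assigned to a code point
// (LC is a union of Lu, Ll and Lt, so it is left out here).
const GENERAL_CATEGORY_CODES = [
  "Lu", "Ll", "Lt", "Lm", "Lo",
  "Mn", "Mc", "Me",
  "Nd", "Nl", "No",
  "Pc", "Pd", "Ps", "Pe", "Pi", "Pf", "Po",
  "Sm", "Sc", "Sk", "So",
  "Zs", "Zl", "Zp",
  "Cc", "Cf", "Cs", "Co", "Cn",
];

// One anchored regex per category, e.g. /^\p{Lu}$/u
const MATCHERS = GENERAL_CATEGORY_CODES.map((code) => [
  code,
  new RegExp(`^\\p{${code}}$`, "u"),
]);

// Returns the short category code for a string holding exactly one
// code point, or undefined for anything else (empty or longer strings).
function sketchGeneralCategory(char) {
  for (const [code, matcher] of MATCHERS) {
    if (matcher.test(char)) return code;
  }
  return undefined;
}

sketchGeneralCategory("A"); // 'Lu'
sketchGeneralCategory("☼"); // 'So'
```

Built-in property escapes can only test membership, so the sketch has to try each category in turn. A package backed by a precomputed index can answer the same question with a single lookup.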
### `unicodeGeneralCategories(string)` / `unicodeGeneralCategoriesLong(string)`
```js
// Get a Set of all General Categories used in a string
import { unicodeGeneralCategories, unicodeGeneralCategoriesLong } from "unicode-category";
unicodeGeneralCategories("A 2") // Set(3) { 'Lu', 'Nd', 'Zs' }
unicodeGeneralCategoriesLong("A 2") // Set(3) { "Decimal_Number", "Space_Separator", "Uppercase_Letter" }
```
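As a rough illustration of collecting categories across a whole string, the following sketch scans by code point with a single regular expression whose named groups are General Category codes. `sketchGeneralCategories` is a hypothetical name and this is not the package's implementation. Sorting the result is an assumption made to mirror the order shown in the example above, and the categories come from the engine's own Unicode data, which may not be version 16.0.0.

```js
// Hypothetical helper for illustration only (not part of unicode-category).
const CODES = [
  "Lu", "Ll", "Lt", "Lm", "Lo", "Mn", "Mc", "Me", "Nd", "Nl", "No",
  "Pc", "Pd", "Ps", "Pe", "Pi", "Pf", "Po", "Sm", "Sc", "Sk", "So",
  "Zs", "Zl", "Zp", "Cc", "Cf", "Cs", "Co", "Cn",
];

// Builds /(?<Lu>\p{Lu})|(?<Ll>\p{Ll})|.../gu: every code point matches
// exactly one alternative, and the group name tells us which one.
const CATEGORY_PATTERN = new RegExp(
  CODES.map((code) => `(?<${code}>\\p{${code}})`).join("|"),
  "gu"
);

function sketchGeneralCategories(string) {
  const found = new Set();
  // With the "u" flag, matching advances by code point, not by UTF-16 unit
  for (const match of string.matchAll(CATEGORY_PATTERN)) {
    for (const [code, value] of Object.entries(match.groups)) {
      if (value !== undefined) found.add(code);
    }
  }
  // Assumption: return the codes in sorted order
  return new Set([...found].sort());
}

sketchGeneralCategories("A 2"); // Set(3) { 'Lu', 'Nd', 'Zs' }
```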
## Usage - Major Category
### `unicodeMajorCategory(char)` / `unicodeMajorCategoryLong(char)`
```js
// Get Major Category of a single character
import { unicodeMajorCategory, unicodeMajorCategoryLong } from "unicode-category";
unicodeMajorCategory("ᴦ") // 'L'
unicodeMajorCategoryLong("ᴦ") // 'Letter'
unicodeMajorCategory("☼") // 'S'
unicodeMajorCategoryLong("☼") // 'Symbol'
unicodeMajorCategory("") // 'C'
unicodeMajorCategoryLong("") // 'Other'
```
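The Major Category is the first letter of the General Category code. As a minimal sketch that does not use this package, the built-in one-letter property escapes (`\p{L}`, `\p{N}`, and so on) can be tested directly. `sketchMajorCategory` and `sketchMajorCategoryLong` are hypothetical names, and results follow the Unicode version of the JavaScript engine rather than 16.0.0.

```js
// Hypothetical helpers for illustration only (not part of unicode-category).
const MAJOR_CATEGORIES = {
  C: "Other",
  L: "Letter",
  M: "Mark",
  N: "Number",
  P: "Punctuation",
  S: "Symbol",
  Z: "Separator",
};

// Returns the one-letter major category of a single code point,
// or undefined if the input is not exactly one code point.
function sketchMajorCategory(char) {
  return Object.keys(MAJOR_CATEGORIES).find((code) =>
    new RegExp(`^\\p{${code}}$`, "u").test(char)
  );
}

// Maps the one-letter code to its long name
function sketchMajorCategoryLong(char) {
  return MAJOR_CATEGORIES[sketchMajorCategory(char)];
}

sketchMajorCategory("☼"); // 'S'
sketchMajorCategoryLong("☼"); // 'Symbol'
```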
### `unicodeMajorCategories(string)` / `unicodeMajorCategoriesLong(string)`
```js
// Get a Set of all Major Categories used in a string
import { unicodeMajorCategories, unicodeMajorCategoriesLong } from "unicode-category";
unicodeMajorCategories("A 2") // Set(3) { 'L', 'N', 'Z' }
unicodeMajorCategoriesLong("A 2") // Set(3) { "Number", "Separator", "Letter" }
```
### List All Categories
```js
// List all General Categories (short code: long name)
import { listUnicodeGeneralCategories } from "unicode-category";
listUnicodeGeneralCategories()
// {
//   Cc: 'Control',
//   Cf: 'Format',
//   Cn: 'Unassigned',
//   Co: 'Private_Use',
//   Cs: 'Surrogate',
//   LC: 'Cased_Letter',
//   Ll: 'Lowercase_Letter',
//   Lm: 'Modifier_Letter',
//   Lo: 'Other_Letter',
//   Lt: 'Titlecase_Letter',
//   Lu: 'Uppercase_Letter',
//   Mc: 'Spacing_Mark',
//   Me: 'Enclosing_Mark',
//   Mn: 'Nonspacing_Mark',
//   Nd: 'Decimal_Number',
//   Nl: 'Letter_Number',
//   No: 'Other_Number',
//   Pc: 'Connector_Punctuation',
//   Pd: 'Dash_Punctuation',
//   Pe: 'Close_Punctuation',
//   Pf: 'Final_Punctuation',
//   Pi: 'Initial_Punctuation',
//   Po: 'Other_Punctuation',
//   Ps: 'Open_Punctuation',
//   Sc: 'Currency_Symbol',
//   Sk: 'Modifier_Symbol',
//   Sm: 'Math_Symbol',
//   So: 'Other_Symbol',
//   Zl: 'Line_Separator',
//   Zp: 'Paragraph_Separator',
//   Zs: 'Space_Separator'
// }

// List all Major Categories (short code: long name)
import { listUnicodeMajorCategories } from "unicode-category";
listUnicodeMajorCategories()
// {
//   C: 'Other',
//   L: 'Letter',
//   M: 'Mark',
//   N: 'Number',
//   P: 'Punctuation',
//   S: 'Symbol',
//   Z: 'Separator'
// }
```
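The two listings are related by the first letter of each code. The sketch below groups a copy of the General Category table from the listing above by that first letter. `groupByMajorCategory` is a hypothetical helper, not something the package exports, and the table is pasted inline so the example runs without installing anything.

```js
// Copy of the General Category table shown above (short code: long name)
const generalCategories = {
  Cc: "Control", Cf: "Format", Cn: "Unassigned", Co: "Private_Use",
  Cs: "Surrogate", LC: "Cased_Letter", Ll: "Lowercase_Letter",
  Lm: "Modifier_Letter", Lo: "Other_Letter", Lt: "Titlecase_Letter",
  Lu: "Uppercase_Letter", Mc: "Spacing_Mark", Me: "Enclosing_Mark",
  Mn: "Nonspacing_Mark", Nd: "Decimal_Number", Nl: "Letter_Number",
  No: "Other_Number", Pc: "Connector_Punctuation", Pd: "Dash_Punctuation",
  Pe: "Close_Punctuation", Pf: "Final_Punctuation",
  Pi: "Initial_Punctuation", Po: "Other_Punctuation",
  Ps: "Open_Punctuation", Sc: "Currency_Symbol", Sk: "Modifier_Symbol",
  Sm: "Math_Symbol", So: "Other_Symbol", Zl: "Line_Separator",
  Zp: "Paragraph_Separator", Zs: "Space_Separator",
};

// Hypothetical helper: group General Category codes by their first letter,
// which is the Major Category code.
function groupByMajorCategory(categories) {
  const groups = {};
  for (const code of Object.keys(categories)) {
    const major = code[0];
    groups[major] = groups[major] || [];
    groups[major].push(code);
  }
  return groups;
}

const grouped = groupByMajorCategory(generalCategories);
grouped.Z; // [ 'Zl', 'Zp', 'Zs' ]
grouped.N; // [ 'Nd', 'Nl', 'No' ]
```

`LC` lands in the `L` group, but it is a union of `Lu`, `Ll` and `Lt`, not a value assigned to individual characters.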
## Also See
- Get the script of a string: [unicode-script.js](https://github.com/janlelis/unicode-script.js)
- Get the block of a string: [unicode-block.js](https://github.com/janlelis/unicode-block.js)
- Get the name of a character: [unicode-name.js](https://github.com/janlelis/unicode-name.js)
- Index created with: [unicoder](https://github.com/janlelis/unicoder)
## MIT License
- Copyright (C) 2024 Jan Lelis. Released under the MIT license.
- Unicode data: https://www.unicode.org/copyright.html#Exhibit1