https://github.com/lautis/unicode-substring

# unicode-substring [![Build Status](https://travis-ci.org/lautis/unicode-substring.svg?branch=master)](https://travis-ci.org/lautis/unicode-substring)

Unicode-aware substring for JavaScript. Surrogate pairs are counted as a single character.

## What?

JavaScript strings expose their contents as 16-bit code units, a model inherited from the UCS-2 encoding. This is usually good enough, but since Unicode defines more than 2^16 code points, 16 bits are not enough to represent every character. To overcome this limitation, characters with a scalar value above `0xFFFF` (up to the Unicode maximum of `0x10FFFF`) are encoded as surrogate pairs, which take two code units each. This encoding is known as UTF-16.
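As a small illustration of the difference between code units and code points, the snippet below inspects the emoji used later in this README. It relies only on built-in string methods and does not use this library; the variable names are chosen for the example.

```javascript
// U+1F4A5 (💥) lies above 0xFFFF, so UTF-16 stores it as two code units.
const boom = "💥";

// .length counts 16-bit code units, while string iteration
// (used by Array.from) walks the string one code point at a time.
const codeUnitCount = boom.length;
const codePointCount = Array.from(boom).length;

// The two code units are a high surrogate followed by a low surrogate.
const high = boom.charCodeAt(0).toString(16);
const low = boom.charCodeAt(1).toString(16);

// codePointAt(0) recombines the pair into the original scalar value.
const scalar = boom.codePointAt(0).toString(16);

console.log(codeUnitCount, codePointCount); // 2 1
console.log(high, low, scalar);             // d83d dca5 1f4a5
```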

The purpose of this library is to treat each surrogate pair as one character when extracting substrings from a string. This is useful when indices come from a Unicode-aware environment, for example one that counts code points rather than UTF-16 code units.
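To make the problem concrete, here is a minimal sketch of what goes wrong with the native method and how slicing by code point avoids it. The `codePointSubstring` helper is hypothetical and written only for this illustration; it is not this library's source, and it follows `Array.prototype.slice` semantics rather than those of `substring`.

```javascript
const text = "💥Emoji Rule💥";

// The native method counts code units, so index 1 falls inside the
// surrogate pair and the result is a lone high surrogate.
const native = text.substring(0, 1);
console.log(native === "\uD83D"); // true

// Hypothetical helper (an assumption for this sketch, not the library's code):
// split the string into code points, slice, and join the pieces again.
function codePointSubstring(string, start, end) {
  const points = Array.from(string); // one array entry per code point
  return points.slice(start, end).join("");
}

console.log(codePointSubstring(text, 0, 1)); // "💥"
console.log(codePointSubstring(text, 0, 6)); // "💥Emoji"
```

Building an array of every code point is simple but allocates memory proportional to the string length; an implementation that walks the string and stops at `end` would avoid that cost.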

## Usage

```javascript
var unicodeSubstring = require('unicode-substring')
// unicodeSubstring(string, start, end)
unicodeSubstring("💥Emoji Rule💥", 0, 6)
// => "💥Emoji"
```

The `start` and `end` parameters behave like those of [String.prototype.substring](https://developer.mozilla.org/en/docs/Web/JavaScript/Reference/Global_Objects/String/substring).
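For reference, the snippet below shows the argument handling of the native `String.prototype.substring` on a plain ASCII string. Whether this library mirrors every one of these edge cases is an assumption; only the native behaviour is demonstrated here.

```javascript
const plain = "Emoji Rule";

// Omitting `end` extracts through the end of the string.
const toEnd = plain.substring(6);

// If `start` is greater than `end`, the two arguments are swapped.
const swapped = plain.substring(5, 0);

// Negative values are treated as 0.
const negative = plain.substring(-3, 5);

// Values past the end are clamped to the string length.
const clamped = plain.substring(0, 100);

console.log(toEnd);    // "Rule"
console.log(swapped);  // "Emoji"
console.log(negative); // "Emoji"
console.log(clamped);  // "Emoji Rule"
```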