https://github.com/emnudge/unicode-lookup
The web's best unicode lookup tool!
codepoint lookup unicode utf
- Host: GitHub
- URL: https://github.com/emnudge/unicode-lookup
- Owner: EmNudge
- License: mit
- Created: 2020-06-15T06:48:05.000Z (over 5 years ago)
- Default Branch: main
- Last Pushed: 2025-04-04T14:22:35.000Z (6 months ago)
- Last Synced: 2025-04-07T12:51:17.336Z (6 months ago)
- Topics: codepoint, lookup, unicode, utf
- Language: Svelte
- Homepage: https://unicode.emnudge.dev
- Size: 1.36 MB
- Stars: 62
- Watchers: 4
- Forks: 4
- Open Issues: 3
Metadata Files:
- Readme: README.md
- License: LICENSE.md
README
# Unicode Lookup
This is a web app to quickly look up unicode names from hex codes.
I've recently been using unicode lookup tools a lot during research and I found some to be lacking, buggy, or slow.
It turns out that it's super easy to use web workers and a massive text file to make the entire application static and super quick.
That's what this is. A quick, easy, static unicode lookup web app.
Try it at http://unicode.emnudge.dev/
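The web-worker-plus-text-file idea described above could look roughly like the sketch below. This is not the app's actual code: it assumes the text file uses the semicolon-separated layout of the Unicode Character Database's `UnicodeData.txt` (hex codepoint first, name second), and the names `parseUnicodeData`, `lookupName`, and the message shape are all hypothetical.

```javascript
// Parse lines such as "0041;LATIN CAPITAL LETTER A;Lu" into a Map.
// Assumption: UnicodeData.txt-style layout; the app's real file may differ.
function parseUnicodeData(text) {
  const names = new Map();
  for (const line of text.split("\n")) {
    if (line.trim() === "") continue; // skip blank lines
    const [hex, name] = line.split(";");
    names.set(parseInt(hex, 16), name);
  }
  return names;
}

// Return the name for a codepoint, or null if it is not in the table.
function lookupName(names, codepoint) {
  return names.get(codepoint) ?? null;
}

// Hypothetical worker wiring; this branch only runs inside a classic web worker.
if (typeof importScripts === "function") {
  let names = new Map();
  self.onmessage = (event) => {
    const { type, payload } = event.data;
    if (type === "load") names = parseUnicodeData(payload);
    if (type === "lookup") self.postMessage(lookupName(names, payload));
  };
}
```

Keeping the parsing and lookup inside the worker means the page's main thread never has to walk the large table itself.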
# How To Use
Right click the info icon to copy it to your clipboard. Left click a row to bring up the info view.
Currently there is "Simple Search" and "Advanced Search".
*Simple Search* accepts:
- character to find (simply type in a single character)
- text to match against description
- regex to match against *character* (in form `/regex/flags`)
- glob pattern match against *names* (in form `latin * a`)
- codepoint of character
  - decimal form
  - hexadecimal form (in form `0x[number]` or `U+[number]` e.g. `0xfab23` or `U+fab23`)
  - binary form (in form `0b[number]` e.g. `0b1101110`)
- codepoint range (in form `num1-num2`, where `num1` and `num2` are in dec, hex, or bin form)

*Advanced Search* is a bit more... advanced.
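As an aside on the numeric forms that *Simple Search* accepts, here is a minimal sketch of how such input might be parsed. The function names `parseCodepoint` and `parseRange` are hypothetical, and details such as rejecting values above `0x10ffff` or ranges where `num1` is greater than `num2` are assumptions rather than documented behavior of the app.

```javascript
// Parse one codepoint written in decimal, hex (0x.. or U+..), or binary (0b..).
// Returns null when the text is not a valid codepoint.
function parseCodepoint(text) {
  const t = text.trim();
  let value;
  if (/^(0x|U\+)[0-9a-f]+$/i.test(t)) value = parseInt(t.slice(2), 16);
  else if (/^0b[01]+$/i.test(t)) value = parseInt(t.slice(2), 2);
  else if (/^[0-9]+$/.test(t)) value = parseInt(t, 10);
  else return null;
  // Assumption: anything past the last Unicode codepoint is rejected.
  return value <= 0x10ffff ? value : null;
}

// Parse "num1-num2" into an inclusive [start, end] pair, or null.
// Each side may independently be in dec, hex, or bin form.
function parseRange(text) {
  const parts = text.split("-");
  if (parts.length !== 2) return null;
  const start = parseCodepoint(parts[0]);
  const end = parseCodepoint(parts[1]);
  if (start === null || end === null || start > end) return null;
  return [start, end];
}
```

For instance, `parseCodepoint("U+fab23")` and `parseCodepoint("0xfab23")` yield the same number, and `parseRange("0x41-90")` mixes hex and decimal in one range.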
*Advanced Search* allows us to create a series of filters, called "rules". These rules can be of 2 types:

- **Require**
  - requires codepoint to match any of N filters.
  - can chain multiple with the `OR` button.
- **Exclude**
  - requires codepoint to NOT match a specific filter.

As an example, a search could exclude characters in the BMP and then require the character to have either the unicode property XID_Start or the property White_Space.
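The BMP/XID_Start/White_Space example can be modeled in a few lines of JavaScript. This is only a sketch of the rule semantics as described here, not the app's implementation: the rule object shapes and the `passesRules` helper are hypothetical, and it assumes that a character must pass every rule in the series.

```javascript
// A filter is a predicate over a single character (a one-codepoint string).
const isBmp = (ch) => ch.codePointAt(0) <= 0xffff;
const hasXidStart = (ch) => /^\p{XID_Start}$/u.test(ch);
const isWhiteSpace = (ch) => /^\p{White_Space}$/u.test(ch);

// Hypothetical rule shapes: "require" passes if ANY of its filters matches (OR),
// "exclude" passes only if its single filter does NOT match.
const rules = [
  { type: "exclude", filter: isBmp },
  { type: "require", filters: [hasXidStart, isWhiteSpace] },
];

// Assumption: a character is kept only if it passes every rule.
function passesRules(ch, rules) {
  return rules.every((rule) =>
    rule.type === "exclude"
      ? !rule.filter(ch)
      : rule.filters.some((f) => f(ch))
  );
}
```

With these rules, a Linear B syllable such as U+10000 passes (outside the BMP, has XID_Start), while `"A"` and the space character are rejected because they sit inside the BMP.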

*Simple Search* will execute while we type. *Advanced Search* will only execute once we press the `Search` button.
*Simple Search* will also execute if we press enter while the box is focused.

### Known Bugs/Limitations
- Safari does not support any regex unicode property where `Script` equals one of `"Chorasmian"`, `"Dives_Akuru"`, `"Elymaic"`, `"Khitan_Small_Script"`, `"Nandinagari"`, `"Nyiakeng_Puachue_Hmong"`, `"Wancho"`, or `"Yezidi"`.
  - e.g. `/\p{Script=Wancho}/u` will throw an error on Safari and is therefore not supported on that browser.
- Anything typed into simple search before all unicode data has loaded will not be searched, even after the unicode data has loaded.
  - Pressing enter while in the text box will perform a search after loading is complete.
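
For the Safari limitation above, one way to find out whether the current browser accepts a given `Script` value is to try constructing the regex and catch the error. The helper name `supportsScript` is hypothetical, and this is a general feature-detection sketch rather than something the app is known to do.

```javascript
// Returns true if this engine accepts \p{Script=<name>} in a unicode regex.
// Engines that do not know the script name throw a SyntaxError on construction.
function supportsScript(name) {
  try {
    new RegExp(`\\p{Script=${name}}`, "u");
    return true;
  } catch (err) {
    return false;
  }
}
```

A call such as `supportsScript("Wancho")` could then be used to hide or disable the affected scripts instead of letting the search throw.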