https://github.com/71/oktjs
Open Korean Text transpiled to JavaScript.
- Host: GitHub
- URL: https://github.com/71/oktjs
- Owner: 71
- License: apache-2.0
- Created: 2022-07-27T08:17:36.000Z (over 2 years ago)
- Default Branch: master
- Last Pushed: 2023-09-13T14:19:07.000Z (over 1 year ago)
- Last Synced: 2024-08-02T09:30:34.344Z (5 months ago)
- Topics: browser, javascript, korean, korean-nlp, nodejs, typescript
- Language: Scala
- Homepage:
- Size: 35.2 KB
- Stars: 2
- Watchers: 2
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# oktjs
Port of [Open Korean Text](https://github.com/open-korean-text/open-korean-text)
to JavaScript; it has no external dependencies, and runs in the browser.

Note that a modern browser with support for
[ES2018 RegExp Unicode property escapes](https://caniuse.com/mdn-javascript_builtins_regexp_property_escapes)
is [necessary](https://www.scala-js.org/doc/regular-expressions.html).

[Try it online](https://observablehq.com/@71/korean-nlp)!
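
To check the browser requirement above before loading the library, a page can probe for the RegExp feature at runtime. The following is a minimal sketch; the function name `supportsUnicodePropertyEscapes` is hypothetical and not part of oktjs.

```typescript
// Hypothetical helper (not part of oktjs): returns true when the runtime can
// compile a RegExp that uses ES2018 Unicode property escapes.
function supportsUnicodePropertyEscapes(): boolean {
  try {
    // The pattern is built from a string so that an engine without support
    // throws here at runtime instead of rejecting the whole script at parse time.
    return new RegExp("\\p{Script=Hangul}", "u").test("한");
  } catch {
    return false;
  }
}

if (!supportsUnicodePropertyEscapes()) {
  console.warn("This runtime lacks RegExp Unicode property escapes.");
}
```

Building the pattern with the `RegExp` constructor rather than a regex literal keeps the probe itself loadable on older engines.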
## Building
To build Oktjs, the following must be installed:
- A JDK.
- [`sbt`](https://www.scala-sbt.org/) to compile the Scala code.
- [`yarn`](https://yarnpkg.com/) to fetch dependencies and bundle the JavaScript
  code.

Then, `yarn` can be used:
```bash
$ yarn build
```

## Details
Oktjs uses [Scala.js](https://www.scala-js.org/) to compile Open Korean Text to
JavaScript, so Open Korean Text is cloned as a submodule to use its sources. A
few changes are required to make it work with JavaScript:

- [`open-korean-text/src/main/scala/org/openkoreantext/processor/util/KoreanDictionaryProvider.scala`](open-korean-text/src/main/scala/org/openkoreantext/processor/util/KoreanDictionaryProvider.scala)
  is replaced by
  [a shim](src/main/scala/org/openkoreantext/processor/util/KoreanDictionaryProviderShim.scala).
  - The shim uses [`resources.js`](resources.js) instead of embedded resources
    to load dictionaries.
  - `resources.js` embeds `resources.json.gz` using
    [`esbuild`](https://esbuild.github.io/content-types/#binary).
  - `resources.json.gz` is generated by
    [`resources.json.gz.build.js`](resources.json.gz.build.js), which reads
    resources in
    [`open-korean-text/src/main/resources/org/openkoreantext/processor/util`](open-korean-text/src/main/resources/org/openkoreantext/processor/util)
    and writes them to a JSON file, which is then gzipped.
- [A minimal shim](src/main/scala/com/twitter/Regex.scala) of
  [Twitter Text](https://github.com/twitter/twitter-text) is provided.
- [A minimal shim](src/main/scala/org/openkoreantext/processor/util/CharArraySet.scala)
  of `CharArraySet` is provided.
- A Scala.js wrapper for the Open Korean Text API is written in
  [`Okt.scala`](src/main/scala/is/gregoirege/oktjs/Okt.scala) and then
  re-exported with types by [`index.ts`](index.ts).
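
The resource handling described in the list above can be pictured with a small sketch. It is not the code in `resources.json.gz.build.js` or the shim: it only illustrates packing several dictionary files into one JSON document and reading a word list back, and it leaves out the gzip and `esbuild` embedding steps. The names `ResourceBundle`, `packResources`, and `loadWords`, the file path, and the one-word-per-line format are all assumptions made for illustration.

```typescript
// Hypothetical shape: maps a resource path to the text content of that file.
type ResourceBundle = Record<string, string>;

// Build step (sketch): serialize every dictionary file into one JSON string.
function packResources(files: ResourceBundle): string {
  return JSON.stringify(files);
}

// Load step (sketch): parse the bundle and turn one file into a set of words,
// assuming one word per line and ignoring blank lines.
function loadWords(bundleJson: string, path: string): Set<string> {
  const bundle = JSON.parse(bundleJson) as ResourceBundle;
  const text = bundle[path];
  if (text === undefined) {
    throw new Error(`Unknown resource: ${path}`);
  }
  const words = text
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
  return new Set(words);
}

// Example with a made-up path and made-up contents.
const bundleJson = packResources({ "noun/example.txt": "사과\n바나나\n" });
const nouns = loadWords(bundleJson, "noun/example.txt");
console.log(nouns.has("사과"));
```

Keeping all dictionaries in a single bundle means the JavaScript build needs only one embedded asset instead of the many resource files the JVM version reads from its classpath.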