https://github.com/hexacta/tokenizer
Split text into tokens.
- Host: GitHub
- URL: https://github.com/hexacta/tokenizer
- Owner: hexacta
- License: mit
- Created: 2017-02-27T20:50:30.000Z (about 9 years ago)
- Default Branch: master
- Last Pushed: 2017-02-28T00:14:26.000Z (about 9 years ago)
- Last Synced: 2025-08-09T08:21:19.039Z (9 months ago)
- Topics: browser, hexacta-innovation-labs, join, nlp, node, split, text, tokenize, tokenizer, tokens, words
- Language: JavaScript
- Size: 34.2 KB
- Stars: 2
- Watchers: 4
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: readme.md
- License: license
README
# tokenizer [Build Status](https://travis-ci.org/hexacta/tokenizer) [npm](https://www.npmjs.com/package/hx-tokenizer) [code style: prettier](https://github.com/prettier/prettier)
> Split text into tokens.
## Install
```sh
$ npm install --save hx-tokenizer
```
## Usage
```js
const tokenizer = require("hx-tokenizer");
const tokens = tokenizer.tokenize("Lorem ipsum, something.");
// tokens == ["Lorem", "ipsum", ",", "something", "."]
const text = tokenizer.join(tokens);
// text == "Lorem ipsum, something."
```
## API
### tokenizer.tokenize(text)

Splits `text` into an array of tokens, separating punctuation marks from words (as shown in the usage example above).

### tokenizer.join(tokens)

Joins an array of tokens back into a string, reattaching punctuation so that joining the output of `tokenize` reproduces the original text.
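
The round-trip behavior implied by the usage example can be sketched as follows. This is a hypothetical illustration of what `tokenize` and `join` appear to do, not the hx-tokenizer source:

```javascript
// Hypothetical sketch of the behavior shown in the Usage example —
// not the actual hx-tokenizer implementation.

// Split into runs of word characters and single punctuation marks.
function tokenize(text) {
  return text.match(/\w+|[^\s\w]/g) || [];
}

// Reattach punctuation to the preceding token; space-separate words.
function join(tokens) {
  return tokens.reduce(
    (text, token) =>
      /^[^\s\w]$/.test(token) || text === ""
        ? text + token
        : text + " " + token,
    ""
  );
}

tokenize("Lorem ipsum, something.");
// → ["Lorem", "ipsum", ",", "something", "."]
join(["Lorem", "ipsum", ",", "something", "."]);
// → "Lorem ipsum, something."
```

Note that `join(tokenize(text))` reproduces the original text, matching the example in the Usage section.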
## License
MIT © [Hexacta](http://www.hexacta.com)