https://github.com/cshum/levi-chinese

Chinese text processing plugins for Levi

Host: GitHub
URL: https://github.com/cshum/levi-chinese
Owner: cshum
License: mit
Created: 2015-09-10T03:07:39.000Z (over 9 years ago)
Default Branch: master
Last Pushed: 2015-09-19T02:57:47.000Z (over 9 years ago)
Last Synced: 2025-03-01T17:06:42.710Z (about 2 months ago)
Language: JavaScript
Homepage:
Size: 230 KB
Stars: 1
Watchers: 2
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

# Levi Chinese

Chinese text processing plugins for [Levi](https://github.com/cshum/levi).

[![Build Status](https://travis-ci.org/cshum/levi-chinese.svg?branch=master)](https://travis-ci.org/cshum/levi-chinese)

Levi Chinese aims to facilitate Chinese support in [Levi](https://github.com/cshum/levi) full-text search.

This project is under active development, and I am no expert in Chinese NLP, so any comments or PRs are appreciated.

```
npm install levi-chinese
```

Levi Chinese provides text processing plugins `chinese.converter()` and `chinese.segmenter()`.

Mount them under the default plugins of Levi.

```js
var levi = require('levi')
var chinese = require('levi-chinese')

// Mount the Chinese plugins after Levi's default text processing plugins.
var lv = levi('db')
  .use(levi.tokenizer())
  .use(levi.stemmer())
  .use(levi.stopword())
  .use(chinese.converter()) // chinese plugin
  .use(chinese.segmenter()) // chinese plugin

lv.pipeline('Lorem Ipsum is dummy text我是拖拉機學院手扶拖拉機專業的。', function (err, tokens) {
  // tokens:
  // ['lorem', 'ipsum', 'dummi', 'text',
  //   '手扶拖拉机', '拖拉机', '学院', '专业' ]
})
```

### chinese.converter()

Converts Traditional Chinese text tokens into Simplified Chinese.

Based on the dictionary from [Tongwen](http://tongwen.openfoundry.org/).
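
To illustrate the idea of dictionary-based conversion, here is a minimal sketch. It is not the plugin's actual code: the `TRAD_TO_SIMP` table and the `toSimplified` helper are hypothetical names, and the table holds only a handful of characters, whereas the plugin relies on the much larger Tongwen dictionary.

```js
// Hypothetical, tiny Traditional -> Simplified table for illustration only.
// The real plugin uses the Tongwen dictionary instead.
var TRAD_TO_SIMP = {
  '機': '机',
  '學': '学',
  '專': '专',
  '業': '业'
}

// Map each character through the table; characters without an entry
// (including ones that are identical in both scripts) pass through unchanged.
function toSimplified (text) {
  return Array.from(text).map(function (ch) {
    return Object.prototype.hasOwnProperty.call(TRAD_TO_SIMP, ch)
      ? TRAD_TO_SIMP[ch]
      : ch
  }).join('')
}

console.log(toSimplified('拖拉機學院')) // prints "拖拉机学院"
```

Converting before segmenting means a single Simplified dictionary can serve both scripts, which matches the plugin order used in the example above.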

### chinese.segmenter()

Chinese word segmentation using [nodejieba](https://github.com/yanyiwu/nodejieba).

nodejieba requires native bindings, so this plugin only works on Node.js.
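
For readers new to word segmentation, the sketch below shows the general idea with a simple forward maximum matching strategy. This is a deliberately simplified stand-in and not how nodejieba works internally; the `VOCAB` list, `MAX_LEN`, and the `segment` function are hypothetical and exist only for this illustration.

```js
// Hypothetical toy vocabulary; a real segmenter ships a far larger dictionary.
var VOCAB = ['手扶拖拉机', '拖拉机', '学院', '专业']
var MAX_LEN = 5 // length of the longest word in VOCAB

// Forward maximum matching: at each position, take the longest vocabulary
// word that starts there. Characters that start no known word are dropped.
function segment (text) {
  var chars = Array.from(text)
  var words = []
  var i = 0
  while (i < chars.length) {
    var matched = false
    for (var len = Math.min(MAX_LEN, chars.length - i); len > 1; len--) {
      var candidate = chars.slice(i, i + len).join('')
      if (VOCAB.indexOf(candidate) !== -1) {
        words.push(candidate)
        i += len
        matched = true
        break
      }
    }
    if (!matched) i += 1
  }
  return words
}

console.log(segment('我是拖拉机学院手扶拖拉机专业的'))
```

Unlike this sketch, the plugin's output in the example above also includes `拖拉机` as a sub-word of `手扶拖拉机`, so treat the code here as a conceptual aid only.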

## License

MIT
