Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/agatan/yoin
A Japanese Morphological Analyzer written in pure Rust
- Host: GitHub
- URL: https://github.com/agatan/yoin
- Owner: agatan
- License: mit
- Created: 2017-01-31T14:02:09.000Z (almost 8 years ago)
- Default Branch: master
- Last Pushed: 2019-10-25T04:41:06.000Z (over 5 years ago)
- Last Synced: 2024-07-18T16:41:32.418Z (7 months ago)
- Topics: japanese, nlp, rust
- Language: Rust
- Size: 31.6 MB
- Stars: 26
- Watchers: 2
- Forks: 2
- Open Issues: 2
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- Awesome-Rust-MachineLearning - agatan/yoin - A Japanese Morphological Analyzer written in pure Rust (Natural Language Processing (preprocessing))
README
## Yoin - A Japanese Morphological Analyzer
[![Build Status](https://travis-ci.org/agatan/yoin.svg?branch=master)](https://travis-ci.org/agatan/yoin)
[![Version info](https://img.shields.io/crates/v/yoin.svg)](https://crates.io/crates/yoin)

`yoin` is a Japanese morphological analysis engine written in pure Rust.
[mecab-ipadic](https://taku910.github.io/mecab/) is embedded in `yoin`.
```sh
:) $ yoin
すもももももももものうち
すもも 名詞,一般,*,*,*,*,すもも,スモモ,スモモ
も 助詞,係助詞,*,*,*,*,も,モ,モ
もも 名詞,一般,*,*,*,*,もも,モモ,モモ
も 助詞,係助詞,*,*,*,*,も,モ,モ
もも 名詞,一般,*,*,*,*,もも,モモ,モモ
の 助詞,連体化,*,*,*,*,の,ノ,ノ
うち 名詞,非自立,副詞可能,*,*,*,うち,ウチ,ウチ
EOS
```

## Build & Install
*`yoin` is available on [crates.io](https://crates.io)*
### CLI
```sh
:) $ cargo install yoin
# or
:) $ git clone https://github.com/agatan/yoin
:) $ cd yoin && cargo install
```

### Library
yoin can be included in your Cargo project like this:
```toml
[dependencies]
yoin = "*"
```

and write your code like this:
```rust
extern crate yoin;
```

## Usage - CLI
By default, `yoin` reads lines from stdin, analyzes each line, and outputs results.
```sh
:) $ yoin
すもももももももものうち
すもも 名詞,一般,*,*,*,*,すもも,スモモ,スモモ
も 助詞,係助詞,*,*,*,*,も,モ,モ
もも 名詞,一般,*,*,*,*,もも,モモ,モモ
も 助詞,係助詞,*,*,*,*,も,モ,モ
もも 名詞,一般,*,*,*,*,もも,モモ,モモ
の 助詞,連体化,*,*,*,*,の,ノ,ノ
うち 名詞,非自立,副詞可能,*,*,*,うち,ウチ,ウチ
EOS
そこではなしは終わりになった
そこで 接続詞,*,*,*,*,*,そこで,ソコデ,ソコデ
はなし 名詞,一般,*,*,*,*,はなし,ハナシ,ハナシ
は 助詞,係助詞,*,*,*,*,は,ハ,ワ
終わり 動詞,自立,*,*,五段・ラ行,連用形,終わる,オワリ,オワリ
に 助詞,格助詞,一般,*,*,*,に,ニ,ニ
なっ 動詞,自立,*,*,五段・ラ行,連用タ接続,なる,ナッ,ナッ
た 助動詞,*,*,*,特殊・タ,基本形,た,タ,タ
EOS
```

Or read the input from a file:
```sh
:) $ cat input.txt
すもももももももものうち
:) $ yoin --file input.txt
すもも 名詞,一般,*,*,*,*,すもも,スモモ,スモモ
も 助詞,係助詞,*,*,*,*,も,モ,モ
もも 名詞,一般,*,*,*,*,もも,モモ,モモ
も 助詞,係助詞,*,*,*,*,も,モ,モ
もも 名詞,一般,*,*,*,*,もも,モモ,モモ
の 助詞,連体化,*,*,*,*,の,ノ,ノ
うち 名詞,非自立,副詞可能,*,*,*,うち,ウチ,ウチ
EOS
```

## LICENSE
This software is under the MIT License and contains the MeCab-ipadic model.
See `LICENSE` and `NOTICE.txt` for more details.
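
## Usage - Parsing CLI output

The CLI output shown in the usage examples has one morpheme per line: the surface form, whitespace, then a comma-separated feature list, with `EOS` closing each sentence. The following is a minimal sketch of reading that text format back into Rust values. It does not use the `yoin` library API; the `Morpheme` struct and `parse_line` function are hypothetical names introduced here for illustration, and the sketch assumes the surface form itself contains no whitespace.

```rust
/// One morpheme taken from a line of `yoin` CLI output.
/// (Hypothetical helper type, not part of the `yoin` crate.)
#[derive(Debug, PartialEq)]
struct Morpheme {
    surface: String,
    features: Vec<String>,
}

/// Parses one output line.
/// Returns `None` for `EOS`, blank lines, and lines without a feature list.
fn parse_line(line: &str) -> Option<Morpheme> {
    let line = line.trim();
    if line.is_empty() || line == "EOS" {
        return None;
    }
    // Split at the first whitespace character (space or tab).
    let (surface, rest) = line.split_once(char::is_whitespace)?;
    let features = rest.trim().split(',').map(str::to_string).collect();
    Some(Morpheme {
        surface: surface.to_string(),
        features,
    })
}

fn main() {
    // Two lines copied from the example output above, followed by EOS.
    let output = "すもも 名詞,一般,*,*,*,*,すもも,スモモ,スモモ\n\
                  も 助詞,係助詞,*,*,*,*,も,モ,モ\n\
                  EOS\n";
    for m in output.lines().filter_map(parse_line) {
        // In the feature layout shown above, index 0 is the part of speech
        // and index 6 is the base form.
        println!("{} -> {} (base form: {})", m.surface, m.features[0], m.features[6]);
    }
}
```

Indexing `features[6]` is safe only for lines that carry the full nine-field feature list shown in the examples; use `features.get(6)` if shorter lists are possible in your input.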