https://github.com/paithiov909/rjavacmecab
rJava Interface to CMeCab
mecab r r-package rjava
- Host: GitHub
- URL: https://github.com/paithiov909/rjavacmecab
- Owner: paithiov909
- License: bsd-3-clause
- Created: 2019-11-25T18:10:14.000Z (almost 6 years ago)
- Default Branch: main
- Last Pushed: 2023-01-09T00:09:39.000Z (over 2 years ago)
- Last Synced: 2023-03-04T02:38:42.919Z (over 2 years ago)
- Topics: mecab, r, r-package, rjava
- Language: R
- Homepage: https://paithiov909.github.io/rjavacmecab/
- Size: 12.6 MB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.Rmd
- License: LICENSE
README
---
output: github_document
---

```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
pkgload::load_all()
```

[Lifecycle: superseded](https://lifecycle.r-lib.org/articles/stages.html#superseded)
[Check workflow](https://github.com/paithiov909/rjavacmecab/actions/workflows/check.yml)
[Codecov](https://app.codecov.io/gh/paithiov909/rjavacmecab?branch=main)

> rJava Interface to CMeCab

rjavacmecab is an rJava interface to [takscape/cmecab-java](https://github.com/takscape/cmecab-java), a Java binding for MeCab.
The goal of this package is to provide a simpler way to use 'MeCab' from R than the alternatives ([RMeCab](https://github.com/IshidaMotohiro/RMeCab) and [RcppMeCab](https://github.com/junhewk/RcppMeCab)).
rjavacmecab is slower than those packages, but it should be easier to use because:
1. There is no need to build from C/C++ source.
2. It returns all features of each node accessible via cmecab-java.

## System Requirements

rjavacmecab requires 'MeCab' (mecab, libmecab-dev and mecab-ipadic-utf8) and a JDK. Please make sure they are installed and available before you use rjavacmecab.
On Windows, the libmecab build must match the bitness of your R and JDK: a 32-bit R and JDK need a 32-bit libmecab, and a 64-bit R and JDK need a 64-bit libmecab.
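
As a rough pre-flight check for the requirements above, the following base R sketch looks for the `mecab` and `java` executables on the `PATH` and reports the bitness of the running R session. It assumes both tools are exposed under those executable names, which is not guaranteed on every system, and finding them on the `PATH` does not by itself prove that libmecab or the JDK is usable from rJava.

```r
# Sketch only: being on the PATH is a rough proxy for a working installation.
# The executable names "mecab" and "java" are assumptions about your setup.
required <- c("mecab", "java")
paths <- Sys.which(required)

# Sys.which() returns "" for executables it cannot find.
not_found <- required[is.na(paths) | !nzchar(paths)]

if (length(not_found) > 0) {
  message("Not found on PATH: ", paste(not_found, collapse = ", "))
} else {
  message("Found: ", paste(paths, collapse = ", "))
}

# Bitness of the running R session, for comparison with libmecab on Windows.
r_bits <- 8L * .Machine$sizeof.pointer
message("R is running as a ", r_bits, "-bit process")
```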
## Usage
### Installation
``` r
remotes::install_github("paithiov909/rjavacmecab")
```

### Call Tagger

To make the cmecab tagger available, call `rebuild_tagger` first.

```{r cmecab_1}
rjavacmecab::rebuild_tagger()
res <- rjavacmecab::cmecab(c("長期的自己実現で福楽は得られない", "幸せは刹那の中にあり"))
str(res)
```

### Prettify Output

```{r cmecab_2}
res <- rjavacmecab::prettify(res)
str(res)
```

If you use an IPA-style dictionary, the output has these columns.

- doc_id: sentence number
- token: surface form
- POS1~POS4: part of speech, followed by part-of-speech subcategories 1 to 3
- X5StageUse1: conjugation type (e.g. 五段, 下二段...)
- X5StageUse2: conjugation form (e.g. 連用形, 基本形...)
- Original: base form (lemmatised form)
- Yomi1: reading
- Yomi2: pronunciation

### Pack Output

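
Before packing, it can help to see how the columns listed above are typically used. The base R sketch below works on a small hand-written data frame, not on real rjavacmecab output. It keeps only nouns via `POS1` and then joins the tokens of each `doc_id` into one space-separated string. Treating packing as this kind of per-document collapse is an assumption made for illustration; the actual `pack` function may differ in its arguments and return shape.

```r
# Hand-written stand-in for prettified output (hypothetical data, not produced
# by rjavacmecab). Only a subset of the columns is included.
tokens <- data.frame(
  doc_id = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L),
  token = c("幸せ", "は", "刹那", "の", "中", "に", "あり", "猫", "が", "鳴く"),
  POS1 = c("名詞", "助詞", "名詞", "助詞", "名詞", "助詞", "動詞",
           "名詞", "助詞", "動詞"),
  stringsAsFactors = FALSE
)

# Keep only nouns (POS1 == "名詞").
nouns <- tokens[tokens$POS1 == "名詞", ]

# Join the tokens of each document into one space-separated string.
packed <- vapply(
  split(tokens$token, tokens$doc_id),
  paste,
  character(1),
  collapse = " "
)
print(packed)
```

Filtering before collapsing, for example by passing `nouns$token` and `nouns$doc_id` to `split`, would give one string of nouns per document instead.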
```{r cmecab_3}
res <- rjavacmecab::pack(res)
print(res)
```

### Use Igo

[Igo](http://igo.osdn.jp/) is a pure Java port of MeCab. rjavacmecab also provides a wrapper function for it.

```{r igo}
res <- rjavacmecab::igo("お前がそう思うんならそうなんだろう、お前ん中ではな")
str(res)
```

## License

BSD 3-clause License.
This software includes works that are distributed in the public domain and under the New BSD License.
See https://github.com/takscape/cmecab-java/blob/master/README.txt for more details.

Icons made by [Vectors Market](https://www.flaticon.com/authors/vectors-market) from [Flaticon](https://www.flaticon.com/).