https://github.com/tonytonyjan/tjngram

N-Gram generator in Ruby, supporting English, Chinese, Japanese and Korean.

# TJNGram

Chinese, Japanese, and Korean articles often contain some English, but n-gram libraries that can parse this kind of mixed-language text are uncommon. TJNGram was made to solve this problem.

## Install

    gem install tjngram

## Usage

    require 'tjngram'

    text = ...  # heredoc of mixed Chinese, Japanese, and English text (elided)
    ...         # n-gram call (elided)
    # => {"一個"=>2, "これ"=>2, "is an"=>2, ...}
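The result shown above maps each n-gram to the number of times it occurs, with CJK n-grams built from characters and English n-grams built from words. The sketch below illustrates one way such counting could work. It is not TJNGram's implementation or API: the helper `mixed_ngrams`, the `CJK` constant, and the rule of splitting text into CJK runs and Latin word runs are all assumptions made for illustration.

```ruby
# Hypothetical sketch, NOT part of the tjngram gem.
# Assumption: CJK text is tokenized per character, Latin text per word.
CJK = /[\p{Han}\p{Hiragana}\p{Katakana}\p{Hangul}]/

def mixed_ngrams(text, n = 2)
  counts = Hash.new(0)
  # A run is either consecutive CJK characters or Latin words separated by
  # spaces/tabs. Punctuation ends a run, so n-grams never cross it.
  text.scan(/#{CJK}+|[A-Za-z']+(?:[ \t]+[A-Za-z']+)*/) do |run|
    if run.match?(CJK)
      tokens = run.chars   # one token per CJK character
      joiner = ""
    else
      tokens = run.split   # one token per Latin word
      joiner = " "
    end
    tokens.each_cons(n) { |gram| counts[gram.join(joiner)] += 1 }
  end
  counts
end

text = "This is an apple. That is an orange. 這是一個蘋果，那是一個橘子。"
counts = mixed_ngrams(text, 2)
puts counts["is an"]
puts counts["一個"]
```

Keeping CJK and Latin runs separate means no n-gram mixes the two scripts, which is a deliberate simplification in this sketch.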

## Note

If your file is UTF-8 encoded, run Ruby with the following option:

    ruby -Ku example.rb

It is strongly recommended that you make all your script files UTF-8 encoded.
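As a sketch of an alternative to the command-line option: on Ruby 1.9 and later, a magic comment on the first line of a script declares its source encoding, and from Ruby 2.0 onward UTF-8 is the default source encoding. The snippet below only checks how Ruby sees a string literal and does not use TJNGram.

```ruby
# encoding: utf-8
# The magic comment above must be the first line of the file to take effect.

text = "這是一個蘋果"

puts text.encoding          # encoding Ruby assigned to the literal
puts text.valid_encoding?   # whether the bytes are valid in that encoding
puts text.chars.length      # number of characters, not bytes
```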