https://github.com/tonytonyjan/tjngram
N-gram generator in Ruby, supporting English, Chinese, Japanese, and Korean.
- Host: GitHub
- URL: https://github.com/tonytonyjan/tjngram
- Owner: tonytonyjan
- Created: 2012-06-06T08:41:08.000Z (over 12 years ago)
- Default Branch: master
- Last Pushed: 2012-06-06T09:27:46.000Z (over 12 years ago)
- Last Synced: 2024-10-11T13:07:20.264Z (2 months ago)
- Language: Ruby
- Size: 93.8 KB
- Stars: 4
- Watchers: 3
- Forks: 3
- Open Issues: 0
Metadata Files:
- Readme: README.md
# TJNGram
Chinese, Japanese, and Korean articles often contain some English, but few n-gram libraries can parse this kind of mixed-language text. TJNGram was made to solve this problem.
## Install
gem install tjngram
## Usage
require 'tjngram'
text = < {"一個"=>2, "これ"=>2, "is an"=>2, ...}
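To illustrate the general idea behind mixed-language n-grams, here is a minimal sketch. It is not TJNGram's implementation or API: the method name `mixed_ngrams` and both regular expressions are assumptions made for this example. It counts word-level n-grams inside Latin-script runs and character-level n-grams inside CJK runs, and treats punctuation as a boundary.

```ruby
# encoding: utf-8

# Hypothetical sketch, NOT the tjngram gem's API.
# A run of Chinese, Japanese, or Korean characters.
CJK_RUN   = /[\p{Han}\p{Hiragana}\p{Katakana}\p{Hangul}]+/
# A run of ASCII words separated only by spaces or tabs.
LATIN_RUN = /[A-Za-z0-9']+(?:[ \t]+[A-Za-z0-9']+)*/

# Returns a Hash mapping each n-gram to the number of times it occurs.
def mixed_ngrams(text, n = 2)
  counts = Hash.new(0)
  text.scan(Regexp.union(CJK_RUN, LATIN_RUN)) do |run|
    if run =~ /\A[A-Za-z0-9']/
      # Latin run: n-grams over words, joined with a space.
      run.split.each_cons(n) { |words| counts[words.join(' ')] += 1 }
    else
      # CJK run: n-grams over individual characters, joined directly.
      run.each_char.each_cons(n) { |chars| counts[chars.join] += 1 }
    end
  end
  counts
end

sample = "This is an apple. 這是一個蘋果, this is an orange. 這是一個橘子"
counts = mixed_ngrams(sample)
puts counts["is an"]  # 2
puts counts["一個"]    # 2
```

Splitting the text into script runs first keeps n-grams from spanning an English/CJK boundary or a punctuation mark. Whether TJNGram handles those boundaries the same way is not stated in this README.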
## Note
If your file is UTF-8 encoded, please run Ruby with the following option:
ruby -Ku example.rb
It is strongly recommended that you make all your script files UTF-8 encoded.
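On Ruby 1.9 and later, a magic comment on the first line of a script is another way to declare its source encoding, and Ruby 2.0 and later treat source files as UTF-8 by default. The snippet below is a small sketch for checking which encoding a script is using. The sample string is an assumption for illustration only.

```ruby
# encoding: utf-8

# __ENCODING__ reports the encoding Ruby assumed for this source file.
source_encoding = __ENCODING__

# String literals take on the source encoding.
text = "一個 apple"

puts source_encoding
puts text.encoding
puts text.valid_encoding?
```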