https://github.com/shzxcv/normalize_text
japanese japanese-language mecab mecab-ipadic-neologd normalization ruby
- Host: GitHub
- URL: https://github.com/shzxcv/normalize_text
- Owner: shzxcv
- License: mit
- Created: 2021-09-29T13:32:00.000Z (over 4 years ago)
- Default Branch: main
- Last Pushed: 2021-09-29T16:22:03.000Z (over 4 years ago)
- Last Synced: 2026-01-13T06:59:07.733Z (6 months ago)
- Topics: japanese, japanese-language, mecab, mecab-ipadic-neologd, normalization, ruby
- Language: Ruby
- Homepage:
- Size: 22.5 KB
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 2
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE.txt
- Code of conduct: CODE_OF_CONDUCT.md
# NormalizeText
This gem normalizes Japanese text.
Normalization follows the rules recommended by mecab-ipadic-neologd, plus some additions of my own:
https://github.com/neologd/mecab-ipadic-neologd/wiki/Regexp.ja
## Installation
Add this line to your application's Gemfile:
```ruby
gem 'normalize_text'
```
And then execute:

    $ bundle install

Or install it yourself as:

    $ gem install normalize_text

## Usage
```ruby
require 'normalize_text'

'検索 エンジン 自作 入門 を 買い ました!!!'.normalize
# => "検索エンジン自作入門を買いました!!!"
' PRML 副 読 本 '.normalize
# => "PRML副読本"
'南アルプスの 天然水 Sparking Lemon レモン一絞り'.normalize
# => "南アルプスの天然水Sparking Lemonレモン一絞り"
```
For other normalization rules, see the spec file:
https://github.com/sho-jp/normalize_text/blob/main/spec/normalize_text_spec.rb
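To give a rough feel for the kind of rules involved, here is a minimal sketch of three neologd-style steps: converting full-width alphanumerics to half-width, collapsing repeated spaces, and dropping spaces that touch a Japanese character. The `NeologdSketch` module and its `JA` constant are hypothetical names made up for this illustration. This is not the gem's actual implementation, and it covers only a small subset of the rules.

```ruby
# Hypothetical module for illustration only; not part of normalize_text.
module NeologdSketch
  # Characters treated as "Japanese" here: hiragana, katakana, kanji,
  # and the prolonged sound mark (an assumption of this sketch).
  JA = '\p{Hiragana}\p{Katakana}\p{Han}ー'

  def self.normalize(text)
    # Full-width alphanumerics -> half-width ASCII.
    s = text.tr('０-９Ａ-Ｚａ-ｚ', '0-9A-Za-z')
    # Collapse runs of ASCII / ideographic spaces, then trim the ends.
    s = s.gsub(/[ \u3000]+/, ' ').strip
    # Remove a space when a Japanese character sits on either side of it;
    # spaces between two Latin words are kept.
    s.gsub(/(?<=[#{JA}]) +| +(?=[#{JA}])/, '')
  end
end

puts NeologdSketch.normalize('ＰＲＭＬ　副　読　本')
# prints: PRML副読本
puts NeologdSketch.normalize('南アルプスの 天然水 Sparking Lemon レモン一絞り')
# prints: 南アルプスの天然水Sparking Lemonレモン一絞り
```

Note that the space inside `Sparking Lemon` survives because neither neighbor is a Japanese character, which matches the behavior shown in the usage examples above.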
## Development
After checking out the repo, run `bin/setup` to install dependencies. Then, run `rake spec` to run the tests. You can also run `bin/console` for an interactive prompt that will allow you to experiment.
To install this gem onto your local machine, run `bundle exec rake install`. To release a new version, update the version number in `version.rb`, and then run `bundle exec rake release`, which will create a git tag for the version, push git commits and the created tag, and push the `.gem` file to [rubygems.org](https://rubygems.org).
## Contributing
Bug reports and pull requests are welcome on GitHub at https://github.com/sho-jp/normalize_text. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the [code of conduct](https://github.com/[USERNAME]/normalize_text/blob/master/CODE_OF_CONDUCT.md).
## License
The gem is available as open source under the terms of the [MIT License](https://opensource.org/licenses/MIT).
## Code of Conduct
Everyone interacting in the NormalizeText project's codebases, issue trackers, chat rooms and mailing lists is expected to follow the [code of conduct](https://github.com/[USERNAME]/normalize_text/blob/master/CODE_OF_CONDUCT.md).