{"id":16747909,"url":"https://github.com/amake/script_detector_2","last_synced_at":"2025-03-16T03:10:44.211Z","repository":{"id":56894542,"uuid":"398798017","full_name":"amake/script_detector_2","owner":"amake","description":"A simple utility for determining whether a string is Japanese, Simplified Chinese, Traditional Chinese, or Korean","archived":false,"fork":false,"pushed_at":"2023-02-27T12:02:17.000Z","size":140,"stargazers_count":0,"open_issues_count":0,"forks_count":1,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-02-23T16:52:28.816Z","etag":null,"topics":["cjk","detector","ruby","ruby-gem","script"],"latest_commit_sha":null,"homepage":"https://rubygems.org/gems/script_detector_2","language":"Ruby","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/amake.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2021-08-22T12:47:22.000Z","updated_at":"2021-11-25T22:59:53.000Z","dependencies_parsed_at":"2022-08-21T01:20:30.287Z","dependency_job_id":null,"html_url":"https://github.com/amake/script_detector_2","commit_stats":null,"previous_names":[],"tags_count":5,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/amake%2Fscript_detector_2","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/amake%2Fscript_detector_2/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/amake%2Fscript_detector_2/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/amake%2Fscript_detector_2/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/amake","download_url":"https://codeload.github.com/amake/script_detector_2/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":243818205,"owners_count":20352629,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cjk","detector","ruby","ruby-gem","script"],"created_at":"2024-10-13T02:11:14.899Z","updated_at":"2025-03-16T03:10:44.194Z","avatar_url":"https://github.com/amake.png","language":"Ruby","funding_links":[],"categories":[],"sub_categories":[],"readme":"# ScriptDetector2\n\nA simple utility for determining whether a string is Japanese, Simplified\nChinese, Traditional Chinese, or Korean. It is intended to be a more accurate\nand more performant alternative to the [script_detector\ngem](https://rubygems.org/gems/script_detector).\n\nUnlike the original script_detector, this gem:\n\n- Is optimized to reduce temporary garbage in favor of some constant memory\n  usage\n- Uses the\n  [kUnihanCore2020](https://www.unicode.org/reports/tr38/#kUnihanCore2020)\n  property of the Unicode Unihan database to determine which characters belong\n  to which script (Unicode 14)\n  ([details](http://www.unicode.org/L2/L2019/19388-unihan-core-2020.pdf))\n- Uses [ISO 15924 script names](https://en.wikipedia.org/wiki/ISO_15924) in\n  symbol form as return values (instead of English strings)\n\n## Installation\n\nAdd this line to your application's Gemfile:\n\n```ruby\ngem 'script_detector_2'\n```\n\nAnd then execute:\n\n    $ bundle install\n\nOr install it yourself as:\n\n    $ gem install script_detector_2\n\n## Usage\n\nThe main detection methods are:\n\n- `ScriptDetector2.japanese?`\n- `ScriptDetector2.chinese?`\n- `ScriptDetector2.simplified_chinese?`\n- `ScriptDetector2.traditional_chinese?`\n- `ScriptDetector2.identify_script`\n- `ScriptDetector2.identify_scripts`\n\nRegexp patterns are used to identify the script to which Han characters belong.\nThese can be used directly as well:\n\n- `ScriptDetector2::JAPANESE_PATTERN`: matches all Han characters in the\n  kUnihanCore2020 set marked as Japanese (J)\n- `ScriptDetector2::SIMPLIFIED_CHINESE_PATTERN`: matches all Han characters in\n  the kUnihanCore2020 set marked as PRC (G)\n- `ScriptDetector2::TRADITIONAL_CHINESE_PATTERN`: matches all Han characters in\n  the kUnihanCore2020 set marked as Hong Kong (H), Macau (M), or ROC (T)\n- `ScriptDetector2::KOREAN_PATTERN`: matches all Han characters in the\n  kUnihanCore2020 set marked as ROK (K) or DPRK (P)\n\nEach of the above patterns matches an entire string containing only Han\ncharacters of the indicated script, i.e.\n\n```ruby\nScriptDetector2::JAPANESE_PATTERN.match?('日本語') # =\u003e true\nScriptDetector2::JAPANESE_PATTERN.match?('你好') # =\u003e false\nScriptDetector2::JAPANESE_PATTERN.match?('Hello 日本語') # =\u003e false\n```\n\nTo recreate the script_detector gem's extension of the String class, use the\nsupplied refinement like so:\n\n```ruby\nusing ScriptDetector2::StringUtil\n```\n\nThen you can do:\n\n```ruby\n'こんにちは、世界！'.japanese? # =\u003e true\n```\n\n## Development\n\nAfter checking out the repo, run `bin/setup` to install dependencies. Then, run\n`rake test` to run the tests. You can also run `bin/console` for an interactive\nprompt that will allow you to experiment.\n\nTo install this gem onto your local machine, run `bundle exec rake install`. To\nrelease a new version, update the version number in `version.rb`, and then run\n`bundle exec rake release`, which will create a git tag for the version, push\ngit commits and the created tag, and push the `.gem` file to\n[rubygems.org](https://rubygems.org).\n\n## Contributing\n\nBug reports and pull requests are welcome on GitHub at\nhttps://github.com/amake/script_detector_2.\n\n## License\n\nThe gem is available as open source under the terms of the [MIT\nLicense](https://opensource.org/licenses/MIT).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Famake%2Fscript_detector_2","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Famake%2Fscript_detector_2","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Famake%2Fscript_detector_2/lists"}