{"id":13413542,"url":"https://github.com/ikawaha/kagome","last_synced_at":"2026-03-03T16:12:52.148Z","repository":{"id":18139878,"uuid":"21228174","full_name":"ikawaha/kagome","owner":"ikawaha","description":"Self-contained Japanese Morphological Analyzer written in pure Go","archived":false,"fork":false,"pushed_at":"2026-02-24T00:32:28.000Z","size":745754,"stargazers_count":944,"open_issues_count":1,"forks_count":57,"subscribers_count":21,"default_branch":"v2","last_synced_at":"2026-02-24T07:35:28.698Z","etag":null,"topics":["hacktoberfest","japanese","japanese-language","korean","morphological-analysis","nlp-library","pos-tagging","segmentation","tokenizer"],"latest_commit_sha":null,"homepage":"","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ikawaha.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null},"funding":{"github":"ikawaha"}},"created_at":"2014-06-26T04:38:13.000Z","updated_at":"2026-02-24T00:32:31.000Z","dependencies_parsed_at":"2025-02-22T17:40:22.947Z","dependency_job_id":"d3d4a79a-d5d9-4c9e-b486-f3a065c43d62","html_url":"https://github.com/ikawaha/kagome","commit_stats":{"total_commits":623,"total_committers":16,"mean_commits":38.9375,"dds":0.1284109149277689,"last_synced_commit":"70cdd0b5a77e1c5e60a33d203fe08cdf9891aec1"},"previous_names":[],"tags_count":87,"template":false,"template_full_name":null,"purl":"pkg:github/ikawaha/kagome","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/reposito
ries/ikawaha%2Fkagome","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ikawaha%2Fkagome/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ikawaha%2Fkagome/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ikawaha%2Fkagome/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ikawaha","download_url":"https://codeload.github.com/ikawaha/kagome/tar.gz/refs/heads/v2","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ikawaha%2Fkagome/sbom","scorecard":{"id":483596,"data":{"date":"2025-08-11","repo":{"name":"github.com/ikawaha/kagome","commit":"3170de94c6faeb52cabe4f43e8b8270062c3e6e0"},"scorecard":{"version":"v5.2.1-40-gf6ed084d","commit":"f6ed084d17c9236477efd66e5b258b9d4cc7b389"},"score":5.6,"checks":[{"name":"Dangerous-Workflow","score":10,"reason":"no dangerous workflow patterns detected","details":null,"documentation":{"short":"Determines if the project's GitHub Action workflows avoid dangerous patterns.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#dangerous-workflow"}},{"name":"Token-Permissions","score":0,"reason":"detected GitHub workflow tokens with excessive permissions","details":["Warn: no topLevel permission defined: .github/workflows/go.yml:1","Warn: no topLevel permission defined: .github/workflows/release.yml:1","Warn: no topLevel permission defined: .github/workflows/reviewdog-golangci-lint.yml:1","Info: no jobLevel write permissions found"],"documentation":{"short":"Determines if the project's workflows follow the principle of least privilege.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#token-permissions"}},{"name":"Code-Review","score":3,"reason":"Found 3/9 approved changesets -- score normalized to 3","details":null,"documentation":{"short":"Determines if the project requires human code review before 
pull requests (aka merge requests) are merged.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#code-review"}},{"name":"Maintained","score":6,"reason":"6 commit(s) and 2 issue activity found in the last 90 days -- score normalized to 6","details":null,"documentation":{"short":"Determines if the project is \"actively maintained\".","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#maintained"}},{"name":"Binary-Artifacts","score":9,"reason":"binaries present in source code","details":["Warn: binary detected: docs/kagome.wasm:1"],"documentation":{"short":"Determines if the project has generated executable (binary) artifacts in the source repository.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#binary-artifacts"}},{"name":"CII-Best-Practices","score":0,"reason":"no effort to earn an OpenSSF best practices badge detected","details":null,"documentation":{"short":"Determines if the project has an OpenSSF (formerly CII) Best Practices Badge.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#cii-best-practices"}},{"name":"Pinned-Dependencies","score":8,"reason":"dependency not pinned by hash detected -- score normalized to 8","details":["Warn: goCommand not pinned by hash: .github/workflows/go.yml:62","Info:   7 out of   7 GitHub-owned GitHubAction dependencies pinned","Info:   7 out of   7 third-party GitHubAction dependencies pinned","Info:   0 out of   1 goCommand dependencies pinned"],"documentation":{"short":"Determines if the project has declared and pinned the dependencies of its build process.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#pinned-dependencies"}},{"name":"Security-Policy","score":0,"reason":"security policy file not detected","details":["Warn: no security policy file 
detected","Warn: no security file to analyze","Warn: no security file to analyze","Warn: no security file to analyze"],"documentation":{"short":"Determines if the project has published a security policy.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#security-policy"}},{"name":"Fuzzing","score":10,"reason":"project is fuzzed","details":["Info: GoBuiltInFuzzer integration found: tokenizer/lattice/mem/pool_test.go:95"],"documentation":{"short":"Determines if the project uses fuzzing.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#fuzzing"}},{"name":"License","score":10,"reason":"license file detected","details":["Info: project has a license file: LICENSE:0","Info: FSF or OSI recognized license: MIT License: LICENSE:0"],"documentation":{"short":"Determines if the project has defined a license.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#license"}},{"name":"Packaging","score":10,"reason":"packaging workflow detected","details":["Info: Project packages its releases by way of GitHub Actions.: .github/workflows/release.yml:10"],"documentation":{"short":"Determines if the project is published as a package that others can easily download, install, easily update, and uninstall.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#packaging"}},{"name":"Branch-Protection","score":-1,"reason":"internal error: error during branchesHandler.setup: internal error: githubv4.Query: Resource not accessible by integration","details":null,"documentation":{"short":"Determines if the default and release branches are protected with GitHub's branch protection settings.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#branch-protection"}},{"name":"Signed-Releases","score":0,"reason":"Project has not signed or included 
provenance with any releases.","details":["Warn: release artifact v2.10.2 not signed: https://api.github.com/repos/ikawaha/kagome/releases/209613569","Warn: release artifact v2.10.1 not signed: https://api.github.com/repos/ikawaha/kagome/releases/205269251","Warn: release artifact v2.10.0 not signed: https://api.github.com/repos/ikawaha/kagome/releases/168539817","Warn: release artifact v2.9.11 not signed: https://api.github.com/repos/ikawaha/kagome/releases/160695916","Warn: release artifact v2.9.10 not signed: https://api.github.com/repos/ikawaha/kagome/releases/160695145","Warn: release artifact v2.10.2 does not have provenance: https://api.github.com/repos/ikawaha/kagome/releases/209613569","Warn: release artifact v2.10.1 does not have provenance: https://api.github.com/repos/ikawaha/kagome/releases/205269251","Warn: release artifact v2.10.0 does not have provenance: https://api.github.com/repos/ikawaha/kagome/releases/168539817","Warn: release artifact v2.9.11 does not have provenance: https://api.github.com/repos/ikawaha/kagome/releases/160695916","Warn: release artifact v2.9.10 does not have provenance: https://api.github.com/repos/ikawaha/kagome/releases/160695145"],"documentation":{"short":"Determines if the project cryptographically signs release artifacts.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#signed-releases"}},{"name":"Vulnerabilities","score":10,"reason":"0 existing vulnerabilities detected","details":null,"documentation":{"short":"Determines if the project has open, known unfixed vulnerabilities.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#vulnerabilities"}},{"name":"SAST","score":0,"reason":"SAST tool is not run on all commits -- score normalized to 0","details":["Warn: 0 commits out of 29 are checked with a SAST tool"],"documentation":{"short":"Determines if the project uses static code 
analysis.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#sast"}}]},"last_synced_at":"2025-08-19T17:14:33.597Z","repository_id":18139878,"created_at":"2025-08-19T17:14:33.597Z","updated_at":"2025-08-19T17:14:33.597Z"},"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":30051296,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-03T15:26:47.567Z","status":"ssl_error","status_checked_at":"2026-03-03T15:26:17.132Z","response_time":61,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["hacktoberfest","japanese","japanese-language","korean","morphological-analysis","nlp-library","pos-tagging","segmentation","tokenizer"],"created_at":"2024-07-30T20:01:42.799Z","updated_at":"2026-03-03T16:12:52.141Z","avatar_url":"https://github.com/ikawaha.png","language":"Go","funding_links":["https://github.com/sponsors/ikawaha"],"categories":["Natural Language Processing","Go","自然语言处理","Bot Building","Relational Databases","Microsoft Office","Libraries"],"sub_categories":["Morphological Analyzers","形态分析","Strings","交流","Uncategorized","暂未分类这些库被放在这里是因为其他类别似乎都不适合。","Books","Advanced Console 
UIs","暂未分类"],"readme":"[![GoDev](https://pkg.go.dev/badge/github.com/ikawaha/kagome/v2)](https://pkg.go.dev/github.com/ikawaha/kagome/v2)\n[![Go](https://github.com/ikawaha/kagome/workflows/Go/badge.svg)](https://github.com/ikawaha/kagome/actions?query=workflow%3AGo)\n[![Release](https://github.com/ikawaha/kagome/actions/workflows/release.yml/badge.svg?branch=)](https://github.com/ikawaha/kagome/actions/workflows/release.yml)\n[![Coverage Status](https://coveralls.io/repos/github/ikawaha/kagome/badge.svg?branch=v2)](https://coveralls.io/github/ikawaha/kagome?branch=v2)\n[![Docker Pulls](https://img.shields.io/docker/pulls/ikawaha/kagome.svg?style)](https://hub.docker.com/r/ikawaha/kagome/)\n\n# Kagome v2\n\nKagome is an open source Japanese morphological analyzer written in pure Go. It can tokenize Japanese text into words and analyze parts of speech, with dictionaries embedded in the binary for easy deployment.\n\n\u003e [!NOTE]\n\u003e **Key features** (Improvements from [v1](https://github.com/ikawaha/kagome/tree/master)):\n\u003e\n\u003e * Self-contained binaries with embedded dictionaries (MeCab-IPADIC, UniDic)\n\u003e * Multiple segmentation modes for different use cases\n\u003e * RESTful API server mode for production use\n\u003e * WebAssembly support for browser environments\n\u003e * C library API for FFI integration (Python, PHP, and other languages)\n\n## Index\n\n* [Basic Usage](#basic-usage)\n  * [Command line](#command-line)\n  * [As a Go library](#as-a-go-library)\n  * [As a C library](#as-a-c-library)\n  * [More examples](#more-examples)\n* [Install](#install)\n* [Commands](#commands)\n  * [Tokenize command](#tokenize-command)\n  * [Server command](#server-command)\n    * [RESTful API](#restful-api)\n    * [Web App](#web-app)\n  * [Lattice command](#lattice-command)\n  * [Sentence command](#sentence-command)\n* [Dictionaries](#dictionaries)\n* [Segmentation modes](#segmentation-modes)\n* [Docker](#docker)\n* [WebAssembly](#webassembly)\n* [Use from 
other languages (FFI)](#use-from-other-languages-ffi)\n* [Reference](#reference)\n* [License](#license)\n\n## Basic Usage\n\n### Command line\n\n```shellsession\n% kagome -h\nJapanese Morphological Analyzer -- github.com/ikawaha/kagome/v2\nusage: kagome \u003ccommand\u003e\nThe commands are:\n   [tokenize] - command line tokenize (*default)\n   server - run tokenize server\n   lattice - lattice viewer\n   sentence - tiny sentence splitter\n   version - show version\n\ntokenize [-file input_file] [-dict dic_file] [-userdict user_dic_file] [-sysdict (ipa|uni)] [-simple false] [-mode (normal|search|extended)] [-split] [-json]\n  -dict string\n    \tdict\n  -file string\n    \tinput file\n  -json\n    \toutputs in JSON format\n  -mode string\n    \ttokenize mode (normal|search|extended) (default \"normal\")\n  -simple\n    \tdisplay abbreviated dictionary contents\n  -split\n    \tuse tiny sentence splitter\n  -sysdict string\n    \tsystem dict type (ipa|uni) (default \"ipa\")\n  -udict string\n    \tuser dict\n```\n\n```shellsession\n% # piped standard input\n% echo \"すもももももももものうち\" | kagome\nすもも\t名詞,一般,*,*,*,*,すもも,スモモ,スモモ\nも\t助詞,係助詞,*,*,*,*,も,モ,モ\nもも\t名詞,一般,*,*,*,*,もも,モモ,モモ\nも\t助詞,係助詞,*,*,*,*,も,モ,モ\nもも\t名詞,一般,*,*,*,*,もも,モモ,モモ\nの\t助詞,連体化,*,*,*,*,の,ノ,ノ\nうち\t名詞,非自立,副詞可能,*,*,*,うち,ウチ,ウチ\nEOS\n```\n\n* For more details, see the [Commands section](#commands).\n\n### As a Go library\n\nYou can integrate Kagome into your Go applications as follows:\n\n```sh\n# Install Kagome module\ngo get github.com/ikawaha/kagome/v2\n```\n\n```Go\npackage main\n\nimport (\n  \"fmt\"\n  \"strings\"\n\n  \"github.com/ikawaha/kagome-dict/ipa\"\n  \"github.com/ikawaha/kagome/v2/tokenizer\"\n)\n\nfunc main() {\n  t, err := tokenizer.New(ipa.Dict(), tokenizer.OmitBosEos())\n  if err != nil {\n    panic(err)\n  }\n  // wakati (simple word splitting/segmentation)\n  fmt.Println(\"---wakati---\")\n  seg := t.Wakati(\"すもももももももものうち\")\n  fmt.Println(seg)\n\n  // tokenize w/ morphological analysis\n  
fmt.Println(\"---tokenize---\")\n  tokens := t.Tokenize(\"すもももももももものうち\")\n  for _, token := range tokens {\n    features := strings.Join(token.Features(), \",\")\n    fmt.Printf(\"%s\\t%v\\n\", token.Surface, features)\n  }\n}\n```\n\noutput:\n\n```shellsession\n---wakati---\n[すもも も もも も もも の うち]\n---tokenize---\nすもも\t名詞,一般,*,*,*,*,すもも,スモモ,スモモ\nも\t助詞,係助詞,*,*,*,*,も,モ,モ\nもも\t名詞,一般,*,*,*,*,もも,モモ,モモ\nも\t助詞,係助詞,*,*,*,*,も,モ,モ\nもも\t名詞,一般,*,*,*,*,もも,モモ,モモ\nの\t助詞,連体化,*,*,*,*,の,ノ,ノ\nうち\t名詞,非自立,副詞可能,*,*,*,うち,ウチ,ウチ\n```\n\n### As a C library\n\nKagome is written in pure Go but can be compiled as a C shared library and used from other languages via [FFI](https://en.wikipedia.org/wiki/Foreign_function_interface) (Foreign Function Interface).\n\nSee the \"[Use from other languages (FFI)](#use-from-other-languages-ffi)\" section below for details and examples.\n\n### More examples\n\nWe provide various examples demonstrating how to use Kagome in different scenarios:\n\n* [Examples directory](https://github.com/ikawaha/kagome/tree/v2/_examples)\n* [Examples in GoDoc](https://pkg.go.dev/github.com/ikawaha/kagome/v2)\n\n## Install\n\nTo **get the `kagome` command line tool**, choose your preferred installation method below:\n\n* **Go (recommended)**\n\n  ```shellsession\n  go install github.com/ikawaha/kagome/v2@latest\n  ```\n\n* **Homebrew**\n\n  ```shellsession\n  # macOS and Linux (for both AMD64 and Arm64)\n  brew install ikawaha/kagome/kagome\n  ```\n\n* **Manual Install**\n\n  * For manual installation, download and extract the appropriate archived file for your OS and architecture from the [releases page](https://github.com/ikawaha/kagome/releases/latest).\n  * Note that the extracted binary must be placed in an accessible directory with execution permission.\n\n* **Docker/Docker Compose**\n\n  * See the [Docker section](#docker) below\n\n## Commands\n\nMajor sub-commands of `kagome` command line tool.\n\n### Tokenize command\n\n```shellsession\n% # interactive/REPL mode\n% 
kagome\nすもももももももものうち\nすもも\t名詞,一般,*,*,*,*,すもも,スモモ,スモモ\nも\t助詞,係助詞,*,*,*,*,も,モ,モ\nもも\t名詞,一般,*,*,*,*,もも,モモ,モモ\nも\t助詞,係助詞,*,*,*,*,も,モ,モ\nもも\t名詞,一般,*,*,*,*,もも,モモ,モモ\nの\t助詞,連体化,*,*,*,*,の,ノ,ノ\nうち\t名詞,非自立,副詞可能,*,*,*,うち,ウチ,ウチ\nEOS\n```\n\n```shellsession\n% # piped standard input\n% echo \"すもももももももものうち\" | kagome\nすもも  名詞,一般,*,*,*,*,すもも,スモモ,スモモ\nも      助詞,係助詞,*,*,*,*,も,モ,モ\nもも    名詞,一般,*,*,*,*,もも,モモ,モモ\nも      助詞,係助詞,*,*,*,*,も,モ,モ\nもも    名詞,一般,*,*,*,*,もも,モモ,モモ\nの      助詞,連体化,*,*,*,*,の,ノ,ノ\nうち    名詞,非自立,副詞可能,*,*,*,うち,ウチ,ウチ\nEOS\n```\n\n```shellsession\n% # JSON output\n% # (For jq command see https://jqlang.org/)\n% echo \"猫\" | kagome -json | jq .\n[\n  {\n    \"id\": 286994,\n    \"start\": 0,\n    \"end\": 1,\n    \"surface\": \"猫\",\n    \"class\": \"KNOWN\",\n    \"pos\": [\n      \"名詞\",\n      \"一般\",\n      \"*\",\n      \"*\"\n    ],\n    \"base_form\": \"猫\",\n    \"reading\": \"ネコ\",\n    \"pronunciation\": \"ネコ\",\n    \"features\": [\n      \"名詞\",\n      \"一般\",\n      \"*\",\n      \"*\",\n      \"*\",\n      \"*\",\n      \"猫\",\n      \"ネコ\",\n      \"ネコ\"\n    ]\n  }\n]\n```\n\n```shellsession\n% # word splitting/segmentation only (equivalent to \"wakati\" functionality)\n% echo \"すもももももももものうち\" | kagome -json | jq -r '[.[].surface] | join(\"/\")'\nすもも/も/もも/も/もも/の/うち\n```\n\n```shellsession\n% # Extract only pronunciations using jq (for Text-to-Speech purposes, etc.)\n% echo \"私ははにわよわわわんわん\" | kagome -json | jq -r '.[].pronunciation'\nワタシ\nワ\nハニワ\nヨ\nワ\nワ\nワンワン\n```\n\n### Server command\n\nFor continuous usage, `kagome` provides a server mode to decouple the startup time of the tokenizer.\n\n#### RESTful API\n\nStart a server and try to access the \"/tokenize\" endpoint.\n\n```shellsession\n% kagome server \u0026\n% curl -XPUT localhost:6060/tokenize -d'{\"sentence\":\"すもももももももものうち\", \"mode\":\"normal\"}' | jq .\n```\n\n#### Web App\n\nStart a server and access `http://localhost:6060` in your browser.\n\n```shellsession\n% kagome server 
\u0026\n```\n\n\u003cvideo src=\"https://github.com/user-attachments/assets/172d8b0c-be9a-4ee5-bdff-315a0eeee85b\" controls width=\"100%\"\u003e\u003c/video\u003e\n\n\u003e [!IMPORTANT]\n\u003e The demo web application uses [Graphviz](https://graphviz.org/) to draw the lattice, so Graphviz must be installed on your system.\n\n\u003e [!TIP]\n\u003e Kagome can also be compiled to WebAssembly (wasm) and run locally in a web browser. For details, see the [WebAssembly section](#webassembly).\n\u003e\n\u003e * Wasm Demo: [https://ikawaha.github.io/kagome/](https://ikawaha.github.io/kagome/)\n\n### Lattice command\n\nA debugging tool for the tokenization process. It outputs the lattice in Graphviz DOT format.\n\n```shellsession\n% kagome lattice 私は鰻 | dot -Tpng -o lattice.png\n```\n\n![lattice](https://user-images.githubusercontent.com/4232165/89723585-74717000-da33-11ea-886a-baab85f7a06e.png)\n\n### Sentence command\n\nSplit long text into sentences:\n\n```shellsession\n% echo \"吾輩は猫である。名前はまだ無い。\" | kagome sentence\n吾輩は猫である。\n名前はまだ無い。\n```\n\nThis command is useful when a single line of input is too long and you want to avoid errors such as `bufio.Scanner: token too long`.\n\n```shellsession\n% echo \"吾輩は猫である。名前はまだ無い。\" | kagome -json | jq -r '[.[].surface] | join(\"/\")'\n吾輩/は/猫/で/ある/。/名前/は/まだ/無い/。\n\n% echo \"吾輩は猫である。名前はまだ無い。\" | kagome sentence | kagome -json | jq -r '[.[].surface] | join(\"/\")'\n吾輩/は/猫/で/ある/。\n名前/は/まだ/無い/。\n```\n\nThis command is equivalent to the `-split` option of the `tokenize` command.\n\n```shellsession\n% echo \"吾輩は猫である。名前はまだ無い。\" | kagome -split -json | jq -r '[.[].surface] | join(\"/\")'\n吾輩/は/猫/で/ある/。\n名前/は/まだ/無い/。\n```\n\n## Dictionaries\n\n* Dictionaries supported by default\n\n  |dict| source | package |\n  |:---|:---|:---|\n  |MeCab IPADIC| mecab-ipadic-2.7.0-20070801 | [github.com/ikawaha/kagome-dict/ipa](https://github.com/ikawaha/kagome-dict/tree/master/ipa)|\n  |UniDic| unidic-mecab-2.1.2_src | 
[github.com/ikawaha/kagome-dict/uni](https://github.com/ikawaha/kagome-dict/tree/master/uni) |\n\n* Experimental Features\n\n  |dict|source|package|\n  |:---|:---|:---|\n  |mecab-ipadic-NEologd|mecab-ipadic-neologd| [github.com/ikawaha/kagome-dict-ipa-neologd](https://github.com/ikawaha/kagome-dict-ipa-neologd)|\n  |Korean MeCab|mecab-ko-dic-2.1.1-20180720 | [github.com/ikawaha/kagome-dict-ko](https://github.com/ikawaha/kagome-dict-ko)|\n\n\u003e [!NOTE]\n\u003e For more details and differences between the dictionaries, see the [wiki](https://github.com/ikawaha/kagome/wiki/About-the-dictionary).\n\n## Segmentation modes\n\nLike [Kuromoji](https://www.atilika.org/), Kagome supports several **segmentation modes** (splitting strategies) for tokenizing the input text.\n\n* **Normal:** Regular segmentation\n* **Search:** Uses a heuristic to perform additional segmentation that is **useful for search** purposes\n* **Extended:** Similar to search mode, but also splits unknown words into [uni-grams](https://en.wikipedia.org/wiki/N-gram)\n\n|Untokenized|Normal|Search|Extended|\n|:-------|:---------|:---------|:---------|\n|関西国際空港|関西国際空港|関西　国際　空港|関西　国際　空港|\n|日本経済新聞|日本経済新聞|日本　経済　新聞|日本　経済　新聞|\n|シニアソフトウェアエンジニア|シニアソフトウェアエンジニア|シニア　ソフトウェア　エンジニア|シニア　ソフトウェア　エンジニア|\n|デジカメを買った|デジカメ　を　買っ　た|デジカメ　を　買っ　た|デ　ジ　カ　メ　を　買っ　た|\n\n\u003e [!NOTE]\n\u003e If you are tokenizing for search, try changing the mode before switching to another dictionary.\n\n## Docker\n\n[![Docker](https://dockerico.blankenship.io/image/ikawaha/kagome)](https://hub.docker.com/r/ikawaha/kagome)\n\nWe provide `scratch`-based Docker images that simply run the `kagome` command line tool on various architectures: AMD64, Arm64, and Arm32 (Arm v5, v6 and v7).\n\n* Pull the image\n\n  ```sh\n  docker pull ikawaha/kagome:latest\n  ```\n\n  ```sh\n  # Alternatively, you can pull from GitHub Container Registry\n  docker pull ghcr.io/ikawaha/kagome:latest\n  ```\n\n* Run the command via Docker\n\n  ```sh\n  # Interactive/REPL mode\n  docker run 
--rm -it ikawaha/kagome:latest\n  ```\n\n  ```sh\n  # If pulling from GitHub Container Registry\n  docker run --rm -it ghcr.io/ikawaha/kagome:latest\n  ```\n\n* Run the server via Docker\n\n  ```sh\n  # Server mode (http://localhost:6060)\n  docker run --rm -p 6060:6060 ikawaha/kagome:latest server\n  ```\n\n  ```sh\n  # If pulling from GitHub Container Registry\n  docker run --rm -p 6060:6060 ghcr.io/ikawaha/kagome:latest server\n  ```\n\n* `docker-compose.yml` example\n\n  ```yaml\n  services:\n    kagome:\n      image: ikawaha/kagome:latest\n      ports: [\"6060:6060\"]\n      command: server\n      restart: unless-stopped\n  ```\n\n\u003e **Note:** The base image doesn't include Graphviz. For lattice visualization, see [examples](./_examples/server_docker_graphviz/).\n\n## WebAssembly\n\nKagome compiles to WebAssembly for browser use.\n\n* **Live demo:** [https://ikawaha.github.io/kagome/](https://ikawaha.github.io/kagome/)\n* **Source code:** [./_examples/wasm](./_examples/wasm)\n\n## Use from other languages (FFI)\n\nKagome is written in pure Go but can be compiled as a C shared library and used from other languages via FFI (Foreign Function Interface).\n\n* Currently supported/tested languages:\n  * **Python 3.12+** (using `ctypes`)\n  * **PHP 8+** (using `FFI`)\n\n```python\n# Python example using ctypes\nfrom libkagome import Kagome\n\nkagome = Kagome()\ntokens = kagome.tokenize(\"すもももももももものうち\")\n\nfor token in tokens:\n    print(f\"{token.surface}\\t{token.pos}\")\n```\n\n```php\n\u003c?php\n// PHP example using FFI\ndeclare(strict_types=1);\n\nrequire __DIR__ . '/libkagome.php';\n\n$kagome = new Kagome();\n$tokens = $kagome-\u003etokenize(\"すもももももももものうち\");\n\nforeach ($tokens as $token) {\n    echo \"{$token-\u003esurface}\\t\" . implode(',', $token-\u003epos) . 
\"\\n\";\n}\n```\n\nFor complete examples and build instructions, see:\n\n* [./_examples/clib/](./_examples/clib/) - C library FFI examples for Python and PHP\n\n\u003e [!NOTE]\n\u003e The C library provides thread-safe tokenization with proper memory management and includes comprehensive tests.\n\n## Reference\n\n* Detailed Reference Manual in Japanese:\n\n  [![実践：形態素解析 kagome v2](https://user-images.githubusercontent.com/4232165/102152682-e281e400-3eb8-11eb-91f7-13e08a8977d9.png)](https://zenn.dev/ikawaha/books/kagome-v2-japanese-tokenizer)\n\n* Community Wiki in English:\n  * [https://github.com/ikawaha/kagome/wiki](https://github.com/ikawaha/kagome/wiki)\n\n## License\n\n* MIT\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fikawaha%2Fkagome","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fikawaha%2Fkagome","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fikawaha%2Fkagome/lists"}