{"id":34899447,"url":"https://github.com/rajatim/zhtw","last_synced_at":"2026-04-11T07:04:51.795Z","repository":{"id":330657767,"uuid":"1122968217","full_name":"rajatim/zhtw","owner":"rajatim","description":"簡轉繁台灣用語轉換器 | Simplified to Traditional Chinese (Taiwan) Converter - rajatim 出品","archived":false,"fork":false,"pushed_at":"2026-04-08T08:34:48.000Z","size":1189,"stargazers_count":9,"open_issues_count":6,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-04-08T10:19:49.788Z","etag":null,"topics":["chinese","cli","i18n","linter","localization","pre-commit","python","simplified-chinese","taiwan","text-processing","traditional-chinese"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/rajatim.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-12-25T23:49:33.000Z","updated_at":"2026-04-08T08:33:48.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/rajatim/zhtw","commit_stats":null,"previous_names":["rajatim/zhtw"],"tags_count":25,"template":false,"template_full_name":null,"purl":"pkg:github/rajatim/zhtw","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rajatim%2Fzhtw","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rajatim%2Fzhtw/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rajatim%2Fzhtw/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rajatim%2Fzhtw/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/rajatim","download_url":"https://codeload.github.com/rajatim/zhtw/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rajatim%2Fzhtw/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31670383,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-10T17:19:37.612Z","status":"online","status_checked_at":"2026-04-11T02:00:05.776Z","response_time":54,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["chinese","cli","i18n","linter","localization","pre-commit","python","simplified-chinese","taiwan","text-processing","traditional-chinese"],"created_at":"2025-12-26T08:29:56.586Z","updated_at":"2026-04-11T07:04:51.789Z","avatar_url":"https://github.com/rajatim.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# ZHTW\n\n**繁體中文** · [English](README.en.md)\n\n[![CI](https://github.com/rajatim/zhtw/actions/workflows/ci.yml/badge.svg)](https://github.com/rajatim/zhtw/actions/workflows/ci.yml)\n[![codecov](https://codecov.io/gh/rajatim/zhtw/branch/main/graph/badge.svg)](https://codecov.io/gh/rajatim/zhtw)\n[![PyPI](https://img.shields.io/pypi/v/zhtw.svg)](https://pypi.org/project/zhtw/)\n[![Maven Central](https://img.shields.io/maven-central/v/com.rajatim/zhtw.svg?label=maven%20central)](https://central.sonatype.com/artifact/com.rajatim/zhtw)\n[![Homebrew](https://img.shields.io/badge/homebrew-tap-FBB040?logo=homebrew)](https://github.com/rajatim/homebrew-tap)\n[![Downloads](https://img.shields.io/pypi/dm/zhtw.svg)](https://pypi.org/project/zhtw/)\n[![Python](https://img.shields.io/badge/python-3.10+-blue.svg)](https://www.python.org/downloads/)\n[![Java](https://img.shields.io/badge/java-11+-orange.svg)](https://adoptium.net/)\n[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)\n\n\u003c!-- zhtw:disable --\u003e\n**讓你的程式碼說台灣話** — 專治「許可權」「軟件」等違和用語\n\u003c!-- zhtw:enable --\u003e\n\n\u003c!-- zhtw:disable --\u003e\n```\n輸入：服务器上的软件需要优化，用户权限请联系管理员\n輸出：伺服器上的軟體需要最佳化，使用者權限請聯絡管理員\n```\n\u003c!-- zhtw:enable --\u003e\n\n一行程式、一個 CLI、六種 SDK —— 把簡體轉成**真正的台灣繁體**。\n\n---\n\n## 為什麼選 ZHTW？\n\n\u003e **寧可少轉，不要錯轉**\n\n通用轉換工具會過度轉換，把台灣正確的詞也改掉。我們不一樣：**只轉確定要改的詞，其他一律不動。**\n\n| | |\n|------|------|\n| **零誤判** | 31,000+ 詞條 + 6,360 字元對映，52 本書 1 億字驗證零錯轉 |\n| **秒級掃描** | 3,100 KB/s 穩定吞吐，1MB 文字 \u003c 1 秒 |\n| **完全離線** | 不傳送任何資料到外部，企業內網也能用 |\n| **CI 整合** | 一行指令加入 GitHub Actions，PR 自動檢查 |\n| **彈性跳過** | 測試資料、第三方程式碼？標記一下就不會被改 |\n\n### 對比 OpenCC\n\n\u003c!-- zhtw:disable --\u003e\nOpenCC 是通用的繁簡轉換框架，採用「全字元 + 短語替換」策略，規則之間容易互相牴觸，例如 `权→權` 會把「權限」誤轉成「許可權」。ZHTW 專注於**簡體 → 台灣繁體**一個方向，用「詞彙層 + 字元層」分層處理，複合詞上下文優先匹配。\n\n| | OpenCC (s2twp) | ZHTW |\n|---|------|------|\n| **設計目標** | 通用繁簡多變體轉換 | 簡體 → 台灣繁體 |\n| **轉換策略** | 字元 + 短語全量替換 | 詞彙優先 → 字元層補齊 |\n| **歧義處理** | 依規則順序 | 102 個歧義字分級管理 + balanced mode 語義消歧 |\n| **詞庫規模** | 內建字表 + 短語 | 31,000+ 精選台灣用詞 |\n| **誤轉** | `权限 → 許可權` 等常見案例 | 52 本書 1 億字驗證零錯轉 |\n| **生態** | C++ 核心、多語言 bindings | CLI + Python/Java/TS/Rust SDK + pre-commit |\n\n想看更多對比案例？執行 `zhtw lookup 权限 服务器 用户`。\n\u003c!-- zhtw:enable --\u003e\n\n---\n\n## 安裝\n\n### macOS (Homebrew) — 推薦\n\n```bash\nbrew tap rajatim/tap\nbrew install zhtw\n```\n\n更新：`brew upgrade zhtw`\n\n### pip (所有平臺)\n\n```bash\npython3 -m pip install zhtw\n```\n\n更新：`pip install --upgrade zhtw`\n\n### pipx (隔離環境)\n\n[pipx](https://pipx.pypa.io/) 會在獨立虛擬環境中安裝，不影響系統 Python：\n\n```bash\npipx install zhtw\n```\n\n更新：`pipx upgrade zhtw`\n\n### 從原始碼安裝 (開發者)\n\n```bash\ngit clone https://github.com/rajatim/zhtw.git\ncd zhtw\npip install -e \".[dev]\"\n```\n\n\u003cdetails\u003e\n\u003csummary\u003epip 安裝後找不到 zhtw 指令？設定 PATH\u003c/summary\u003e\n\n```bash\n# macOS (zsh)\necho 'export PATH=\"$PATH:$(python3 -m site --user-base)/bin\"' \u003e\u003e ~/.zshrc\nsource ~/.zshrc\n\n# Linux (bash)\necho 'export PATH=\"$PATH:~/.local/bin\"' \u003e\u003e ~/.bashrc\nsource ~/.bashrc\n\n# Windows — 通常自動設定，若無請加入環境變數：\n# %APPDATA%\\Python\\PythonXX\\Scripts\n```\n\u003c/details\u003e\n\n---\n\n## 30 秒開始使用\n\nZHTW 提供三種使用方式，選一個最適合你的場景：\n\n### 1. CLI（命令列）\n\n\u003c!-- zhtw:disable --\u003e\n```bash\nzhtw check .            # 檢查整個專案\nzhtw check ./file.py    # 檢查單一檔案\nzhtw fix .              # 自動修正\nzhtw lookup 软件 服务器  # 查詢：软件→軟體、服务器→伺服器\n\n# Balanced mode：自動消歧 10 個高頻歧義字（几→幾、后→後、里→裡 等）\nzhtw fix . --ambiguity-mode balanced\n```\n\u003c!-- zhtw:enable --\u003e\n\n\u003c!-- zhtw:disable --\u003e\n**輸出範例：**\n```\n📁 掃描 ./src\n\n📄 src/components/Header.tsx\n   L12:5: \"用户\" → \"使用者\"\n\n📄 src/utils/api.ts\n   L8:10: \"软件\" → \"軟體\"\n\n━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\n⚠️  發現 2 處需修正（2 個檔案）\n```\n\u003c!-- zhtw:enable --\u003e\n\n### 2. Python Library\n\n\u003c!-- zhtw:disable --\u003e\n```python\nfrom zhtw import convert\n\nconvert(\"这个软件需要优化\")\n# → \"這個軟體需要最佳化\"\n```\n\u003c!-- zhtw:enable --\u003e\n\n首次呼叫會載入字典並建立 Aho-Corasick 自動機，後續呼叫會重用快取。進階用法（自訂詞庫、逐行回報、整合到你自己的 pipeline）見 `convert_text` / `Matcher` / `load_dictionary`。詞彙查詢 API：`lookup_word` / `lookup_words`（v3.3.0+）。\n\n### 3. Java SDK\n\n**Maven**：\n\n\u003c!-- zhtw:disable --\u003e\n```xml\n\u003cdependency\u003e\n    \u003cgroupId\u003ecom.rajatim\u003c/groupId\u003e\n    \u003cartifactId\u003ezhtw\u003c/artifactId\u003e\n    \u003cversion\u003e4.1.0\u003c/version\u003e\n\u003c/dependency\u003e\n```\n\u003c!-- zhtw:enable --\u003e\n\n**Gradle (Kotlin DSL)**：\n\n```kotlin\nimplementation(\"com.rajatim:zhtw:4.1.0\")\n```\n\n**Gradle (Groovy DSL)**：\n\n```groovy\nimplementation 'com.rajatim:zhtw:4.1.0'\n```\n\n\u003c!-- zhtw:disable --\u003e\n```java\nimport com.rajatim.zhtw.ZhtwConverter;\n\n// 快速使用（thread-safe singleton）\nString result = ZhtwConverter.getDefault().convert(\"这个软件需要优化\");\n// → \"這個軟體需要最佳化\"\n\n// 自訂設定\nZhtwConverter conv = ZhtwConverter.builder()\n    .sources(List.of(\"cn\", \"hk\"))\n    .customDict(Map.of(\"自定义\", \"自訂\"))\n    .ambiguityMode(\"balanced\")  // 歧義字自動消歧\n    .build();\n```\n\u003c!-- zhtw:enable --\u003e\n\n**效能**：單句 2μs、100K 字 5.5ms（17.9 MB/s），比 Python 快 ~5.8 倍。詳見 [`sdk/java/BENCHMARK.md`](sdk/java/BENCHMARK.md)。\n\n### 4. TypeScript SDK\n\n**npm / pnpm / yarn**：\n\n```bash\nnpm install zhtw-js\n# 或\npnpm add zhtw-js\nyarn add zhtw-js\n```\n\n\u003c!-- zhtw:disable --\u003e\n```typescript\nimport { convert, check, lookup } from 'zhtw-js';\n\n// 快速使用（zero config，內建 default converter）\nconvert('这个软件需要优化');\n// → '這個軟體需要最佳化'\n\ncheck('用户权限');\n// → [{ start, end, source, target }, ...]\n\nlookup('软件');\n// → { input, output, changed, details: [...] }\n```\n\u003c!-- zhtw:enable --\u003e\n\n**自訂設定**：\n\n\u003c!-- zhtw:disable --\u003e\n```typescript\nimport { createConverter } from 'zhtw-js';\n\nconst conv = createConverter({\n  sources: ['cn'],                  // 預設 ['cn', 'hk']\n  customDict: { '自定义': '自訂' },  // 覆蓋內建詞條\n  ambiguityMode: 'balanced',        // 歧義字自動消歧\n});\n\nconv.convert('...');\n```\n\u003c!-- zhtw:enable --\u003e\n\n**特色**：isomorphic（Node.js ≥20 + 瀏覽器原生支援）、ESM + CJS 雙產出、零執行期相依、tree-shakeable。所有索引（`start` / `end` / `position`）均為 **Unicode codepoint**，與 Python CLI、Java SDK 完全 byte-for-byte 一致（共享 `sdk/data/golden-test.json` 驗證）。釋出走 npm Trusted Publishing (OIDC)，無 long-lived token。\n\n### 5. Rust SDK\n\n**Cargo (crates.io)**：\n\n\u003c!-- zhtw:disable --\u003e\n```toml\n[dependencies]\nzhtw = \"4.1.0\"\n```\n\u003c!-- zhtw:enable --\u003e\n\n\u003c!-- zhtw:disable --\u003e\n```rust\nuse zhtw::{AmbiguityMode, Converter, Source};\n\n// Zero config\nassert_eq!(zhtw::convert(\"这个软件需要优化\"), \"這個軟體需要最佳化\");\n\n// Builder with custom dict + balanced mode\nlet conv = Converter::builder()\n    .sources([Source::Cn])\n    .custom_dict([(\"自定义\", \"自訂\")])\n    .ambiguity_mode(AmbiguityMode::Balanced)  // 歧義字自動消歧\n    .build()\n    .expect(\"non-empty sources\");\n```\n\u003c!-- zhtw:enable --\u003e\n\n**效能**：build-time 預編譯 `daachorse` automaton + `phf` char map，runtime 零建構成本。詳見 benchmarks（`cargo bench -p zhtw`）。\n\n---\n\n## 多語言 SDK\n\nZHTW 以 Python 實作為主，並提供原生 Java、TypeScript、Rust SDK。所有 SDK 共用同一份詞庫資料（`zhtw-data.json`），轉換結果與 Python CLI 完全一致（跨 SDK 透過共享 `sdk/data/golden-test.json` 做 byte-for-byte 驗證，零偏差為釋出條件）。所有 SDK 均支援 balanced mode（歧義字自動消歧）。\n\n| SDK | 安裝 | 吞吐量 (1MB) | 單句延遲 | 適用場景 | 狀態 |\n|-----|------|-------------|---------|---------|------|\n| **Python** | `pip install zhtw` | 3.1 MB/s | — | CLI、CI/CD、pre-commit、資料處理 | ✅ Stable |\n| **Java** | [Maven Central](#3-java-sdk) | 17.9 MB/s | 2μs | Spring Boot、Android、後端服務 | ✅ Stable |\n| **TypeScript** | `npm install zhtw-js` | ~16 MB/s | — | Node.js ≥18、瀏覽器（isomorphic ESM+CJS） | ✅ Stable |\n| **Rust** | [crates.io](#5-rust-sdk) | — | — | 高效能、嵌入式 | ✅ Stable |\n| **WASM** | `npm install zhtw-wasm` | — | — | 瀏覽器、Edge runtime | ✅ Stable |\n| **Go** | `go get` | — | — | 微服務、CLI 工具、雲端原生 | 🚧 Planned |\n| **C# (.NET)** | NuGet | — | — | ASP.NET、Unity、桌面應用 | 🚧 Planned |\n\n---\n\n## 涵蓋範圍\n\n\u003c!-- zhtw:disable --\u003e\n**31,000+ 精選詞條 + 6,360 字元對映**，涵蓋 IT 科技、醫療、法律、金融、遊戲、電商、學術、日常、地理、港式用語 10+ 領域。轉換由三層架構負責：**詞彙層**（Aho-Corasick 最長匹配）處理複合詞，**balanced defaults 層**（`--ambiguity-mode balanced`）處理 10 個高頻歧義字（几→幾、后→後、里→裡 等）的預設轉換 + protect_terms 例外保護，**字元層**（`str.translate`）補齊剩餘簡體字。102 個一對多歧義字分級管理，不確定就不轉。\n\u003c!-- zhtw:enable --\u003e\n\n詳細的詞庫分類、雙層架構原理、一對多特例、語義衝突處理表，見 [`docs/DICTIONARY-COVERAGE.md`](docs/DICTIONARY-COVERAGE.md)。\n\n---\n\n## CI/CD 整合\n\n### GitHub Actions\n\n加入 GitHub Actions，每個 PR 自動檢查：\n\n```yaml\n# .github/workflows/chinese-check.yml\nname: Chinese Check\non: [push, pull_request]\njobs:\n  check:\n    runs-on: ubuntu-latest\n    steps:\n      - uses: actions/checkout@v4\n      - uses: actions/setup-python@v5\n        with:\n          python-version: '3.x'\n      - name: Install zhtw\n        run: pip install zhtw\n      - name: Check Traditional Chinese\n        run: zhtw check . --json\n```\n\n### GitLab CI\n\n```yaml\n# .gitlab-ci.yml\nchinese-check:\n  image: python:3.12-slim\n  script:\n    - pip install zhtw\n    - zhtw check . --json\n```\n\n有問題就會失敗，再也不怕漏掉。詳細教學請參考 [CI/CD 整合指南](docs/CI-CD-INTEGRATION.md)。\n\n---\n\n## Pre-commit Hook\n\nCommit 前自動擋住問題：\n\n```yaml\n# .pre-commit-config.yaml\nrepos:\n  - repo: https://github.com/rajatim/zhtw\n    rev: v4.1.0  # 使用最新版本\n    hooks:\n      - id: zhtw-check   # 檢查模式（建議）\n      # - id: zhtw-fix   # 或自動修正模式\n```\n\n```bash\npip install pre-commit \u0026\u0026 pre-commit install\n# 之後每次 commit 都會自動檢查\n```\n\n\u003cdetails\u003e\n\u003csummary\u003e進階設定：只檢查特定檔案型別\u003c/summary\u003e\n\n```yaml\nrepos:\n  - repo: https://github.com/rajatim/zhtw\n    rev: v4.1.0\n    hooks:\n      - id: zhtw-check\n        types: [python, markdown, yaml]  # 只檢查這些型別\n        exclude: ^tests/fixtures/        # 排除測試資料\n```\n\u003c/details\u003e\n\n---\n\n## 忽略特定程式碼\n\n測試資料、第三方程式碼不想被轉？用 pragma 標記即可：\n\n```python\n# 忽略這一行\ntest_data = \"软件\"  # zhtw:disable-line\n\n# 忽略下一行\n# zhtw:disable-next\nlegacy_code = \"用户信息\"\n\n# 忽略整個區塊\n# zhtw:disable\ntest_cases = [\"软件\", \"硬件\", \"网络\"]\n# zhtw:enable\n```\n\n專案層級的忽略用 `.zhtwignore`（類 `.gitignore` 格式）；完整範例見 [`docs/CLI-ADVANCED.md`](docs/CLI-ADVANCED.md#zhtwignore-忽略檔案)。\n\n---\n\n\u003c!-- zhtw:disable --\u003e\n## 文件\n\n| 文件 | 內容 |\n|------|------|\n| [`docs/DICTIONARY-COVERAGE.md`](docs/DICTIONARY-COVERAGE.md) | 完整詞庫分類、雙層架構細節、一對多特例、語義衝突處理 |\n| [`docs/CLI-ADVANCED.md`](docs/CLI-ADVANCED.md) | 完整 CLI 引數、詞彙查詢（`lookup`）、多編碼支援、自訂詞庫格式 |\n| [`docs/CI-CD-INTEGRATION.md`](docs/CI-CD-INTEGRATION.md) | GitHub Actions / GitLab CI / pre-commit 深入設定 |\n| [`sdk/java/BENCHMARK.md`](sdk/java/BENCHMARK.md) | Java SDK 效能測試（JMH） |\n| [`CHANGELOG.md`](CHANGELOG.md) | 版本歷史 |\n| [`CONTRIBUTING.md`](CONTRIBUTING.md) | 貢獻指南 |\n\u003c!-- zhtw:enable --\u003e\n\n---\n\n## 開發\n\n```bash\npip install -e \".[dev]\"\npytest\nruff check .\n```\n\n有問題？[開 Issue](https://github.com/rajatim/zhtw/issues) | 想貢獻？[看 Contributing Guide](CONTRIBUTING.md)\n\n---\n\nMIT License | **tim Insight 出品**\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frajatim%2Fzhtw","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frajatim%2Fzhtw","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frajatim%2Fzhtw/lists"}