{"id":49998000,"url":"https://github.com/zhangzeyu99-web/localization-workflow","last_synced_at":"2026-05-19T08:25:19.368Z","repository":{"id":353533299,"uuid":"1190826426","full_name":"zhangzeyu99-web/localization-workflow","owner":"zhangzeyu99-web","description":"Game localization QA workflow with Excel delivery gates, strict AI review, and quality regression harness","archived":false,"fork":false,"pushed_at":"2026-05-09T10:13:15.000Z","size":990,"stargazers_count":0,"open_issues_count":6,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-05-09T11:36:18.542Z","etag":null,"topics":["ai-translation-review","excel","excel-qa","game-localization","game-tools","localization","localization-qa","lqa","python","qa-automation","quality-harness","terminology","translation-qa"],"latest_commit_sha":null,"homepage":"https://github.com/zhangzeyu99-web/localization-workflow","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/zhangzeyu99-web.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":"AGENTS.md","dco":null,"cla":null}},"created_at":"2026-03-24T16:52:28.000Z","updated_at":"2026-05-09T10:13:18.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/zhangzeyu99-web/localization-workflow","commit_stats":null,"previous_names":["zhangzeyu99-web/localization-workflow"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/zhangzeyu99-web/localization-workflow","repository_url":"h
ttps://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zhangzeyu99-web%2Flocalization-workflow","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zhangzeyu99-web%2Flocalization-workflow/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zhangzeyu99-web%2Flocalization-workflow/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zhangzeyu99-web%2Flocalization-workflow/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/zhangzeyu99-web","download_url":"https://codeload.github.com/zhangzeyu99-web/localization-workflow/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zhangzeyu99-web%2Flocalization-workflow/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33208150,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-19T07:54:09.561Z","status":"ssl_error","status_checked_at":"2026-05-19T07:54:08.508Z","response_time":58,"last_error":"SSL_read: unexpected eof while 
reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai-translation-review","excel","excel-qa","game-localization","game-tools","localization","localization-qa","lqa","python","qa-automation","quality-harness","terminology","translation-qa"],"created_at":"2026-05-19T08:25:18.401Z","updated_at":"2026-05-19T08:25:19.362Z","avatar_url":"https://github.com/zhangzeyu99-web.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Localization QA Workflow\n\n\u003e Game localization QA workflow for Excel language packs, AI draft review, terminology consistency, placeholder validation, and UI tag safety.\n\n![Localization QA Workflow cover](docs/assets/localization-workflow-cover.svg)\n\nThis repository is a QA workflow for **game localization teams**, built specifically for **Excel language packs produced by an AI draft translation pass**. It automatically checks variable placeholders, UI tags, terminology consistency, format patterns, and high-risk translation issues, and outputs QA results that a reviewer can verify.\n\n**Keywords:** game localization QA, localization workflow, Excel translation QA, terminology consistency, placeholder validation, UI tag validation, AI translation review, game LQA.\n\n## Why This Project Exists\n\nMost localization tools stop at translation or generic QA. 
This project focuses on the messy middle:\n\n- AI draft output that still needs human review\n- Excel-based language packs used by game teams\n- Variables, BBCode, UI tags, and short UI strings that are easy to break\n- Repeatable QA rules that can run before delivery instead of after bug reports\n\n## Overview\n\nAn automated QA tool for game localization. It processes language packs (Excel format) after an AI draft translation pass, detects problems with variable placeholders, UI markup, terminology consistency, and similar issues, and outputs a QA report.\n\n## 30-Second Example\n\n```bash\ngit clone https://github.com/zhangzeyu99-web/localization-workflow.git\ncd localization-workflow\npip install -r requirements.txt\npython process_language.py --input sample-language.xlsx --lang en\n```\n\n## English Sample Input and Output\n\n![English QA report preview](docs/assets/english-qa-report-preview.svg)\n\n- [English sample input CSV](examples/english-sample-input.csv)\n- [English sample QA output](examples/english-sample-output.md)\n\n## Features\n\n| Module | Description |\n|------|------|\n| **Variable check** | Checks that variable placeholders (`{0}`, `%s`, etc.) in the translation are complete |\n| **UI markup check** | Checks that UI control tags (`\u003ccolor\u003e`, `\u003csize\u003e`, etc.) match |\n| **Terminology consistency** | Checks key terms against the term base for consistent translation; in-batch consistency of person/character names and sequentially numbered entries is a hard gate |\n| **Format pattern check** | Detects pattern issues in number formats, punctuation, spacing, and similar |\n| **AI review** | Calls an LLM for a second review of suspicious entries |\n| **GUI** | Visual interface with drag-and-drop support for Excel files |\n\n## Supported Languages\n\n| Priority | Languages |\n|--------|------|\n| P0 | English |\n| P1 | Indonesian, French, German, Turkish, Spanish, Portuguese, Russian |\n\nNote: the general QA harness supports the language codes above; the English full-translation harness v1 still supports only `en`.\n\n## Installation\n\n```bash\npip install -r requirements.txt\n```\n\n## Usage\n\n```bash\n# CLI mode\npython cli.py --input \u003cexcel_file\u003e --language en\n\n# GUI mode\npython gui.py\n\n# Process a single language\npython process_language.py --input \u003cexcel_file\u003e --lang en\n\n# English full-translation harness: when the target column is empty or backfilled with Chinese, prepare the workpack first\npython scripts/run_translation_harness.py --input \u003cexcel_file\u003e --term-base \u003cterms.xlsx\u003e --lang en --output-dir \u003coutput_dir\u003e --style-hint \"US mobile SLG; concise, idiomatic wording\"\n\n# After the main agent writes translation_response.jsonl, backfill strictly by ID and proceed to QA\npython scripts/run_translation_harness.py --input 
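\u003cexcel_file\u003e --term-base \u003cterms.xlsx\u003e --lang en --output-dir \u003coutput_dir\u003e --response \u003coutput_dir\u003e/translation_response.jsonl --run-qa\n\n```\n\n## Project Structure\n\n```text\nlocalization-workflow-project/\n├── cli.py                  # CLI entry point\n├── gui.py                  # GUI entry point (Tkinter)\n├── process_language.py     # Main language-processing logic\n├── requirements.txt        # Python dependencies\n├── utils/                  # Utility modules\n│   ├── ai_checker.py       #   AI reviewer\n│   ├── excel_reader.py     #   Excel reader\n│   ├── pattern_detector.py #   Format pattern detection\n│   ├── term_checker.py     #   Terminology consistency check\n│   ├── ui_detector.py      #   UI markup detection\n│   └── variable_checker.py #   Variable placeholder check\n├── docs/\n│   └── 使用说明书.md        # Detailed user manual\n├── tools/\n│   └── codex-residential-launcher/  # Codex + Clash residential IP launcher wrapper (see the README in that directory)\n├── output/                 # Output directory\n│   └── ai_review/          #   AI review results\n└── workflow-design.md      # Requirements and design document\n```\n\n## Related Tools\n\n- **[Codex + fixed residential IP (Clash Verge) complete setup guide](tools/codex-residential-launcher/README.md)**  \n  On Windows, launch Desktop with **`tools/codex-residential-launcher/start-codex-desktop.cmd`** (root entry point, path-adaptive) or `scripts\\Start-CodexDesktop.cmd`. The directory also includes CLI scripts, a Merge template, VS Code proxy notes, and troubleshooting, and it can be copied on its own to other machines for reuse.\n\n## Documentation\n\n- [Changelog](CHANGELOG.md)\n- [User manual](docs/使用说明书.md)\n- [Workflow notes](工作流说明.md)\n- [Workflow design document](workflow-design.md)\n- [Unreadable abbreviation and clipped-word gate rules](docs/readability-abbreviation-gate.md)\n- [Quality regression harness](docs/quality-harness.md)\n- [English full-translation harness](docs/translation-harness.md)\n- [Project-specific harness process](docs/project-custom-harness.md)\n- [Project management](docs/project-management.md)\n\nStarter templates for a project-specific harness:\n\n- [Project profile Markdown template](templates/project_profile_template.md)\n- [Project profile JSON template](templates/project_profile_template.json)\n- [Project profile YAML template](templates/project_profile_template.yaml)\n- [Translation prompt template](templates/translation_prompt_template.txt)\n\n## Latest Update (2026-05-15)\n\nThis update consolidates the final delivery gate into `quality_harness`: terms are strictly enforced by default, UI length is included in the final workbook scan, sequentially numbered entries are unified to the term base or, failing that, to the first high-quality translation, and soft terms must be marked explicitly.\n\n### Rule Authority\n\n- `AGENTS.md` and 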
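`scripts/run_quality_harness.py` are the current authoritative sources for the rules.\n- `README.md` and `docs/使用说明书.md` keep only summaries and historical entry-point notes; they are not the final release standard.\n- Final delivery must run:\n\n```bash\npython scripts/run_quality_harness.py fixtures/quality_regression.json --workbook \u003cfinal.xlsx\u003e\n```\n\n## 2026-05-12 Update\n\nThis update turns the \"full translation -\u003e strict backfill -\u003e QA close-out\" flow, used when the English target column is empty or backfilled with Chinese, into a fixed agent-operated harness. The scripts do not call any API and do not automate the ChatGPT web page; the main agent generates the translations, and the scripts handle extraction, protocol validation, backfill by ID, the hidden cache, and the downstream QA.\n\n### Merged Changes\n\n- English full-translation harness\n  - `scripts/run_translation_harness.py` can generate `translation_workpack.jsonl`, `translation_manifest.json`, and `translation_response.jsonl`\n  - The response may contain only `ID + translation`; before backfill, the script rejects missing IDs, duplicate IDs, extra IDs, out-of-order rows, input drift, placeholder drift, tag drift, and line-break drift\n  - When the target column is empty, nearly empty, or largely backfilled with Chinese, the Chinese text is first used as a seed column and then fully replaced, which avoids a gap between the empty-column flow and QA\n  - Supports `--style-hint` / `--style-hint-file` to inject a project style hint, for example \"for US mobile users, SLG, short and idiomatic wording\"\n  - Same-project translation memory is written only to `.translation_cache/\u003clang\u003e.jsonl` under the input directory and is not reused across projects by default; delete the cache before final delivery unless the same batch is still under continuous rework\n  - Verified end to end on a real 63-row English sheet backfilled with Chinese: machine-review items needing confirmation were `0`, and `quality_harness` returned `passed: True` on the final workbook\n- General workbook QA scan enhancements\n  - When scanning a real workbook, `run_quality_harness.py` skips glossary/term sheets and auxiliary audit/adjudication sheets and runs general row-level QA only on body text and UI strings, which avoids false positives from Title Case in the term dictionary or from rework records\n  - The `run_quality_harness.py` workbook scan now reads in non-read-only mode, and `rows_scanned=0` fails, which prevents false passes\n  - `run_quality_harness.py --workbook \u003cfinal.xlsx\u003e` automatically reads the term sheet embedded in the workbook, term bases in the same directory, and term bases one level above common output directories; `--term-base` serves only as an override/supplement\n  - Term bases are strictly enforced by default; terms explicitly marked `soft/generic/common/参考/泛词/通用词` are treated as soft hints (the Chinese markers mean reference, generic word, and common word)\n  - In auto-discovered term bases, terms whose `分类` (category) contains `人名`, `角色`, `person`, `character`, or `name` are hard gates; when the body text contains the Chinese name, the corresponding English name must be used\n- Strict AI review pipeline\n  - `prepare / merge` bind each batch by manifest and fingerprint, which prevents mismatches between input and output entries\n  - The model response must output `ID | KEEP` or `ID | FIX | corrected translation` for every entry; missing or out-of-order lines cause the merge to be rejected outright\n- Workspace batch entry point\n  - Supports discovering language sheets and term bases by directory, suitable for processing a project directory directly\n- Pinyin residue detection and auto-fix\n  - Recognizes pinyin residue in proper names such as `Hongshangu`, `Jushizhen`, `Meiguihu`, `Xigu...`, and `Lanshidi`\n  - Known map names and place names can be written back directly from the standard mapping\n- Hard length constraint for short UI strings\n  - First, short texts whose Chinese source has a visible length `\u003c= 10` become candidates\n  - Then they are handled in tiers by type: compact UI strings get a hard constraint, ordinary short texts get a soft hint, and numbered proper names and complex rich text are exempt\n  - The English budget is `min(32, max(10, visible Chinese length * 2 + 14))`, which avoids squeezing normal phrases into pinyin or clipped words; the Indonesian budget is `min(34, max(12, visible Chinese length * 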
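2 + 15))`\n  - Machine review adds a new issue type, `ui_length_overflow`\n  - The AI review prompt carries `LEN:mode=...,source=...,target=...,budget\u003c=...` metadata and asks the model to stay as close to the Chinese length as possible while remaining natural and understandable\n- Hard gate for unreadable abbreviations / clipped words\n  - Machine review blocks internal-code-style text such as `PERR`, `DTT`, `IJA`, and `CL##1##2`\n  - Machine review blocks clipped words such as `rewa`, `obta`, `coll imme`, and `tmrw`\n  - The AI review prompt states explicitly that when length and readability conflict, natural readability wins\n- English capitalization style recheck\n  - Error, status, and hint strings use sentence case by default, for example `Too many roles` and `System error`\n  - Title Case is reserved for proper names, feature names, titles, shop items, and names the term base explicitly requires\n- Quality regression harness\n  - `fixtures/quality_regression.json` pins historical bad and good cases to prevent old problems from regressing and to avoid false positives on reasonable translations\n  - `scripts/run_quality_harness.py` can run the fixtures and a real workbook scan together\n\n### Typical Use Cases\n\n- Game UI text, buttons, labels, and menu items\n- Map names, place names, and numbered place names\n- Language sheets containing variables, BBCode, line breaks, and rich-text tags\n\n## Usage Notes\n\n### 1. Start with small batches on the first run\n\n- Start with a `batch-size` of `80` to `100`\n- In the first round, prioritize stability over throughput\n\n### 2. Do not swap the input file between `prepare` and `merge`\n\n- The strict pipeline validates the input fingerprint\n- If the language sheet is replaced, reordered, or modified between the two steps, `merge` rejects the backfill\n\n### 3. Short-text length constraints: pool first, then tier\n\n- Texts whose Chinese source has a visible length `\u003c= 10` first enter the length-check candidate pool\n- Within the pool, compact UI strings are a hard constraint, ordinary short texts are a soft hint, and numbered proper names and complex rich text are exempt\n- Full sentences are still not suited to having their length forced down by this rule\n- The rule priority remains: natural readability first, length second\n\n### 4. Put proper names and worldbuilding terms into the term base where possible\n\n- If the project deliberately keeps a transliteration or has official proper names, do not rely on the built-in rules alone\n- Add the standard translations to the term base to avoid misjudgment by the pinyin-residue or length rules\n\n### 5. Do not use aggressive auto-fix on complex rich text\n\n- For lines with many tags such as `[color]`, `[size]`, `[v0]`, and `[b0]`, prioritize structural safety\n- Both the model and the rules must preserve placeholders, tags, and line breaks\n\n### 6. Issue types to watch in the report\n\n- `romanized_name_residue`\n- `ui_length_overflow`\n- `opaque_abbreviation`\n- `clipped_word`\n- `title_case_overuse`\n- `variable_missing`\n- `variable_extra`\n- `term_missing`\n\n### 7. Final delivery must run the quality harness\n\n```bash\npython scripts/run_quality_harness.py fixtures/quality_regression.json --workbook \u003cfinal.xlsx\u003e\n```\n\nNote: QA automatically reads the term sheet embedded in the workbook, term bases in the same directory, and term bases one level above common output directories. Add `--term-base \u003cterms.xlsx\u003e` only when auto-discovery fails or an additional term source must be forced.\n\n### 8. 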
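Treat term hard gates and soft terms separately\n\n- Terms not marked as soft references are strictly enforced by default; `term_missing`, `term_partial_hit`, and `term_capitalization` all block final delivery\n- If a term is only a generic reference, explicitly write `soft`, `generic`, `common`, `参考`, `泛词`, or `通用词` in the term base's `分类/category/type` field\n- For the delivery decision, check hard gates first: issues with variables, tags, line breaks, leftover Chinese, garbled text, bad abbreviations, clipped words, and obvious capitalization problems must all be cleared to zero\n- Person/character names are not soft hints that can be left in place; if the term base marks a term as a person name, `person_name_term_mismatch` must be cleared to zero before delivery\n- Soft-term issues are counted under `term_soft_*` and do not block final delivery\n\n## License\n\nMIT\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fzhangzeyu99-web%2Flocalization-workflow","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fzhangzeyu99-web%2Flocalization-workflow","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fzhangzeyu99-web%2Flocalization-workflow/lists"}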