{"id":50782737,"url":"https://github.com/binaryloader/korea-persona-interview","last_synced_at":"2026-06-12T05:01:36.981Z","repository":{"id":355172280,"uuid":"1227069430","full_name":"binaryloader/korea-persona-interview","owner":"binaryloader","description":"Synthetic Korean persona interview CLI using OpenAI Chat Completions API and NVIDIA Nemotron-Personas-Korea dataset (CC BY 4.0). Multi-turn interviews, MCP server, JSON mode, prompt caching.","archived":false,"fork":false,"pushed_at":"2026-05-02T08:30:07.000Z","size":2008,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-05-02T09:21:23.053Z","etag":null,"topics":["cli","interview","korean","llm","mcp","nemotron","openai","personas","python","synthetic-data"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/binaryloader.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-05-02T06:46:51.000Z","updated_at":"2026-05-02T08:30:08.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/binaryloader/korea-persona-interview","commit_stats":null,"previous_names":["binaryloader/korea-persona-interview"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/binaryloader/korea-persona-interview","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/reposi
tories/binaryloader%2Fkorea-persona-interview","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/binaryloader%2Fkorea-persona-interview/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/binaryloader%2Fkorea-persona-interview/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/binaryloader%2Fkorea-persona-interview/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/binaryloader","download_url":"https://codeload.github.com/binaryloader/korea-persona-interview/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/binaryloader%2Fkorea-persona-interview/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34229624,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-12T02:00:06.859Z","response_time":109,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cli","interview","korean","llm","mcp","nemotron","openai","personas","python","synthetic-data"],"created_at":"2026-06-12T05:01:36.142Z","updated_at":"2026-06-12T05:01:36.910Z","avatar_url":"https://github.com/binaryloader.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"English | [한국어](docs/i18n/ko/README.md) | [日本語](docs/i18n/ja/README.md)\n\n# 
korea-persona-interview\n\n[![CI](https://github.com/binaryloader/korea-persona-interview/actions/workflows/test.yml/badge.svg)](https://github.com/binaryloader/korea-persona-interview/actions/workflows/test.yml)\n\nA field-ready CLI for running synthetic Korean persona interviews on top of OpenAI, Anthropic Claude, or any OpenAI-compatible local LLM (mlx_lm.server, vLLM, llama.cpp). Pair the NVIDIA Nemotron-Personas-Korea dataset (CC BY 4.0, about 1M Korean synthetic personas) with the model of your choice to pressure-test product ideas, interview guides, and persona hypotheses before recruiting real participants.\n\nThe tool ships four CLI subcommands (`healthcheck`, `list-personas`, `interview`, `report`), a JSON output mode for machine-to-machine use, and a Model Context Protocol (MCP) entry point that runs in either MCP server mode (server-side OpenAI/Anthropic calls) or MCP orchestrator mode (the host agent's sub-agent does the LLM work).\n\n## Features\n\n- Multi-turn interviews with 1M+ Korean synthetic personas (NVIDIA Nemotron-Personas-Korea, CC BY 4.0)\n- Three inference targets: OpenAI Chat Completions API, Anthropic Messages API, and any OpenAI-compatible local server\n- Async batch runner with concurrency 1-10, tqdm progress, SIGINT partial save, and exit-code 3 partial-failure detection\n- Persona drift detection with sentence-bounded first-person assertions for the gender/age/region/family-type axes (negation guards, third-person exclusion) plus an English-ratio safety net\n- `--persona-id` to pin specific personas by uuid for A/B comparisons; `--resume PATH` to re-run only the failed records of a previous batch\n- `--insight-model` to run interviews on a small model and the qualitative-insight call on a larger one\n- OpenAI streaming (`llm.streaming: true`) and Anthropic prompt caching (`llm.anthropic_cache_control: true`, default on)\n- LLM-as-judge drift refinement (`heuristics.llm_drift_review`, opt-in) for clearing false positives\n- 
`acceptable_price_signal` (`cheap`/`fair`/`expensive`/`null`) on every structured summary, plus optional WTP recommendation from the signal distribution\n- MCP entry point (`python -m src.mcp_server`) for Claude Code, Cursor, and Codex. `mcp.mode` toggles between `orchestrator` (default, no server-side key) and `server` (server-side OpenAI/Anthropic calls)\n- Automatic markdown report after every run (toggle with `--no-report`) and `--json` root mode for shell scripts\n- Single-turn mode (`--single-turn`) that bundles every question into one chat call to cut tokens\n- Token usage (prompt / completion / cached) printed at the end of every run and embedded in the JSON and report header\n- Reproducible sampling via `--seed`. Same seed plus same filter plus same dataset version returns the same personas\n- Operational hardening: persona ids sha256-masked in logs, `outputs/` created with mode 0700 (result files 0600), `--product` and per-question text length-capped at 2000 chars with prompt-injection guards\n- No external telemetry. Outbound calls go only to the configured LLM endpoint and (on first run) Hugging Face Hub for the dataset\n\n## Requirements\n\n- Python 3.12 (pinned in `.python-version`)\n- [uv](https://docs.astral.sh/uv/) package manager\n- An API key for the provider you plan to use:\n  - `OPENAI_API_KEY` for `provider=openai` (default). Get one at https://platform.openai.com/api-keys\n  - `ANTHROPIC_API_KEY` for `provider=anthropic`. Get one at https://console.anthropic.com/\n  - For local LLMs (mlx_lm.server, vLLM, llama.cpp) keep `provider=openai` and use any non-empty value\n- Internet access for the LLM API call and the first dataset download (about 1M records, cached afterwards under `~/.cache/huggingface`)\n- macOS, Linux, and Windows are all supported. There is no Apple Silicon, GPU, or local-runtime requirement\n\n## Installation\n\n`.python-version` pins Python 3.12, so `uv venv` picks the right interpreter automatically. 
Production deploys must install from the lockfiles to keep the resolved graph identical across environments.\n\n```bash\nuv venv --python 3.12\nsource .venv/bin/activate\nuv pip sync requirements.lock requirements-dev.lock\n```\n\nRecompile the lockfiles after editing `requirements*.txt`.\n\n```bash\nuv pip compile requirements.txt -o requirements.lock\nuv pip compile requirements-dev.txt -o requirements-dev.lock\n```\n\nTo run the CLI as `kpi` and the MCP server as `kpi-mcp-server` from anywhere, install the project in editable mode after the dependency sync.\n\n```bash\nuv pip install -e .\n```\n\nPlain pip works too if you cannot use uv.\n\n```bash\npython -m venv .venv\nsource .venv/bin/activate\npip install -e \".[dev]\"\n```\n\nDirect runtime dependencies live in `pyproject.toml` (`[project.dependencies]`). The official `openai` and `anthropic` SDKs are intentionally not used; calls go through `httpx` so the project keeps its dependency tree small and owns the retry, timeout, and logging policy. See [docs/adr/2026-05-02-openai-backend-migration.md](docs/adr/2026-05-02-openai-backend-migration.md) for the rationale.\n\n## Quick Start\n\nFive commands take you from a fresh checkout to a finished report. 
The first interview run downloads the dataset (5-10 minutes); subsequent runs start in under 30 seconds.\n\n```bash\nexport OPENAI_API_KEY=sk-...\npython main.py healthcheck\npython main.py list-personas --filter \"age:25-39,region:서울특별시\" --limit 20\npython main.py interview --product \"1인 가구용 반찬 정기배송, 월 39,900원, 주 2회 배송\" --filter \"age:25-39,region:서울특별시\" --n 10 --questions \"이 서비스 쓰실 의향 있나요?\" \"월 얼마면 적당한가요?\" \"거절한다면 왜요?\"\npython main.py report outputs/interview_korea-persona-interview_20260502_120000.json\n```\n\nThe `interview` command auto-generates the markdown report (default `--report`); the standalone `report` step is only needed if you used `--no-report`, edited the JSON, or want to regenerate with different `--top-n` or `--include-drift` settings.\n\nA `.env` file at the project root with `OPENAI_API_KEY=sk-...` (or `ANTHROPIC_API_KEY=sk-ant-...`) is picked up automatically. Existing shell environment variables take precedence over `.env`.\n\nTo use Claude instead, set `ANTHROPIC_API_KEY` and pass `--provider anthropic`.\n\n```bash\nexport ANTHROPIC_API_KEY=sk-ant-...\npython main.py interview --provider anthropic --model claude-haiku-4-5 --product \"...\" --questions \"...\" --n 10\n```\n\nTo use a local OpenAI-compatible server, keep `provider=openai` and override `--base-url`. 
Any non-empty `OPENAI_API_KEY` works; local servers ignore the value.\n\n```bash\nexport OPENAI_API_KEY=local\npython main.py interview --base-url http://localhost:8080/v1 --model llama-3-8b --product \"...\" --questions \"...\" --n 10\n```\n\n## Usage Examples\n\n### Validate a product idea\n\n```bash\npython main.py interview --product \"1인 가구용 반찬 정기배송, 월 39,900원, 주 2회 배송\" --filter \"age:25-39,region:서울특별시\" --n 10 --seed 42 --questions \"이 서비스 쓰실 의향 있나요?\" \"월 얼마면 적당한가요?\" \"거절한다면 왜요?\"\n```\n\nThe run produces a markdown report with intent share (positive/neutral/negative), willingness-to-pay median plus IQR, top rejection reasons, and 5-10 actionable insights for the next round.\n\n### A/B test product copy on the same personas\n\nPin the same persona ids across two runs by extracting them from the first batch and replaying them on the second.\n\n```bash\npython main.py interview --product \"직장인 1인 가구를 위한 건강 반찬, 월 39,900원\" --filter \"age:25-39,region:서울특별시\" --n 10 --seed 42 --questions \"쓸 의향?\" \"월 얼마면?\" \"거절 사유?\" --output outputs/copy-a/\n\npython -c \"import json,sys; d=json.load(open(sys.argv[1])); print('\\n'.join(r['persona_id'] for r in d['records']))\" outputs/copy-a/interview_*.json \u003e /tmp/persona_ids.txt\n\nxargs -I {} echo --persona-id {} \u003c /tmp/persona_ids.txt | xargs python main.py interview --product \"주말에 받는 1주일치 한식 반찬 박스, 월 39,900원\" --questions \"쓸 의향?\" \"월 얼마면?\" \"거절 사유?\" --output outputs/copy-b/\n```\n\nBoth runs interview the exact same persona ids, so the only variable is the product copy.\n\n### Cohort comparison\n\n```bash\npython main.py interview --product \"직장인 1인 가구를 위한 건강 반찬 정기배송\" --filter \"age:20-29\" --n 15 --seed 42 --questions \"쓸 의향?\" \"월 얼마면?\" \"거절 사유?\" --output outputs/cohort-20s/\npython main.py interview --product \"직장인 1인 가구를 위한 건강 반찬 정기배송\" --filter \"age:30-39\" --n 15 --seed 42 --questions \"쓸 의향?\" \"월 얼마면?\" \"거절 사유?\" --output outputs/cohort-30s/\n```\n\nThe cohort intent table inside each report further splits 
by region and gender, so you can see whether a 20s/30s gap holds across all regions or comes from one segment.\n\n### Large-scale screen with single-turn mode\n\nSingle-turn mode bundles every question into one chat call, which roughly halves the prompt tokens versus multi-turn. The auto follow-up is disabled in this mode.\n\n```bash\npython main.py interview --product \"1인 가구용 반찬 정기배송, 월 39,900원\" --filter \"age:20-49\" --n 100 --seed 42 --concurrency 8 --single-turn --questions \"이 서비스 쓸 의향?\" \"월 얼마면 적당?\" \"거절 사유?\"\n```\n\n### Resume after a partial-failure exit\n\nA 30-person batch hit rate-limit storms and the run exited with code 3. Re-run only the failed records on top of the previous JSON.\n\n```bash\npython main.py interview --product \"...\" --filter \"...\" --n 30 --seed 42 --questions \"...\" --resume outputs/interview_korea-persona-interview_20260502_120000.json\n```\n\n`meta_extra.previous_run_id` is set to the original `interview_id` so the two runs can be linked.\n\n### Tip: ask explicit value-pricing questions\n\n`willingness_to_pay` is filled in only when the persona names a specific number. To maximize the explicit-number rate, ask a direct value-pricing question.\n\n- \"본인은 월 얼마면 가입하시겠어요?\" (anchored to a monthly subscription)\n- \"월 39,900원이면 가입할 의향이 있으세요? 
아니면 얼마면 적당할까요?\" (counter-offer prompt)\n- \"비슷한 서비스에 한 달에 얼마까지 쓸 수 있어요?\" (ceiling probe)\n\nOpen-ended price questions often only return a qualitative signal (`acceptable_price_signal`), which is filled for every record but does not produce a `willingness_to_pay` integer.\n\n## CLI Reference\n\n### Subcommands\n\n| Command | Description | Exit codes |\n| --- | --- | --- |\n| `healthcheck` | Verify provider reachability and model availability | 0 ok, 1 missing key / 401 / 429 / unreachable |\n| `list-personas` | Preview personas matching a filter | 0 ok, 2 no match |\n| `interview` | Run a batch interview, save JSON, auto-generate report | 0 ok, 1 server error, 2 sample shortfall, 3 partial failure |\n| `report` | Generate a markdown report from an interview JSON | 0 ok, 1 input error, 2 no valid records |\n\nExit code 130 is reserved for `SIGINT` (Ctrl-C). The first interrupt saves a partial JSON; the second terminates immediately.\n\n### Root options\n\nThese apply to every subcommand and must be placed before the subcommand name.\n\n| Option | Default | Description |\n| --- | --- | --- |\n| `--config PATH` | `config.yaml` in cwd | Override the config file path |\n| `--no-color` | off | Disable ANSI color output (also honors `NO_COLOR` env) |\n| `--log-level LEVEL` | `INFO` (from yaml) | Set log level: `DEBUG`/`INFO`/`WARNING`/`ERROR` |\n| `--json` | off | Emit a single JSON document on stdout. Disables tqdm, color, and Korean labels. Errors land as `{\"error\": {...}}` with non-zero exit |\n\n### `interview` options\n\n| Option | Default | Description |\n| --- | --- | --- |\n| `--product TEXT` | required | One-line product description (max 2000 chars) |\n| `--questions TEXT` | required, repeatable | Each question is one `--questions` flag (max 2000 chars each) |\n| `--filter SPEC` | none | Filter DSL (see below) |\n| `--persona-id UUID` | none, repeatable | Pin specific persona ids by uuid. Disables `--n` and `--seed` randomization. 
Combine with `--filter` for an intersection |\n| `--n N` | `10` | Number of personas |\n| `--seed N` | `42` | Sampling seed |\n| `--concurrency N` | `4` | Async concurrency, range 1-10 |\n| `--persona-fields LIST` | `summary` | Comma-separated toggles: `summary`, `professional`, `sports`, `arts`, `travel`, `culinary`, `family` |\n| `--follow-up TEXT` | none, repeatable | Common follow-up question for every persona |\n| `--single-turn` | off | Bundle every question into one chat call. Auto follow-up disabled |\n| `--dry-run` | off | Run one persona, print to console, write neither JSON nor report |\n| `--output DIR` | `outputs/` | Result JSON directory |\n| `--report / --no-report` | `--report` | Auto-generate the markdown report after the interview |\n| `--resume PATH` | none | Re-run only the `failed` records of a previous result JSON |\n| `--provider {openai,anthropic}` | from `llm.provider` | LLM provider |\n| `--base-url URL` | from `llm.base_url` | LLM server base URL |\n| `--model MODEL_ID` | from `llm.model` | One-shot model override |\n\n### `report` options\n\n| Option | Default | Description |\n| --- | --- | --- |\n| `RESULT_PATH` | required (positional) | Path to an interview JSON |\n| `--top-n N` | `10` | Number of top rejection reasons |\n| `--include-drift` | off | Include `status: drift` records in quantitative aggregation |\n| `--output-dir DIR` | next to input JSON | Where to save the markdown report |\n| `--insight-model MODEL_ID` | from `common.report.insight_model` or `--model` | Use a different model for the qualitative-insight call only |\n\n`healthcheck` and `list-personas` accept the same provider/base-url/model triple plus filter/limit/seed. See `python main.py {sub} --help` for the full list.\n\n### Filter DSL\n\nFilters use `key:value` pairs separated by commas. 
Different keys combine with AND, repeated keys combine with OR.\n\n- `age:25-39` (range), `age:30` (exact)\n- `gender:F`, `gender:M`, `gender:여자`, `gender:남자`, `gender:여성`, `gender:남성` (all map to `여자`/`남자`)\n- `region:서울특별시`, `region:서울` (17 provinces, with full-name aliases)\n- `subregion:강남구` (suffix match against the `district` column)\n- `occupation_keyword:개발자` (substring match)\n\nExamples.\n\n```text\n--filter \"age:25-39,region:서울특별시\"                    # 25-39 AND Seoul\n--filter \"age:25-39,region:서울특별시,region:경기도\"      # 25-39 AND (Seoul OR Gyeonggi)\n--filter \"gender:F,occupation_keyword:디자이너\"          # female AND occupation contains 디자이너\n```\n\n## Output Format\n\n### Result JSON\n\nInterview results are written to `outputs/interview_{slug}_{YYYYMMDD_HHMMSS}.json`. The envelope contains the run metadata (`interview_id`, `slug`, `product`, `model`, `seed`, `config_snapshot`) plus a `records` array. Each record holds `persona_meta`, the multi-turn `messages`, per-question `raw_responses`, a `structured_summary`, and `flags`.\n\n| Field | Notes |\n| --- | --- |\n| `interview_id` | uuid, one per run |\n| `schema_version` | `2` since v1.1.0 (was `1` in v1.0.x). Readers can branch on this to handle the `acceptable_price_signal` field |\n| `model` | Resolved model id (e.g. `gpt-4o-mini`) |\n| `meta_extra.usage` | Aggregated `prompt_tokens`, `completion_tokens`, `total_tokens`, `cached_tokens` |\n| `meta_extra.previous_run_id` | Set when the run came from `--resume`. Holds the source run's `interview_id` |\n| `records[].status` | `completed` / `refused` / `failed` / `drift` |\n| `records[].structured_summary` | `intent`, `acceptable_price_signal`, `willingness_to_pay`, `willingness_to_pay_currency`, `rejection_reasons`, `one_line` |\n| `records[].flags` | `persona_drift`, `auto_follow_up_used`, `refusal_detected`, `truncated`, `parse_failed` |\n\nSee `docs/prd/korea-persona-interview.md` section 5.4 for the full schema. 
v1 JSON files load fine on v1.1.0+ (the loader fills `acceptable_price_signal=null`).\n\n### Markdown report\n\nThe report subcommand emits `outputs/report_{slug}_{YYYYMMDD_HHMMSS}.md` next to the input JSON by default.\n\n```text\n# 가상 인터뷰 리포트: {product}\n| meta table | model, seed, persona counts, dataset, usage |\n\n## 1. 정량 지표\n### 1.1. 의향률          # intent share table + bar chart\n### 1.2. 가격 수용가     # WTP median, IQR, histogram\n### 1.3. 거절 사유 빈도  # top-N rejection reasons table\n### 1.4. 코호트별 의향률 # age x region x gender, masked under min cell size\n\n## 2. 정성 인사이트\n### 2.1. 공통 반응       # up to 5 shared reactions\n### 2.2. 인사이트        # 5-10 actionable insights\n### 2.3. 코호트 차이     # cohort-level qualitative differences\n\n## 3. 제외 record 요약   # excluded record counts and reasons\n\n## 4. 한계와 출처        # synthetic-data caveat, dataset citation, model id\n```\n\n## Configuration\n\nSettings policy: `secrets via env, defaults via yaml, one-off overrides via CLI`. Configuration precedence (later overrides earlier): built-in defaults → `config.yaml` → CLI options.\n\nApart from the `NO_COLOR` variable honored alongside `--no-color`, the only environment variables this tool reads are the API-key secrets and the output directory.\n\n| Variable | Purpose |\n| --- | --- |\n| `OPENAI_API_KEY` | OpenAI API key (used when `provider=openai`) |\n| `ANTHROPIC_API_KEY` | Anthropic API key (used when `provider=anthropic`) |\n| `KPI_OUTPUT_DIR` | Output directory override (kept for test/CI isolation) |\n\nThe full annotated yaml lives in [config.yaml](config.yaml). Notable keys:\n\n- `llm.provider` / `llm.base_url` / `llm.model` - provider and endpoint. 
Defaults flip with `--provider anthropic` (`claude-haiku-4-5` on `https://api.anthropic.com/v1`)\n- `llm.context_budget` - 32000 token budget for multi-turn history (oldest user/assistant pairs dropped first; system prompt preserved)\n- `llm.streaming` / `llm.anthropic_cache_control` / `llm.extra_chat_kwargs` - provider-specific tuning\n- `batch.concurrency` (1-10, default 4) and `batch.partial_failure_threshold` (default 0.5)\n- `common.dataset.field_map`, `common.dataset.gender_aliases`, `common.dataset.province_aliases` - column and value aliases for dataset schema changes\n- `common.persona.fields` and `common.persona.system_prompt_path` - persona toggles and system prompt template path\n- `common.report.cohort_min_cell` / `histogram_bins` / `bar_width` / `insight_model` / `estimate_wtp_from_signal`\n- `common.output.output_dir` / `log_level` / `no_color`\n- `heuristics.short_answer_threshold` / `english_ratio_threshold` / `ambiguous_keywords` / `refusal_keywords` / `auto_follow_up_text` / `auto_follow_up_max` / `occupation_english_whitelist` / `llm_drift_review`\n- `mcp.mode` - `orchestrator` (default, no server-side key) or `server` (server-side OpenAI/Anthropic). See ADR-005 for the rationale\n\n### Choosing a model\n\n`gpt-4o-mini` is the default and gives a strong baseline for this workload. If you measure persona-drift rates above 5% on your own runs, try the alternatives below.\n\n- `gpt-4o-mini` (OpenAI) - default. Good Korean fluency and persona adherence\n- `gpt-4o` (OpenAI) - higher quality\n- `claude-haiku-4-5` (Anthropic) - default for `--provider anthropic`\n- `claude-sonnet-4-5` / `claude-opus-4-5` (Anthropic) - higher quality\n- Local LLMs via `mlx_lm.server`, `vLLM`, or `llama.cpp` work as long as they expose the OpenAI Chat Completions API surface. Korean fluency depends on the underlying weights; validate persona drift on a small sample first\n\nPersona-drift behavior has been validated end-to-end with `gpt-4o-mini`. 
Other models may need tuned thresholds (`heuristics.english_ratio_threshold`, `heuristics.short_answer_threshold`).\n\n### Customization\n\n- System prompt: edit `prompts/system_prompt.txt` (must contain `{persona_json}` and `{product}` placeholders). Point `common.persona.system_prompt_path` at a different file to use your own template\n- Heuristic thresholds: tune `heuristics.*` in `config.yaml` (lower `short_answer_threshold` for tighter follow-ups, raise `english_ratio_threshold` for technical domains, append domain-specific phrases to `refusal_keywords`/`ambiguous_keywords`)\n- Report output: raise `common.report.cohort_min_cell` to 5 or 7 for tighter masking; lower `bar_width` for narrow terminals; tune `histogram_bins` for different price resolution\n\n## Integration with External Agents\n\nThere are three entry points: CLI, MCP server, and MCP orchestrator. They are not interchangeable - the choice depends on whether you want server-side LLM calls (CLI, MCP server) or whether the host agent's sub-agent does the LLM work (MCP orchestrator).\n\n### Entry point matrix\n\n| Entry point | mode (yaml) | Server-side LLM call | Host LLM call | API key required |\n| --- | --- | --- | --- | --- |\n| CLI (`kpi`) | n/a | yes | no | provider-dependent |\n| MCP server | `mcp.mode: \"server\"` | yes | no | provider-dependent |\n| MCP orchestrator | `mcp.mode: \"orchestrator\"` (default) | no | yes (host sub-agent) | none |\n\nThere is no automatic fallback between modes. The chosen path is reflected on every response as `\"backend\": \"mcp_server\"` or `\"backend\": \"mcp_orchestrator\"`. 
ADR-005 captures the rationale (sampling mode was removed in v1.2.0 because mainstream MCP clients did not advertise the capability).\n\nIf you run `python -m src.mcp_server` outside an MCP host with `mcp.mode: \"orchestrator\"`, the helper tools still work but `interview` is blocked with a hint to use `build_batch_prompts` + sub-agent + `aggregate_results` instead.\n\n### Tool exposure by mode\n\n| Tool | MCP server | MCP orchestrator | Notes |\n| --- | --- | --- | --- |\n| `healthcheck` | yes | yes | server mode pings the provider; orchestrator mode returns ok + cwd |\n| `list_personas` | yes | yes | preview personas matching a filter |\n| `interview` | yes | no (blocked) | server-side batch interview |\n| `report` | yes | yes | server mode runs the qualitative-insight LLM call; orchestrator mode skips it |\n| `build_persona_prompt` | no | yes | system prompt + persona dict for one persona |\n| `build_batch_prompts` | no | yes | system prompts for N personas (host sub-agent fan-out) |\n| `aggregate_results` | no | yes | takes records from the host and emits the markdown report |\n| `detect_persona_drift` / `should_auto_follow_up` / `parse_structured_summary` / `interview_record_schema` | yes | yes | helpers. CLI and MCP server auto-apply; MCP orchestrator must invoke explicitly |\n\n### Registering the MCP entry point\n\nRun the server manually to verify it starts.\n\n```bash\npython -m src.mcp_server\n```\n\nRegister it in Claude Code by adding the snippet below to `~/.claude/mcp.json` (create the file if it does not exist). 
The `cwd` must point at the project root so that `config.yaml`, `prompts/system_prompt.txt`, `.env`, and `outputs/` resolve correctly.\n\n```json\n{\n  \"mcpServers\": {\n    \"korea-persona-interview\": {\n      \"command\": \"/absolute/path/to/.venv/bin/python\",\n      \"args\": [\"-m\", \"src.mcp_server\"],\n      \"cwd\": \"/absolute/path/to/korea-persona-interview\"\n    }\n  }\n}\n```\n\nFor Cursor, add the snippet to `.cursor/mcp.json` at the project root. Drop-in copies live under [examples/mcp/](examples/mcp/).\n\nIn MCP server mode, drop your `OPENAI_API_KEY` (or `ANTHROPIC_API_KEY`) into the project's `.env` before the first run. The stdlib `.env` loader uses `setdefault` semantics so a key already exported in the shell wins. Putting the key in the agent's mcp.json `env` block also works but the secret ends up in plaintext inside the agent's config and is more likely to leak through git, dotfile sync, or screenshots.\n\n### MCP orchestrator mode usage (default)\n\nThe host agent owns the LLM. The flow:\n\n1. Call `build_batch_prompts` with `product`, `questions`, `n` (and optionally `filter`, `seed`, `persona_ids`). Returns N system prompts plus persona dicts\n2. The host fans out N sub-agents (one per persona). Each sub-agent uses its own LLM with the returned system prompt as the system message and the questions as user turns. The host can also call `should_auto_follow_up` and `detect_persona_drift` between turns to keep behavior parity with the CLI heuristic\n3. After the LLM call the host calls `parse_structured_summary` on the LLM's structured-summary text to get a normalized dict, then assembles a record per `interview_record_schema`\n4. The host calls `aggregate_results` with the assembled `records`. The tool runs the quantitative aggregation and writes the markdown report. 
Qualitative insights default to a fallback message; the host can pass its own as `insights` to be embedded\n\n### MCP server mode usage\n\nSet `mcp.mode: \"server\"` in `config.yaml` to call OpenAI/Anthropic server-side. Ask the agent in plain Korean: \"1인 가구 대상 반찬 정기배송 (월 39,900원)을 25-39세 서울 30명에게 인터뷰 돌리고 리포트까지 만들어 줘\" and it will call `interview` then `report` back-to-back, returning the markdown path.\n\n### --json mode for shell scripts\n\nFor agents that drive a CLI directly, pass `--json` at the root group. Disables tqdm, color, and Korean labels; emits a single JSON document on stdout. Logs continue to flow to stderr and `outputs/logs/run_*.jsonl`.\n\n```bash\npython main.py --json healthcheck\n# {\"ok\": true, \"base_url\": \"https://api.openai.com/v1\", \"model\": \"gpt-4o-mini\", \"models\": [...]}\n\npython main.py --json interview --product \"...\" --questions \"...\" --n 10\n# {\"ok\": true, \"output_path\": \"outputs/interview_*.json\", \"report_path\": \"outputs/report_*.md\", \"summary\": {...}, \"usage\": {...}, \"model\": \"gpt-4o-mini\"}\n```\n\nErrors are emitted as `{\"error\": {\"code\": \"...\", \"message\": \"...\", \"exit_code\": N}}` with a non-zero exit code.\n\n## Development\n\n```bash\nuv venv --python 3.12\nsource .venv/bin/activate\nuv pip sync requirements.lock requirements-dev.lock\npytest tests/ -v\n```\n\nThe suite mocks the OpenAI/Anthropic APIs with `pytest-httpx` and the dataset with monkeypatch fixtures, so it does not require a live API key or network access. Coverage spans config, filter DSL, persona loader, LLM client/backend, interview session, persona drift, batch runner, report quant, MCP dispatch in both modes, MCP orchestrator helper tools, error messages, logging, and CLI integration.\n\nManual smoke tests that exercise a real LLM API call live under `tests/manual/` and are excluded from the default run.\n\nUse Conventional Commits (`feat:`, `fix:`, `chore:`, `docs:`, `refactor:`, `test:`). 
Do not put `Co-Authored-By` trailers on commits.\n\n## Limitations and Disclaimer\n\nSynthetic personas are not a replacement for real user interviews. The dataset is generated, not sampled from real respondents, so the demographic distribution may diverge from the actual Korean population. Treat the output as a quick gut check before recruiting real participants and as a way to pressure-test interview questions and product copy before spending recruitment budget.\n\nEvery report and JSON file produced by this tool also carries the synthetic-data disclaimer in its footer.\n\nThe `--product` text and the persona metadata used for each interview are sent to whichever LLM endpoint you configure (OpenAI, Anthropic, a local server, or the MCP host agent's LLM). Do not put unreleased IP, trade secrets, or personally identifiable information into `--product`. Abstract or paraphrase sensitive parts before running the tool. The tool itself ships no external telemetry beyond the LLM call and the initial dataset download from Hugging Face.\n\nAPI billing is the user's responsibility. Token usage (prompt / completion / cached) is printed at the end of each run, written into the result JSON `meta_extra.usage`, and surfaced in the report header so you can correlate it against your provider's invoice. The tool does not estimate USD cost. Persona-drift quality is validated against `gpt-4o-mini`; other models may need tuned thresholds.\n\nLegal and ethical review of the output is the user's responsibility. The tool does not run any compliance or PII filter beyond the input-secret policy.\n\n## Roadmap\n\nA short list of v1.3.0 candidates. 
Full details in [docs/backlog/v1.3.0.md](docs/backlog/v1.3.0.md).\n\n- FastAPI REST API on top of the same application layer\n- OpenAI Batch API path for offline runs\n- Multi-model A/B routing (run the same persona sample on two different models and diff the outputs)\n- Provider quality validation report (golden-dataset drift measurement for OpenAI, Anthropic, local LLM)\n- macOS Keychain / Linux libsecret / Windows Credential Manager integration for API keys\n- Per-record streaming write to disk so OOM/crash mid-batch loses fewer records than the SIGINT partial save\n\n## Dataset and Credits\n\nThis project uses the [nvidia/Nemotron-Personas-Korea](https://huggingface.co/datasets/nvidia/Nemotron-Personas-Korea) dataset.\n\n- Title: Nemotron-Personas-Korea\n- Author: NVIDIA Corporation (2025)\n- Source: https://huggingface.co/datasets/nvidia/Nemotron-Personas-Korea\n- License: [Creative Commons Attribution 4.0 International (CC BY 4.0)](https://creativecommons.org/licenses/by/4.0/)\n- Modifications: none. The dataset is downloaded from Hugging Face Hub at runtime and sampled in-memory. No derivative dataset is redistributed by this repository\n\nThe dataset contains about 1M records and 7M synthetic Korean personas, covering name, gender, age, marital status, education, occupation, residence (province and district), and seven persona facets (professional, sports, arts, travel, culinary, family, summary).\n\nCC BY 4.0 permits commercial use with attribution. Credit goes to NVIDIA Corporation. 
Every markdown report and JSON record produced by this tool also carries the dataset citation and license in its footer so attribution travels with downstream artifacts.\n\n## Acknowledgments\n\nThis project was developed with [Claude Code](https://claude.com/claude-code).\n\n## License\n\nThis project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbinaryloader%2Fkorea-persona-interview","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbinaryloader%2Fkorea-persona-interview","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbinaryloader%2Fkorea-persona-interview/lists"}