{"id":46705943,"url":"https://github.com/pppp606/ai-comprehension-test","last_synced_at":"2026-03-09T08:11:24.365Z","repository":{"id":321154514,"uuid":"1081672209","full_name":"pppp606/ai-comprehension-test","owner":"pppp606","description":null,"archived":false,"fork":false,"pushed_at":"2025-10-28T04:04:26.000Z","size":81,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-10-28T05:24:40.518Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/pppp606.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-10-23T05:55:50.000Z","updated_at":"2025-10-28T04:04:30.000Z","dependencies_parsed_at":"2025-10-28T05:24:43.292Z","dependency_job_id":"86841628-5a88-4534-a767-c904d22428b3","html_url":"https://github.com/pppp606/ai-comprehension-test","commit_stats":null,"previous_names":["pppp606/ai-comprehension-test"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/pppp606/ai-comprehension-test","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pppp606%2Fai-comprehension-test","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pppp606%2Fai-comprehension-test/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pppp606%2Fai-comprehension-test/releases","manifests_url":"https://repos.ec
osyste.ms/api/v1/hosts/GitHub/repositories/pppp606%2Fai-comprehension-test/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/pppp606","download_url":"https://codeload.github.com/pppp606/ai-comprehension-test/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pppp606%2Fai-comprehension-test/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":30287500,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-09T02:57:19.223Z","status":"ssl_error","status_checked_at":"2026-03-09T02:56:26.373Z","response_time":61,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-03-09T08:11:21.081Z","updated_at":"2026-03-09T08:11:24.353Z","avatar_url":"https://github.com/pppp606.png","language":"TypeScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# AI Comprehension Test\n\n`ai-comprehension-test` is a CLI tool that evaluates how easily an AI coding agent (such as Claude Code or Codex) can understand a TypeScript codebase. The tool scans selected project files, generates AI-driven comprehension tests, executes them through a local AI agent, and summarizes the findings in either a console or JSON report.\n\nImportant: the goal is not to “make tests pass”, but to detect whether the code is easy for an AI to read accurately. 
Reports focus on grounded, code-backed signals rather than pure prompt compliance.\n\n## Key Features\n\n- **Automated project scanning** – analyzes classes, methods, and functions in your TypeScript project.\n- **Multiple comprehension test types** – static analysis, stability checks (local, deterministic scoring), and AI-generated Jest tests.\n- **Groundedness (AST-based)** – extracts reference facts from code (defaults/limits/normalization/constants) and compares them to the LLM’s MR to assess factual alignment.\n- **AI agent orchestration** – prompts a local Claude Code or Codex instance and parses structured responses.\n- **Detailed reporting** – view friendly console summaries or machine-readable JSON output with actionable insights.\n\n## Prerequisites\n\n- Node.js v18 or later\n- npm v9 or later\n- A local AI agent binary available on your `PATH` (Claude Code or Codex)\n- Optional: `claude` CLI with the `-p` flag or `codex exec`\n\nIf you plan to run test-generation suites, ensure `jest` and `ts-jest` dependencies are installed (they are listed in `package.json`).\n\n## Installation\n\nClone the repository and install dependencies:\n\n```bash\ngit clone https://github.com/pppp606/ai-comprehension-test.git\ncd ai-comprehension-test\nnpm install\n```\n\nBuild the CLI:\n\n```bash\nnpm run build\n```\n\n## Environment Configuration\n\nYou can configure how the CLI invokes the AI agent and the evaluation behavior via environment variables:\n\n| Variable | Description | Default |\n|----------|-------------|---------|\n| `AI_COMP_TEST_COMMAND` | Command used to call the AI agent. | `claude` |\n| `AI_COMP_TEST_ARGS` | Arguments passed to the AI agent command. | `-p` |\n| `AI_COMP_TEST_TIMEOUT` | Timeout in seconds for AI calls. | `180` |\n| `AI_COMP_TEST_DEBUG` | Enable verbose debugging output (`true` / `false`). | `false` |\n| `AI_COMP_TEST_TYPES` | Comma-separated test types to run: `static-analysis,stability,test-generation`. 
| (all) |\n| `AI_COMP_TEST_STABILITY_ITER` | Number of responses per stability test (lower values run faster). | `5` |\n| `AI_COMP_TEST_STABILITY_PROMPT_MODE` | `gamified` to enable +1/0/−1 scoring rule in prompts. | (off) |\n| `AI_COMP_TEST_INCLUDE_SCHEMA` | `1` to include JSON Schema and example in prompts. | `1` |\n| `AI_COMP_TEST_RECOMPUTE_COVERAGE` | `1` to recompute coverage metrics from MR in the runner. | (off) |\n| `AI_COMP_TEST_AMBIGUITY` | `model` or `hybrid` to enable Transformers.js ambiguity detector. | (off) |\n| `AI_COMP_TEST_AMBIGUITY_MODEL` | ZSC model id (e.g., `Xenova/distilbert-base-uncased-mnli`). | (default) |\n| `AI_COMP_TEST_INCLUDE_AMBIGUITY_IN_SCORE` | `1` to blend model specificity into stability score. | (off) |\n| `AI_COMP_TEST_AMBIGUITY_WEIGHT` | Blend weight (0–0.5). | `0.15` |\n\nFor example, to run against Codex:\n\n```bash\nexport AI_COMP_TEST_COMMAND=codex\nexport AI_COMP_TEST_ARGS=\"exec\"\n```\n\n### Using .env\n\nThis CLI loads environment variables from a `.env` file in the current working directory. An example template is provided as `.env.example`.\n\nExample `.env`:\n\n```\nAI_COMP_TEST_COMMAND=claude\nAI_COMP_TEST_ARGS=-p\nAI_COMP_TEST_TIMEOUT=180\n# AI_COMP_TEST_WORKDIR=/absolute/path/to/workdir\n```\n\nNotes:\n- `AI_COMP_TEST_ARGS` is split by spaces. Prefer `--flag=value` for values containing spaces.\n- Shell environment variables override values from `.env`.\n- The `.env` file should be placed in the directory where you run the CLI.\n\n## Usage\n\nThe CLI exposes a single `run` command. Invoke it from the root of the project you want to analyze:\n\n```bash\nnpx ai-comprehension-test run [project-path]\n```\n\n### Options\n\n- `--files \u003cfiles\u003e` – Comma-separated list of files to limit the scan, e.g. 
`--files \"src/UserService.ts,src/PaymentService.ts\"`.\n- `--output \u003cdir\u003e` – Directory for generated artifacts (defaults to `.ai-comp-test`).\n- `--format \u003cformat\u003e` – Output format: `console` (default) or `json`.\n- `--json-file \u003cfile\u003e` – When `--format json`, the output filename (default: `results.json`).\n- `--verbose` – Print detailed logs during scanning, AI calls, and reporting.\n- `--help` – Display help information.\n\n### Examples\n\nRun the full suite against the current project:\n\n```bash\nnpx ai-comprehension-test run\n```\n\nAnalyze a subset of files and produce a JSON report:\n\n```bash\nnpx ai-comprehension-test run \\\n  --files \"src/services/UserService.ts\" \\\n  --format json \\\n  --output ./results \\\n  --verbose\n```\n\nExecute against another project directory:\n\n```bash\nnpx ai-comprehension-test run ../another-project\n```\n\n### Output\n\n- **Console report** – Summaries, tables of test outcomes, and highlighted critical issues.\n- **JSON report** – Structured machine-readable output saved to `\u003coutput\u003e/results.json` (or `--json-file`) when `--format json` is specified.\n\n## Stability and Groundedness\n\n### Stability (consistency)\nStability judgment is local and deterministic:\n- Parses each MR response (with code‑fence and loose-object recovery).\n- Vectorizes per-field text using TF‑IDF; computes average pairwise cosine similarity per field; aggregates to a 0–100 `consistencyScore` and `consistencyLevel`.\n- Derives `mainIdea`, `variations`, `reasoning`, and `codeClarity`.\n- Optional: embedding/ZSC ambiguity integration via `@xenova/transformers`.\n\n### Groundedness (fact alignment)\nThe runner extracts a Reference MR from code via AST and compares it to the LLM MR:\n- Extracts defaults (`this.x=…`, `??`), limits (nested `Math.min/max`), normalization (s→ms), and constants.\n- Computes `groundedness.score = factCoverage` (0–100) and lists `mismatches`.\n- Recomputes 
`coverage.schemaCoverage/specificity` from the parsed MR to avoid under-reporting.\n\nThis makes “AI comprehension” observable as factual alignment to code, not only prompt adherence.\n\n### Prompt Schema and Example\nPrompts include the full MR schema (see `schemas/mr.schema.json`) and a concrete example. Keys must be present even when unknown; do not guess—use `unknowns` with reasons. Ambiguous language is disallowed.\n\nThis removes one LLM call from the Stability flow and improves reproducibility.\n\n### Embedding Mode (Optional)\n\nYou can switch to an embedding-based scorer for better synonym/phrasing robustness:\n\n- Set `AI_COMP_TEST_STABILITY_MODE=embedding`\n- Optional: `AI_COMP_TEST_EMBED_MODEL` to override the default `Xenova/all-mpnet-base-v2`.\n- On first run, the model is downloaded (network required). Subsequent runs are cached.\n\nThe embedding scorer creates sentence embeddings per field using `@xenova/transformers` (mean pooled) and uses pairwise cosine similarity, then aggregates like the default mode.\n\nNotes:\n- Default embedding model: `Xenova/all-mpnet-base-v2` (higher accuracy; heavier).\n- Lighter alternative: `Xenova/all-MiniLM-L6-v2` (faster; slightly lower accuracy).\n\n## Development Workflow\n\nStart the CLI in development mode with `tsx`:\n\n```bash\nnpm run dev -- run --help\n```\n\nRun TypeScript compilation checks:\n\n```bash\nnpm run build\n```\n\nExecute unit tests (if present, with Jest):\n\n```bash\nnpm test\n```\n\n\u003e **Note:** Initial dependency installation may require network access to fetch packages such as `@types/jest`.\n\n## Project Structure\n\n```\nai-comprehension-test/\n├── src/\n│   ├── cli/                # CLI entry point and command wiring\n│   ├── core/               # AI agent client, test generation, and runners\n│   ├── reporter/           # Console and JSON reporters\n│   ├── scanner/            # TypeScript AST scanner and type definitions\n│   ├── templates/          # Prompt templates for each test 
type\n│   └── types/              # Shared TypeScript interfaces\n├── jest.config.js          # Jest configuration for generated tests\n├── package.json            # Dependencies and scripts\n├── tsconfig.json           # TypeScript configuration\n└── README.md               # Project documentation\n```\n\n## Troubleshooting\n\n- **AI command not found:** Ensure `claude` or `codex` is installed locally and available on your `PATH`.\n- **Timeouts during AI calls:** Increase `AI_COMP_TEST_TIMEOUT` or verify that the agent is responsive.\n- **Jest test failures:** Review the generated test files in the output directory and rerun with `--verbose` for additional logs.\n\n## License\n\nThis project is licensed under the MIT License. See [LICENSE](LICENSE) for details.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpppp606%2Fai-comprehension-test","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpppp606%2Fai-comprehension-test","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpppp606%2Fai-comprehension-test/lists"}