{"id":50878986,"url":"https://github.com/geekychris/hitorro-prbench","last_synced_at":"2026-06-15T12:03:28.245Z","repository":{"id":352747990,"uuid":"1215486600","full_name":"geekychris/hitorro-prbench","owner":"geekychris","description":"HiTorro PR Bench - AI review bot benchmarking with replay engine, golden datasets, and F1 scoring","archived":false,"fork":false,"pushed_at":"2026-04-20T23:49:43.000Z","size":111,"stargazers_count":0,"open_issues_count":1,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-04-21T01:34:37.807Z","etag":null,"topics":["benchmark","github","hitorro","infra"],"latest_commit_sha":null,"homepage":null,"language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/geekychris.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-04-20T00:55:53.000Z","updated_at":"2026-04-20T23:43:06.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/geekychris/hitorro-prbench","commit_stats":null,"previous_names":["geekychris/hitorro-prbench"],"tags_count":3,"template":false,"template_full_name":null,"purl":"pkg:github/geekychris/hitorro-prbench","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/geekychris%2Fhitorro-prbench","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/geekychris%2Fhitorro-prbench/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/geekychris%2Fhitorro-prbench/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/geekychris%2Fhitorro-prbench/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/geekychris","download_url":"https://codeload.github.com/geekychris/hitorro-prbench/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/geekychris%2Fhitorro-prbench/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34361403,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-15T02:00:07.085Z","response_time":63,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["benchmark","github","hitorro","infra"],"created_at":"2026-06-15T12:03:26.010Z","updated_at":"2026-06-15T12:03:28.239Z","avatar_url":"https://github.com/geekychris.png","language":"Java","funding_links":[],"categories":[],"sub_categories":[],"readme":"# HiTorro PR Bench\n\nA standalone benchmarking platform for evaluating AI code review bots. PR Bench replays real pull requests as synthetic PRs in mirror repositories, triggers AI reviewer workflows, collects their comments, and measures quality against a human-curated golden dataset using precision, recall, and F1 metrics.\n\nBuilt with Java 21, Spring Boot 3.2, React 18, and H2 (file-based).\n\n## Feature Highlights\n\n**PR Benchmarking Pipeline**\n- Define benchmark suites from real GitHub PRs with known review comments\n- Register AI review bots with their GitHub Actions workflow definitions\n- Replay PRs as synthetic PRs in mirror repos, injecting bot workflows automatically\n- Collect all bot-generated comments (inline reviews, PR reviews, issue comments)\n- Grade comments as VALID, INVALID, DUPLICATE, or NEEDS_REVIEW -- individually or in bulk\n- Build a golden dataset by promoting exemplary comments\n- Measure precision/recall/F1 per bot, compare runs with McNemar's significance test, track trends over time\n\n**Repository Management**\n- Browse and bulk-import repos from GitHub (personal, org, or by visibility)\n- Tag repos locally with automatic sync to GitHub Topics\n- AI-generated descriptions via Ollama (scans README, file tree, commits, package manifests)\n- Docs scanning across all markdown/rst/adoc files in a repo\n- Faceted search and filtering by owner, language, tag, fork status, description\n- Markdown report generation with summary tables, doc links, and statistics\n\n## Architecture\n\n### System Overview\n\n```mermaid\ngraph TB\n    subgraph Frontend\n        React[React 18 SPA\u003cbr/\u003eVite + TypeScript]\n    end\n\n    subgraph Backend[\"Spring Boot 3.2 (port 8090)\"]\n        Controllers[REST Controllers]\n        Orchestrator[RunOrchestrator]\n        Replay[ReplayEngine]\n        Collector[CommentCollector]\n        Similarity[SimilarityService]\n        Reporting[ReportingService]\n        Ollama[OllamaService]\n        GitTools[hitorro-gittools]\n    end\n\n    subgraph Storage\n        H2[(H2 File DB\u003cbr/\u003e./data/prbench)]\n    end\n\n    subgraph External\n        GitHub[GitHub API]\n        OllamaServer[Ollama LLM\u003cbr/\u003elocalhost:11434]\n        Mirror[Mirror Repos\u003cbr/\u003eon GitHub]\n    end\n\n    React --\u003e|REST API| Controllers\n    Controllers --\u003e Orchestrator\n    Orchestrator --\u003e Replay\n    Orchestrator --\u003e Collector\n    Controllers --\u003e Similarity\n    Controllers --\u003e Reporting\n    Controllers --\u003e Ollama\n    Replay --\u003e GitTools\n    Replay --\u003e GitHub\n    Collector --\u003e GitHub\n    Ollama --\u003e OllamaServer\n    GitTools --\u003e|clone/push/branch| Mirror\n    Controllers --\u003e H2\n```\n\n### Benchmark Run Flow\n\n```mermaid\nsequenceDiagram\n    participant UI as React UI\n    participant API as RunController\n    participant Orch as RunOrchestrator\n    participant RE as ReplayEngine\n    participant GH as GitHub API\n    participant CC as CommentCollector\n\n    UI-\u003e\u003eAPI: POST /api/runs (suiteId, botIds, concurrency)\n    API-\u003e\u003eOrch: executeRun(run) [async]\n    API--\u003e\u003eUI: 200 OK (run created)\n\n    Note over Orch: Snapshot bot configs\n\n    loop For each SuitePr x Bot (semaphore-controlled)\n        Orch-\u003e\u003eRE: replay(suitePr, bot, repo)\n        RE-\u003e\u003eRE: Clone/fetch mirror repo\n        RE-\u003e\u003eRE: Create base branch at base SHA\n        RE-\u003e\u003eRE: Create head branch at head SHA\n        RE-\u003e\u003eRE: Inject bot workflow file + commit\n        RE-\u003e\u003eGH: Push branches, create PR\n        GH--\u003e\u003eRE: PR number + URL\n        RE--\u003e\u003eOrch: ReplayResult\n\n        Note over Orch: Poll for bot completion\n\n        loop Until checks/reviews complete or timeout\n            Orch-\u003e\u003eGH: Check run status / list reviews\n        end\n\n        Orch-\u003e\u003eCC: collectReplayPrComments(replayPr)\n        CC-\u003e\u003eGH: List review comments, reviews, issue comments\n        CC-\u003e\u003eCC: Normalize text, compute Winnowing hashes\n        CC--\u003e\u003eOrch: Comments saved\n    end\n\n    Orch-\u003e\u003eOrch: Mark run COMPLETED\n\n    UI-\u003e\u003eAPI: GET /api/runs/{id}/progress\n    API--\u003e\u003eUI: Status counts per replay PR\n```\n\n### Repository Management Components\n\n```mermaid\ngraph LR\n    subgraph \"Repo Management\"\n        Import[Import from GitHub]\n        Tags[Tag Management]\n        Desc[Description Management]\n        Docs[Docs Scanner]\n        Report[Report Generator]\n    end\n\n    subgraph \"External\"\n        GH[GitHub API]\n        LLM[Ollama LLM]\n    end\n\n    Import --\u003e|Browse user/org repos| GH\n    Tags --\u003e|Sync as Topics| GH\n    Desc --\u003e|Push description max 350 chars| GH\n    Desc --\u003e|Generate via scan| LLM\n    Docs --\u003e|Git tree API recursive| GH\n    Report --\u003e|Markdown with tables| Output[Markdown Output]\n\n    LLM -.-\u003e|README + file tree + commits + manifest| GH\n```\n\n### Database Schema (Key Tables)\n\n```mermaid\nerDiagram\n    exemplar_repos ||--o{ benchmark_suites : has\n    benchmark_suites ||--o{ suite_prs : contains\n    benchmark_suites ||--o{ benchmark_runs : has\n    benchmark_runs }o--o{ bots : uses\n    benchmark_runs ||--o{ replay_prs : creates\n    benchmark_runs ||--o{ bot_snapshots : freezes\n    replay_prs }o--|| suite_prs : replays\n    replay_prs }o--|| bots : uses\n    replay_prs ||--o{ review_comments : has\n    suite_prs ||--o{ original_comments : has\n    suite_prs ||--o{ golden_dataset_entries : has\n    review_comments ||--o{ gradings : graded_by\n    review_comments ||--o{ comment_similarities : compared_in\n    original_comments ||--o{ comment_similarities : compared_in\n\n    exemplar_repos {\n        bigint id PK\n        varchar name\n        varchar github_url\n        varchar owner\n        varchar repo_name\n        varchar mirror_org\n        varchar mirror_repo_name\n        varchar default_branch\n    }\n\n    benchmark_suites {\n        bigint id PK\n        varchar name\n        bigint exemplar_repo_id FK\n    }\n\n    suite_prs {\n        bigint id PK\n        bigint suite_id FK\n        int original_pr_number\n        varchar base_commit_sha\n        varchar head_commit_sha\n        int files_changed\n    }\n\n    bots {\n        bigint id PK\n        varchar name\n        varchar workflow_file_name\n        clob workflow_content\n        varchar wait_strategy\n        int timeout_seconds\n    }\n\n    benchmark_runs {\n        bigint id PK\n        bigint suite_id FK\n        varchar status\n        int concurrency\n        boolean golden_dataset_enabled\n    }\n\n    replay_prs {\n        bigint id PK\n        bigint run_id FK\n        bigint suite_pr_id FK\n        bigint bot_id FK\n        int mirror_pr_number\n        varchar status\n    }\n\n    review_comments {\n        bigint id PK\n        bigint replay_pr_id FK\n        varchar source\n        varchar comment_type\n        varchar file_path\n        int line_number\n        clob body_normalized\n        varchar fingerprint_hash\n    }\n\n    golden_dataset_entries {\n        bigint id PK\n        bigint suite_pr_id FK\n        varchar file_path\n        int line_number\n        varchar issue_type\n        clob canonical_body\n        boolean active\n    }\n\n    gradings {\n        bigint id PK\n        bigint comment_id\n        varchar verdict\n        varchar severity\n        int stars\n    }\n\n    comment_similarities {\n        bigint id PK\n        varchar strategy\n        double score\n        boolean is_match\n    }\n```\n\n## Getting Started\n\n### Prerequisites\n\n- **Java 21** (JDK)\n- **Maven 3.8+**\n- **Node.js 18+** and npm (for the React frontend)\n- **GitHub Personal Access Token** with `repo` scope\n- **Ollama** (optional, for AI-generated descriptions) -- install from [ollama.com](https://ollama.com)\n- **hitorro-gittools 3.0.0** in your local Maven repository\n\n### Build\n\n```bash\n# Build the backend\nmvn clean package -DskipTests\n\n# Install frontend dependencies\ncd react-app \u0026\u0026 npm install\n```\n\n### Run\n\nThe included `run.sh` starts both the backend and frontend dev server:\n\n```bash\n# Set your GitHub token\nexport GITHUB_TOKEN=ghp_your_token_here\n\n# Start both servers\n./run.sh\n```\n\nThis starts:\n- **Backend API** at `http://localhost:8090`\n- **React dev server** at `http://localhost:3001`\n- **Swagger UI** at `http://localhost:8090/swagger-ui.html`\n- **H2 Console** at `http://localhost:8090/h2-console`\n\nAlternatively, run each component separately:\n\n```bash\n# Backend only\njava -jar target/hitorro-pr-bench-1.0.0.jar\n\n# Frontend only (in react-app/)\nnpm run dev\n```\n\n## Configuration\n\n### application.yml\n\n| Property | Default | Description |\n|:---------|:--------|:------------|\n| `server.port` | `8090` | Backend HTTP port |\n| `spring.datasource.url` | `jdbc:h2:file:./data/prbench` | H2 database file path |\n| `app.github.token` | `${GITHUB_TOKEN:}` | GitHub PAT (env var or direct) |\n| `app.github.api-url` | `https://api.github.com` | GitHub API base URL |\n| `app.github.poll-interval-seconds` | `30` | Interval for polling bot completion |\n| `app.github.default-bot-timeout-seconds` | `600` | Default timeout waiting for a bot |\n| `app.workspace.base-path` | `~/.pr-bench/workspaces` | Local directory for git clones |\n| `app.run.default-concurrency` | `2` | Default parallel replay PRs per run |\n| `app.run.max-concurrency` | `10` | Maximum allowed concurrency |\n| `app.similarity.text-similarity-threshold` | `0.8` | Jaro-Winkler threshold for a match |\n| `app.similarity.winnowing-k` | `5` | Winnowing k-gram size |\n| `app.similarity.winnowing-w` | `4` | Winnowing window size |\n| `app.ollama.url` | `http://localhost:11434` | Ollama server URL |\n| `app.ollama.model` | `llama3.2` | Ollama model for description generation |\n\n### GitHub Token\n\nSet via environment variable (recommended) or directly in `application.yml`:\n\n```bash\nexport GITHUB_TOKEN=ghp_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\n```\n\nThe token needs `repo` scope for reading/writing repositories, creating PRs, and managing topics.\n\n### Ollama (Optional)\n\nFor AI-generated repository descriptions:\n\n```bash\n# Install Ollama\ncurl -fsSL https://ollama.com/install.sh | sh\n\n# Pull the default model\nollama pull llama3.2\n\n# Ollama runs on localhost:11434 by default -- no further config needed\n```\n\n## Feature Walkthrough\n\n### Repository Import and Management\n\n1. **Browse GitHub** -- navigate to Repositories, click \"Browse GitHub\" to see all repos accessible via your token (personal and org)\n2. **Import** -- import individual repos or use \"Import All\" for bulk import\n3. **Tag** -- add tags to repos; tags auto-sync to GitHub as Topics (visible on the repo page under \"About\")\n4. **Describe** -- write descriptions manually or click \"Generate Description\" to have Ollama scan the repo (README, file tree, commits, package manifest) and produce a summary\n5. **Scan Docs** -- discovers all markdown, rst, and adoc files in the repo tree\n6. **Report** -- generate a markdown report with summary table, per-repo details, doc links, and statistics\n\n### Benchmark Suite Setup\n\n1. **Create Exemplar Repo** -- register the GitHub repo containing the PRs you want to benchmark\n2. **Create Suite** -- name a benchmark suite and associate it with the exemplar repo\n3. **Add PRs** -- select merged PRs to include; the system records base/head SHAs, changed files, and metadata\n4. **Collect Original Comments** -- fetch human review comments from the original PRs for comparison\n\n### Bot Configuration\n\n1. **Create Bot** -- provide a name, GitHub Actions workflow YAML, and a wait strategy\n2. **Wait Strategies** -- `CHECKS` (wait for check runs to complete), `REVIEWS` (wait for a review to appear), or `BOTH` (wait for both)\n3. **Timeout** -- how long to wait before giving up (default 600 seconds)\n\n### Running a Benchmark\n\n1. **Create Run** -- select a suite, pick bots, set concurrency (max 10 parallel)\n2. **Execution** -- the orchestrator creates synthetic PRs in the mirror repo, injects each bot's workflow, pushes, opens PRs, then polls for completion\n3. **Monitor** -- the progress endpoint shows status counts (PENDING, CREATING_BRANCHES, WAITING_FOR_BOTS, COLLECTING_COMMENTS, COMPLETED, FAILED)\n4. **Cleanup** -- after analysis, clean up mirror branches and close synthetic PRs\n\n### Grading and Golden Dataset\n\n1. **Grading Queue** -- review ungraded bot comments one by one or in bulk\n2. **Verdicts** -- VALID (real issue found), INVALID (false positive), DUPLICATE, NEEDS_REVIEW\n3. **Severity and Stars** -- rate comment quality with severity levels and star ratings\n4. **Promote to Golden** -- promote validated comments to the golden dataset as ground truth\n5. **Export/Import** -- export golden dataset entries for sharing or backup\n\n### Similarity Analysis\n\nFour strategies compare bot comments against original human comments:\n\n| Strategy | Description |\n|:---------|:------------|\n| EXACT_MATCH | Normalized text is identical |\n| FILE_LINE | Same file path and line number |\n| NORMALIZED_TEXT | Jaro-Winkler similarity above threshold (default 0.8) |\n| WINNOWING | Jaccard similarity of Winnowing fingerprint hash sets |\n\nText normalization strips markdown, URLs, punctuation, and lowercases before comparison.\n\n### Reporting\n\n- **Run Report** -- per-bot totals, verdict breakdowns, grading stats\n- **Golden Comparison** -- precision, recall, F1 per bot against the golden dataset\n- **Significance Test** -- McNemar's chi-squared test (with continuity correction) between two runs\n- **Trend Charts** -- F1/precision/recall over time for a bot (rendered with Recharts)\n\n## API Reference\n\n### Setup (`/api/setup`)\n\n| Method | Path | Description |\n|:-------|:-----|:------------|\n| GET | `/status` | GitHub token status and connectivity |\n| POST | `/token` | Set GitHub token at runtime |\n| GET | `/rate-limit` | Current GitHub API rate limit |\n\n### Repositories (`/api/repos`)\n\n| Method | Path | Description |\n|:-------|:-----|:------------|\n| GET | `/` | List repos (filter: tag, search, language, owner, hasNotes, isFork) |\n| POST | `/` | Create repo by GitHub URL |\n| POST | `/import` | Import single repo from GitHub |\n| POST | `/import-all` | Bulk import repos |\n| GET | `/{id}` | Get repo by ID |\n| PUT | `/{id}` | Update repo |\n| DELETE | `/{id}` | Delete repo |\n| GET | `/{id}/github-status` | Compare local vs live GitHub state |\n| POST | `/{id}/sync-to-github` | Push description + tags to GitHub |\n| POST | `/{id}/tags` | Add tag (syncs to GitHub Topics) |\n| DELETE | `/{id}/tags/{tag}` | Remove tag |\n| POST | `/bulk-tag` | Add tag to multiple repos |\n| GET | `/meta/tags` | List all tags |\n| GET | `/meta/stats` | Faceted stats (by owner, language, tag, fork) |\n| POST | `/{id}/notes` | Set description (pushes to GitHub, max 350 chars) |\n| POST | `/{id}/generate-description` | AI-generate description via Ollama |\n| POST | `/meta/generate-descriptions` | Bulk AI-generate for all repos without descriptions |\n| GET | `/github/browse` | Browse GitHub repos accessible via token |\n| GET | `/github/orgs` | List user's GitHub organizations |\n| GET | `/github/orgs/{org}/repos` | List repos in an organization |\n| GET | `/{id}/prs` | List PRs from GitHub for a repo |\n| POST | `/{id}/scan-docs` | Scan repo for documentation files |\n| POST | `/meta/scan-docs` | Bulk scan repos for docs |\n| GET | `/{id}/docs` | Get scanned docs for a repo |\n| POST | `/meta/report` | Generate markdown report for selected repos |\n\n### Benchmark Suites (`/api/suites`)\n\n| Method | Path | Description |\n|:-------|:-----|:------------|\n| GET | `/` | List suites (filter: repoId) |\n| POST | `/` | Create suite |\n| GET | `/{id}` | Get suite |\n| DELETE | `/{id}` | Delete suite |\n| GET | `/{id}/prs` | List PRs in suite |\n| POST | `/{id}/prs` | Add PR to suite |\n| DELETE | `/{suiteId}/prs/{prId}` | Remove PR from suite |\n| POST | `/suite-prs/{id}/collect-original-comments` | Collect human comments from original PR |\n\n### Bots (`/api/bots`)\n\n| Method | Path | Description |\n|:-------|:-----|:------------|\n| GET | `/` | List all bots |\n| POST | `/` | Create bot (name, workflow YAML, wait strategy, timeout) |\n| GET | `/{id}` | Get bot |\n| PUT | `/{id}` | Update bot |\n| DELETE | `/{id}` | Delete bot |\n\n### Runs (`/api/runs`)\n\n| Method | Path | Description |\n|:-------|:-----|:------------|\n| GET | `/` | List runs (filter: suiteId) |\n| POST | `/` | Create and start run (suiteId, botIds, concurrency) |\n| GET | `/{id}` | Get run details |\n| GET | `/{id}/progress` | Status counts per replay PR |\n| POST | `/{id}/cancel` | Cancel a running benchmark |\n| GET | `/{id}/replay-prs` | List replay PRs for run |\n| GET | `/{id}/bot-snapshots` | Bot config snapshots taken at run start |\n| POST | `/{id}/cleanup` | Close mirror PRs and delete branches |\n| GET | `/replay-prs/{id}/comments` | Comments collected for a replay PR |\n| GET | `/{runId}/similarities` | Compute pairwise similarity analysis |\n\n### Grading (`/api`)\n\n| Method | Path | Description |\n|:-------|:-----|:------------|\n| POST | `/gradings` | Create grading (verdict, severity, stars, notes) |\n| PUT | `/gradings/{id}` | Update grading |\n| DELETE | `/gradings/{id}` | Delete grading |\n| GET | `/comments/{id}/gradings` | Get gradings for a comment |\n| POST | `/gradings/bulk` | Bulk grade multiple comments |\n| GET | `/grading-queue` | Ungraded comments queue |\n| GET | `/grading-progress` | Grading completion stats |\n\n### Golden Dataset (`/api/golden-dataset`)\n\n| Method | Path | Description |\n|:-------|:-----|:------------|\n| GET | `/` | List entries (filter: suiteId) |\n| POST | `/promote` | Promote a comment to golden dataset |\n| PUT | `/{id}` | Update entry |\n| DELETE | `/{id}` | Remove entry |\n| GET | `/export` | Export golden dataset as JSON |\n\n### Reports (`/api/reports`)\n\n| Method | Path | Description |\n|:-------|:-----|:------------|\n| GET | `/runs/{runId}` | Run report with per-bot stats |\n| GET | `/runs/{runId}/comparison` | Compare against golden dataset (P/R/F1) |\n| GET | `/bots/{botId}/trend` | F1 trend over recent runs |\n| GET | `/runs/{runAId}/significance` | McNemar's test between two runs |\n\n### Issue Types (`/api/issue-types`)\n\n| Method | Path | Description |\n|:-------|:-----|:------------|\n| GET | `/` | List all issue types |\n| GET | `/categories` | List issue type categories |\n| GET | `/{id}` | Get issue type |\n| GET | `/code/{code}` | Get issue type by code |\n| POST | `/` | Create issue type |\n| PUT | `/{id}` | Update issue type |\n| DELETE | `/{id}` | Delete issue type |\n\nPre-seeded issue types: NULL_DEREF, SQL_INJECTION, XSS, RESOURCE_LEAK, RACE_CONDITION, ERROR_HANDLING, PERFORMANCE, CODE_STYLE, NAMING, DEAD_CODE, COMPLEXITY, DOCUMENTATION.\n\n## Frontend Pages\n\nThe React SPA provides these pages via sidebar navigation:\n\n| Page | Route | Description |\n|:-----|:------|:------------|\n| Dashboard | `/` | Overview stats and recent activity |\n| Repositories | `/repos` | Browse, import, tag, describe, and manage repos |\n| Report \u0026 Docs | `/repo-report` | Generate markdown reports and browse scanned docs |\n| Suites | `/suites` | Create and manage benchmark suites |\n| Suite Detail | `/suites/:id` | View/add PRs in a suite, collect original comments |\n| Bots | `/bots` | Define AI review bots with workflow YAML |\n| Runs | `/runs` | Start benchmark runs, view status |\n| Run Detail | `/runs/:id` | Monitor replay PRs, view progress, trigger cleanup |\n| Replay PR Detail | `/replay-prs/:id` | View collected comments for a single replay PR |\n| Golden Dataset | `/golden-dataset` | Curate ground-truth entries, export/import |\n| Grading Queue | `/grading-queue` | Grade bot comments (VALID/INVALID/DUPLICATE) |\n| Run Report | `/reports/:runId` | Per-bot stats, verdict breakdowns, golden comparison |\n| Trends | `/trend` | F1/precision/recall charts over time (Recharts) |\n| Issue Types | `/issue-types` | Manage the issue type taxonomy |\n| Settings | `/settings` | GitHub token, Ollama status, app config |\n\n**Tech stack:** React 18, TypeScript, Vite 5, TanStack Query 5, Recharts 2, React Router 6.\n\n## Integration with hitorro-gittools\n\nThe `ReplayEngine` depends on `hitorro-gittools` (v3.0.0) for all git operations during PR replay:\n\n- **Clone** -- clones the mirror repository to the local workspace (`~/.pr-bench/workspaces/{org}/{repo}`)\n- **Fetch** -- fetches latest refs before creating branches\n- **Branch creation** -- creates base and head branches at the exact commit SHAs from the original PR\n- **Checkout** -- switches between branches during replay\n- **Push** -- pushes base and head branches to the mirror remote\n- **Raw git commands** -- uses `gitService.getRunner().runOrThrow()` to stage and commit injected workflow files\n\nThe `GitService` and `GitCredentials` classes from hitorro-gittools handle authentication using the GitHub PAT.\n\n## Testing\n\n```bash\n# Run all tests\nmvn test\n\n# Run with verbose output\nmvn test -Dtest.verbose=true\n```\n\nThe project uses:\n- **JUnit 5** (Jupiter) via `spring-boot-starter-test`\n- **Spring Boot Test** for integration testing with auto-configured H2\n- **Flyway** migrations run automatically in test context\n\n## Project Structure\n\n```\nhitorro-pr-bench/\n|-- pom.xml                              # Maven build (Spring Boot 3.2 parent)\n|-- run.sh                               # Start backend + frontend\n|-- src/\n|   |-- main/\n|   |   |-- java/com/hitorro/prbench/\n|   |   |   |-- PrBenchApplication.java  # Entry point (@EnableAsync, @EnableScheduling)\n|   |   |   |-- controller/\n|   |   |   |   |-- RepoController.java          # Repository management + GitHub browsing\n|   |   |   |   |-- SuiteController.java          # Benchmark suite CRUD + PR selection\n|   |   |   |   |-- BotController.java            # AI bot definitions\n|   |   |   |   |-- RunController.java            # Benchmark run lifecycle\n|   |   |   |   |-- GradingController.java        # Comment grading + queue\n|   |   |   |   |-- GoldenDatasetController.java  # Golden dataset management\n|   |   |   |   |-- ReportController.java         # Reporting endpoints\n|   |   |   |   |-- IssueTypeController.java      # Issue type taxonomy\n|   |   |   |   |-- SetupController.java          # Token + connectivity setup\n|   |   |   |-- service/\n|   |   |   |   |-- RunOrchestrator.java   # Async run execution with semaphore concurrency\n|   |   |   |   |-- ReplayEngine.java      # Git-based PR replay via hitorro-gittools\n|   |   |   |   |-- CommentCollector.java  # GitHub comment fetching + normalization\n|   |   |   |   |-- SimilarityService.java # Pairwise comment comparison (4 strategies)\n|   |   |   |   |-- ReportingService.java  # P/R/F1, McNemar's test, trend data\n|   |   |   |   |-- OllamaService.java     # LLM description generation\n|   |   |   |   |-- GitHubApiService.java  # GitHub REST API client\n|   |   |   |   |-- TextNormalizer.java    # Text normalization + Winnowing + Jaro-Winkler\n|   |   |   |-- entity/                    # JPA entities (13 tables)\n|   |   |   |-- repository/               # Spring Data JPA repositories\n|   |   |-- resources/\n|   |       |-- application.yml            # App configuration\n|   |       |-- db/migration/\n|   |           |-- V1__core_tables.sql    # Core schema (12 tables)\n|   |           |-- V2__webhook_events.sql # Webhooks, bot snapshots, issue types, schedules\n|   |-- test/\n|-- react-app/\n|   |-- package.json                       # React 18 + Vite 5 + TanStack Query\n|   |-- src/\n|   |   |-- App.tsx                        # Router + sidebar navigation\n|   |   |-- pages/                         # 15 page components\n|-- data/\n|   |-- prbench.mv.db                     # H2 database file (created at runtime)\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgeekychris%2Fhitorro-prbench","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgeekychris%2Fhitorro-prbench","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgeekychris%2Fhitorro-prbench/lists"}