{"id":47709949,"url":"https://github.com/automationpi/mergelore","last_synced_at":"2026-04-02T18:29:28.676Z","repository":{"id":347883432,"uuid":"1194828249","full_name":"automationpi/mergelore","owner":"automationpi","description":"AI tools write code fast, but they have zero memory of your team's history. mergelore sits in your CI pipeline and surfaces past architectural decisions, reversed patterns, and constraint violations before they ship again. No new infrastructure required,  just add it to your workflow.","archived":false,"fork":false,"pushed_at":"2026-03-29T22:06:47.000Z","size":575,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-03-30T00:28:41.374Z","etag":null,"topics":["ai","code-review","developer-tools","github-actions","institutional-memory","llm","pr-review","qdrant"],"latest_commit_sha":null,"homepage":null,"language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/automationpi.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":".github/CODEOWNERS","security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-03-28T21:36:25.000Z","updated_at":"2026-03-29T22:06:41.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/automationpi/mergelore","commit_stats":null,"previous_names":["automationpi/mergelore"],"tags_count":2,"template":false,"template_full_name":null,"purl":"pkg:github/automationpi/mergelore","repository_url":"htt
ps://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/automationpi%2Fmergelore","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/automationpi%2Fmergelore/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/automationpi%2Fmergelore/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/automationpi%2Fmergelore/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/automationpi","download_url":"https://codeload.github.com/automationpi/mergelore/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/automationpi%2Fmergelore/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31312902,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-02T12:59:32.332Z","status":"ssl_error","status_checked_at":"2026-04-02T12:54:48.875Z","response_time":89,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","code-review","developer-tools","github-actions","institutional-memory","llm","pr-review","qdrant"],"created_at":"2026-04-02T18:29:28.090Z","updated_at":"2026-04-02T18:29:28.663Z","avatar_url":"https://github.com/automationpi.png","language":"TypeScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cp align=\"center\"\u003e\n  
\u003cimg src=\"logo.svg\" alt=\"mergelore\" width=\"120\" /\u003e\n\u003c/p\u003e\n\n\u003ch1 align=\"center\"\u003emergelore\u003c/h1\u003e\n\n\u003cp align=\"center\"\u003e\u003cstrong\u003eGive your PRs the institutional memory that AI-generated code lacks.\u003c/strong\u003e\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003ca href=\"https://github.com/automationpi/mergelore/actions/workflows/ci.yml\"\u003e\u003cimg src=\"https://github.com/automationpi/mergelore/actions/workflows/ci.yml/badge.svg\" alt=\"CI\"\u003e\u003c/a\u003e\n  \u003ca href=\"https://github.com/automationpi/mergelore/releases\"\u003e\u003cimg src=\"https://img.shields.io/github/v/release/automationpi/mergelore\" alt=\"Release\"\u003e\u003c/a\u003e\n  \u003ca href=\"https://github.com/automationpi/mergelore/blob/main/LICENSE\"\u003e\u003cimg src=\"https://img.shields.io/github/license/automationpi/mergelore\" alt=\"License\"\u003e\u003c/a\u003e\n  \u003ca href=\"https://hub.docker.com/r/automationpi/mergelore-action\"\u003e\u003cimg src=\"https://img.shields.io/docker/pulls/automationpi/mergelore-action\" alt=\"Docker Pulls\"\u003e\u003c/a\u003e\n  \u003ca href=\"https://ghcr.io/automationpi/mergelore-action\"\u003e\u003cimg src=\"https://img.shields.io/badge/GHCR-mergelore--action-blue?logo=github\" alt=\"GHCR\"\u003e\u003c/a\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"banner.svg\" alt=\"mergelore - AI writes code, mergelore remembers why it was changed before\" width=\"900\" /\u003e\n\u003c/p\u003e\n\nmergelore is a GitHub Action that reviews pull requests against your team's history. When a PR re-introduces a pattern you already removed, reverses an architectural decision, or violates a constraint that was load-tested and hardcoded - mergelore flags it before it ships.\n\n## The problem\n\nAI coding tools (Claude Code, Copilot, Cursor, Codex) generate code that looks correct in isolation. 
But they have zero awareness of your team's history - the decisions made in past PRs, the patterns deliberately removed, the limits that were hardcoded for a reason.\n\nEvery existing review tool analyzes the **present diff**. None of them remember *why* past decisions were made.\n\nmergelore bridges that gap.\n\n## How it works\n\n```mermaid\nflowchart LR\n    A[\"PR opened\"] --\u003e B[\"Extract diff\"]\n    B --\u003e C[\"Query past decisions\\n(memory provider)\"]\n    C --\u003e D[\"LLM analyzes\\ndiff + context\"]\n    D --\u003e E[\"Post findings\\nwith confidence scores\"]\n\n    style A fill:#1e293b,stroke:#34D399,color:#E2E8F0\n    style B fill:#1e293b,stroke:#64748B,color:#E2E8F0\n    style C fill:#1e293b,stroke:#7B61FF,color:#E2E8F0\n    style D fill:#1e293b,stroke:#7B61FF,color:#E2E8F0\n    style E fill:#1e293b,stroke:#34D399,color:#E2E8F0\n```\n\nmergelore runs entirely in your CI pipeline. No data leaves your GitHub Actions runner except API calls to the LLM provider you choose.\n\n### What it catches\n\nWhen a PR conflicts with a past decision, mergelore posts a finding directly on the PR with a confidence score, a plain-language explanation of what was decided before, and a link to the original PR. Every finding includes slash commands so the author can acknowledge, override with a reason, or mark the current PR as the new decision.\n\nHere are real scenarios that happen in every growing codebase:\n\n---\n\n**Backend: Connection pool limit re-introduced after outage**\n\nA new engineer adds a Redis connection pool with `maxConnections: 100`. Looks reasonable. But three months ago, your team debugged a production outage caused by exactly that - Redis was exhausting file descriptors at 100 connections. 
PR #287 reduced it to 20 with a detailed post-mortem.\n\n```\nConnection pool limit conflicts with post-mortem decision - confidence: 0.92\n\u003e PR #287 reduced Redis maxConnections from 100 to 20 after a production\n\u003e outage where the pool exhausted file descriptors under load. The post-mortem\n\u003e explicitly recommended keeping this value low.\n\u003e This PR sets maxConnections back to 100, directly reversing that decision.\n\u003e Source: PR #287 - 2025-11-14\n```\n\n---\n\n**Frontend: Removed accessibility pattern brought back**\n\nCopilot generates a custom dropdown using `\u003cdiv onClick\u003e`. Clean code, works fine. But your team spent two sprints replacing every custom dropdown with `\u003cselect\u003e` and ARIA attributes after a WCAG audit. PR #156 documented the policy.\n\n```\nCustom dropdown conflicts with accessibility policy - confidence: 0.88\n\u003e PR #156 replaced all custom div-based dropdowns with native \u003cselect\u003e\n\u003e elements and ARIA roles to meet WCAG 2.1 AA compliance. The review\n\u003e comments note that custom dropdowns failed screen reader testing.\n\u003e This PR introduces a new div-based dropdown component in UserSettings.\n\u003e Source: PR #156 - 2025-08-22\n```\n\n---\n\n**Database: Migration pattern that already caused downtime**\n\nA PR adds `ALTER TABLE users ADD COLUMN preferences JSONB DEFAULT '{}'`. Looks standard. But PR #412 tried the same thing and was reverted - adding a default value on a 50M row table locked it for 8 minutes in production.\n\n```\nColumn default on large table conflicts with migration policy - confidence: 0.90\n\u003e PR #412 attempted to add a JSONB column with a DEFAULT value to the users\n\u003e table (50M+ rows). 
The migration locked the table for 8 minutes in production.\n\u003e PR #415 documented the policy: large tables must use ADD COLUMN without\n\u003e DEFAULT, then backfill in batches.\n\u003e This PR adds a DEFAULT value to the same table.\n\u003e Source: PR #415 - 2026-01-05\n```\n\n---\n\n**API: Rate limit deliberately hardcoded after incident**\n\nClaude Code generates an API endpoint with configurable rate limiting via environment variable. Flexible, good practice - normally. But your team hardcoded the rate limit to 100 req/s after an incident where a misconfigured env var allowed 10,000 req/s and took down the billing service.\n\n```\nConfigurable rate limit conflicts with hardcoding decision - confidence: 0.85\n\u003e PR #203 hardcoded the API rate limit to 100 req/s after an incident where\n\u003e a misconfigured RATE_LIMIT env var allowed 10,000 req/s and overwhelmed\n\u003e the downstream billing service. The PR review explicitly decided against\n\u003e making this configurable.\n\u003e This PR makes the rate limit configurable via environment variable.\n\u003e Source: PR #203 - 2025-09-30\n```\n\n---\n\nYour reviewers are good. But they're reviewing 3x more PRs than last year - most of them AI-generated - across dozens of files they didn't write. No human can hold six months of architectural decisions in their head while reviewing their 12th PR of the day.\n\nmergelore doesn't replace your reviewers. It gives them the context they'd have if they personally reviewed every PR in the repo's history. The reviewer focuses on code quality. 
mergelore watches for decisions being silently undone.\n\nFindings are **suggestions, not blocks** - unless you explicitly opt into `block-on-critical`.\n\n\u003e See the [full quickstart guide](https://gist.github.com/automationpi/2160dd02ea6b4628e7658fcb4bd816e5) for detailed setup instructions, architecture deep-dive, and more examples.\n\n## Quickstart\n\nAdd this to `.github/workflows/mergelore.yml`:\n\n```yaml\nname: mergelore\non:\n  pull_request:\n    types: [opened, synchronize, reopened]\n  issue_comment:\n    types: [created]\n\npermissions:\n  pull-requests: write\n  contents: read\n\njobs:\n  review:\n    runs-on: ubuntu-latest\n    steps:\n      - uses: actions/checkout@v4\n      - uses: automationpi/mergelore/action@v0.2.1\n        with:\n          anthropic-api-key: ${{ secrets.ANTHROPIC_API_KEY }}\n```\n\nThat's it. Add your `ANTHROPIC_API_KEY` as a repo secret and every PR gets reviewed.\n\n## Configuration\n\n| Input | Default | Description |\n|-------|---------|-------------|\n| `anthropic-api-key` | - | Anthropic API key (required for `claude` provider) |\n| `openai-api-key` | - | OpenAI API key (required for `openai` provider) |\n| `llm-provider` | `claude` | LLM to use for analysis: `claude` or `openai` |\n| `memory-provider` | `git-native` | How to retrieve past decisions: `none`, `git-native`, or `qdrant` |\n| `history-depth` | `20` | Number of past merged PRs to search |\n| `confidence-threshold` | `0.7` | Minimum confidence (0.0-1.0) to post a finding |\n| `block-on-critical` | `false` | Fail CI on critical findings |\n| `timeout` | `45000` | Timeout in ms for LLM and memory queries |\n\n### Using OpenAI instead of Claude\n\n```yaml\n- uses: automationpi/mergelore/action@v0.2.1\n  with:\n    llm-provider: openai\n    openai-api-key: ${{ secrets.OPENAI_API_KEY }}\n```\n\n### Diff-only mode (no history)\n\nIf you just want LLM-powered diff review without historical context:\n\n```yaml\n- uses: automationpi/mergelore/action@v0.2.1\n  with:\n    
anthropic-api-key: ${{ secrets.ANTHROPIC_API_KEY }}\n    memory-provider: none\n```\n\n### Blocking on critical findings\n\n```yaml\n- uses: automationpi/mergelore/action@v0.2.1\n  with:\n    anthropic-api-key: ${{ secrets.ANTHROPIC_API_KEY }}\n    block-on-critical: true\n```\n\n## Memory providers\n\nmergelore has a plugin architecture. You pick how it remembers your team's decisions.\n\n### `none` - No memory (Tier 0)\n\nThe LLM reviews the diff on its own merits. No historical context. Good for trying mergelore out or for small repos where past decisions are few.\n\n### `git-native` - GitHub API (Tier 1, default)\n\nFetches your repo's recently merged PRs via the GitHub API. Scores them by file overlap with the current diff and sends the most relevant ones as context to the LLM. No extra infrastructure needed - it uses the `GITHUB_TOKEN` already available in Actions.\n\nThis is the sweet spot for most teams.\n\n### `qdrant` - Vector search (Tier 2)\n\nSearch over your full indexed PR history using Qdrant. The separate **indexer** service embeds merged PRs into Qdrant after each merge. 
The action queries Qdrant by file path overlap - no embedding runs in CI.\n\n**Step 1:** Add the indexer workflow (runs on merge to main):\n\n```yaml\nname: mergelore-index\non:\n  push:\n    branches: [main]\npermissions:\n  contents: read\n  pull-requests: read\njobs:\n  index:\n    runs-on: ubuntu-latest\n    steps:\n      - uses: automationpi/mergelore/indexer@v0.2.1\n        with:\n          qdrant-url: ${{ secrets.QDRANT_URL }}\n        env:\n          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}\n          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}\n          QDRANT_API_KEY: ${{ secrets.QDRANT_API_KEY }}\n```\n\n**Step 2:** Update your mergelore review workflow:\n\n```yaml\n- uses: automationpi/mergelore@v0.2.1\n  with:\n    anthropic-api-key: ${{ secrets.ANTHROPIC_API_KEY }}\n    memory-provider: qdrant\n    vector-store-url: ${{ secrets.QDRANT_URL }}\n  env:\n    QDRANT_API_KEY: ${{ secrets.QDRANT_API_KEY }}\n```\n\nIf Qdrant is unreachable, mergelore automatically falls back to `git-native` with a warning.\n\n#### Supported embedding models\n\n| Model | Provider | Dimensions | Cost |\n|-------|----------|-----------|------|\n| `text-embedding-3-small` | OpenAI | 1536 | Low |\n| `nomic-embed-text` | Local (CPU) | 768 | Free |\n| `embed-english-v3.0` | Cohere | 1024 | Low |\n\nSet via the `embed-model` input on the indexer action (default: `text-embedding-3-small`).\n\n#### Image signing\n\nAll release images are signed with cosign keyless signing (Sigstore). Verify with:\n\n```bash\ncosign verify ghcr.io/automationpi/mergelore-action:v0.2.1\ncosign verify ghcr.io/automationpi/mergelore-indexer:v0.2.1\n```\n\n## Human-in-the-loop\n\nmergelore is an advisory tool, not an autonomous blocker. 
Every finding comes with slash commands:\n\n| Command | What it does |\n|---------|-------------|\n| `/mergelore-acknowledge` | \"I saw this, I understand, continuing anyway\" |\n| `/mergelore-override [reason]` | \"I disagree - here's why\" (reason is stored as a decision record) |\n| `/mergelore-update-record` | \"This PR IS the new decision - update the memory after merge\" |\n\n**Override decisions are written back as data.** When someone overrides a finding with a reason, that reason becomes part of the institutional memory. Over time, mergelore learns what your team's actual standards are.\n\n## Architecture\n\n```\naction/\n  src/\n    index.ts                    # Main runner - thin orchestrator\n    types.ts                    # Shared types (Diff, Context, Finding)\n    diff.ts                     # PR diff extraction via GitHub API\n    comment.ts                  # PR comment formatter (locked format)\n    config.ts                   # Action input parser + validation\n    logger.ts                   # Structured JSON logging\n    handlers/\n      slash-commands.ts         # HITL command handler\n    providers/\n      llm/\n        prompts.ts              # Shared prompts + Finding schema\n        claude.ts               # Anthropic API (structured via tool_use)\n        openai.ts               # OpenAI API (structured via json_schema)\n        factory.ts              # Provider factory\n      memory/\n        none.ts                 # Tier 0: no memory\n        git-native.ts           # Tier 1: GitHub API\n        qdrant.ts               # Tier 2: Qdrant vector store\n        factory.ts              # Provider factory\nindexer/                          # Async embedding service (Python 3.12)\n  src/mergelore_indexer/\n    main.py                       # CLI entry point\n    extract.py                    # PR content extraction\n    chunk.py                      # 512-token chunking with 64-token overlap\n    embed/                        # Pluggable 
embedding providers\n      openai_embed.py             # text-embedding-3-small\n      nomic.py                    # nomic-embed-text (local)\n      cohere_embed.py             # embed-english-v3.0\n    store/\n      qdrant_store.py             # Qdrant client wrapper\n```\n\n### Plugin interfaces\n\nEverything plugs into two interfaces:\n\n```typescript\ninterface LLMProvider {\n  analyze(diff: Diff, context: Context[]): Promise\u003cFinding[]\u003e\n}\n\ninterface MemoryProvider {\n  index(pr: MergedPR): Promise\u003cvoid\u003e\n  query(diff: Diff): Promise\u003cContext[]\u003e\n}\n```\n\nAdding a new LLM or memory provider means implementing one interface and adding a case to the factory. That's it.\n\n### Design principles\n\n- **All LLM calls use structured output.** No free text parsing. Claude uses `tool_use`, OpenAI uses `json_schema`.\n- **Confidence scores are mandatory.** Every finding has a score between 0.0 and 1.0. Below the threshold, it doesn't get posted.\n- **Providers are stateless.** No singleton state, no class-level caching. Every invocation is independent.\n- **Graceful degradation everywhere.** If the LLM API is down, mergelore posts a warning - it never breaks your CI.\n- **Findings are suggestions.** mergelore never blocks a PR unless you explicitly set `block-on-critical: true`.\n\n## Where your data lives\n\n\u003e **mergelore never stores your code on any third-party server.**\n\n| Tier | Data location | Who controls it |\n|------|--------------|-----------------|\n| Tier 0/1 (git-native) | No storage - queries GitHub API on the fly | GitHub (your existing infra) |\n| Tier 2 (Qdrant Cloud) | Your Qdrant Cloud cluster | You (free 1GB tier available) |\n| Tier 2 (self-hosted) | Your own Qdrant via Docker or Kubernetes | You (fully air-gapped) |\n\n- The action runs entirely on your GitHub Actions runner. 
The only external calls are to the LLM provider (Claude or OpenAI) for analysis.\n- The indexer writes embeddings to your Qdrant instance. No data passes through mergelore's infrastructure. There is no mergelore backend.\n- No telemetry. No call-home. No usage tracking. MIT licensed, fully auditable.\n- For regulated industries (pharma, finance, healthcare): self-host Qdrant in your own VPC, use `nomic-embed-text` for local CPU embedding (no API calls), and every byte stays inside your network.\n\n## Development\n\n```bash\ncd action\nnpm install\nnpm run typecheck    # TypeScript strict mode\nnpm test             # 41 unit tests\nnpm run build        # esbuild single-file bundle → dist/index.js\n```\n\n### Running tests\n\n```bash\nnpm test                    # unit tests only\nnpm run test:integration    # integration tests (needs API keys)\n```\n\n### Docker\n\n```bash\ndocker build -t mergelore/action action/\ndocker images mergelore/action --format '{{.Size}}'  # target: \u003c50MB\n```\n\n## Roadmap\n\n| Version | What ships |\n|---------|-----------|\n| v0.1.0 | Core action, Claude + OpenAI providers, git-native memory, HITL slash commands |\n| **v0.2.1** | Qdrant memory provider, Python indexer, Trivy scanning, cosign image signing |\n| v0.3.0 | Helm chart, webhook receiver, pgvector support |\n\n## License\n\nMIT\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fautomationpi%2Fmergelore","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fautomationpi%2Fmergelore","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fautomationpi%2Fmergelore/lists"}