https://github.com/hallengray/rag-forge

Production-grade RAG pipelines with evaluation baked in
https://github.com/hallengray/rag-forge

cli embeddings llm llm-evaluation mcp observability python rag rag-evaluation rag-pipeline ragas retrieval-augmented-generation vector-database

Last synced: 3 months ago
JSON representation

Production-grade RAG pipelines with evaluation baked in

Host: GitHub
URL: https://github.com/hallengray/rag-forge
Owner: hallengray
License: mit
Created: 2026-04-11T13:56:02.000Z (3 months ago)
Default Branch: main
Last Pushed: 2026-04-16T13:35:46.000Z (3 months ago)
Last Synced: 2026-04-17T02:03:40.707Z (3 months ago)
Topics: cli, embeddings, llm, llm-evaluation, mcp, observability, python, rag, rag-evaluation, rag-pipeline, ragas, retrieval-augmented-generation, vector-database
Language: Python
Homepage: https://rag-forge-site.vercel.app/
Size: 1.42 MB
Stars: 3
Watchers: 0
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- Contributing: CONTRIBUTING.md
- Funding: .github/FUNDING.yml
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
- Codeowners: .github/CODEOWNERS
- Security: SECURITY.md

Awesome Lists containing this project

README

          


# RAG-Forge

**Production-grade RAG pipelines with evaluation baked in — not bolted on after deployment.**

[![npm version](https://img.shields.io/npm/v/@rag-forge/cli?label=%40rag-forge%2Fcli)](https://www.npmjs.com/package/@rag-forge/cli)

[![PyPI version](https://img.shields.io/pypi/v/rag-forge-core?label=rag-forge-core)](https://pypi.org/project/rag-forge-core/)

[![CI](https://img.shields.io/github/actions/workflow/status/hallengray/rag-forge/ci.yml?branch=main)](https://github.com/hallengray/rag-forge/actions)

[![License: MIT](https://img.shields.io/github/license/hallengray/rag-forge)](./LICENSE)

[![Discussions](https://img.shields.io/github/discussions/hallengray/rag-forge)](https://github.com/hallengray/rag-forge/discussions)

[Docs](https://rag-forge-docs.vercel.app/) · [Website](https://rag-forge-site.vercel.app/) · [Discussions](https://github.com/hallengray/rag-forge/discussions) · [Changelog](./docs/release-notes)



---

## Why RAG-Forge?

Most RAG projects ship without evaluation, and most evaluation libraries don't help you build the pipeline. Few tools score maturity end-to-end — so teams often don't know if they're at "a demo that sometimes works" or "a system you can put in front of customers."

- **Building a RAG pipeline is easy. Knowing whether it works is hard.** RAG-Forge closes that loop.

- **Eval is a first-class citizen, not an afterthought.** Every template ships with a golden set and an audit gate.

- **The RAG Maturity Model (RMM-0 → RMM-5)** gives you a concrete scorecard for any RAG system — yours or someone else's.

RAG-Forge is one of the few toolkits that scaffolds production-ready RAG pipelines, runs continuous evaluation as a CI/CD gate, and scores any existing system against a published maturity model — all in one CLI.

---

## RAG Maturity Model

The RMM is the scoring framework at the heart of RAG-Forge. Run `rag-forge assess` on any audit report to see where your system sits.

| Level | Name              | Exit Criteria                                              |

|-------|-------------------|------------------------------------------------------------|

| RMM-0 | Naive             | Basic vector search works                                  |

| RMM-1 | Better Recall     | Hybrid search, Recall@5 > 70%                              |

| RMM-2 | Better Precision  | Reranker active, nDCG@10 +10%                              |

| RMM-3 | Better Trust      | Guardrails, faithfulness > 85%                             |

| RMM-4 | Better Workflow   | Caching, P95 < 4s, cost tracking                           |

| RMM-5 | Enterprise        | Drift detection, CI/CD gates, adversarial tests            |

---

## Quick Start

```bash

npm install -g @rag-forge/cli

# Scaffold a project (use --directory to name the folder)

rag-forge init basic --directory my-rag-project

cd my-rag-project

# Drop your documents into a folder of your choice (or use the example below)

mkdir docs

echo "RAG-Forge is a CLI for building and evaluating RAG pipelines." > docs/example.md

rag-forge index --source ./docs

rag-forge audit --golden-set eval/golden_set.json

rag-forge assess --audit-report reports/audit-report.json

```

From empty directory to a scored RAG system with a golden set and an audit report — in under a minute.

---

## Installation

**CLI (Node.js 20+):**

```bash

npm install -g @rag-forge/cli

```

**Python packages (Python 3.11+):**

```bash

pip install rag-forge-core rag-forge-evaluator rag-forge-observability

```

---

## Templates

| Template     | Use Case                                             |

|--------------|------------------------------------------------------|

| `basic`      | First RAG project, simple Q&A                        |

| `hybrid`     | Production-ready document Q&A with reranking         |

| `agentic`    | Multi-hop reasoning with query decomposition         |

| `enterprise` | Regulated industries with full security suite        |

| `n8n`        | AI automation agency deployments                     |

Templates generate editable source code in your project — not framework dependencies. Fork the code, not the abstraction.

---

## Commands

| Category         | Commands                                                             |

|------------------|----------------------------------------------------------------------|

| **Scaffolding**  | `init`, `add`                                                        |

| **Ingestion**    | `parse`, `chunk`, `index`                                            |

| **Query**        | `query`, `inspect`                                                   |

| **Evaluation**   | `audit`, `assess`, `golden add`, `golden validate`                   |

| **Operations**   | `report`, `cache stats`, `drift report`, `cost`                      |

| **Security**     | `guardrails test`, `guardrails scan-pii`                             |

| **Integration**  | `serve --mcp`, `n8n export`                                          |

Run `rag-forge --help` for the full command reference.

---

## How RAG-Forge compares

There are great tools in this space. Here's an honest look at where each fits.

| Capability                        | RAG-Forge | RAGAS  | LangChain Eval | Giskard |

|-----------------------------------|:---------:|:------:|:--------------:|:-------:|

| Scaffolds a RAG pipeline          |     ✓     |   —    |       —        |    —    |

| Evaluation metrics                |     ✓     |   ✓    |       ✓        |    ✓    |

| Maturity scoring (RMM-0 → 5)      |     ✓     |   —    |       —        |    —    |

| CI gate workflow (audit action)   |     ✓     |   —    |    partial     | partial |

| MCP server                        |     ✓     |   —    |       —        |    —    |

| Guardrails / PII scanning         |     ✓     |   —    |    partial     |    ✓    |

| Drift detection                   |     ✓     |   —    |       —        | partial |

| Multi-language (TS + Python)      |     ✓     |   —    |       ✓        |    —    |

| Framework-agnostic                |     ✓     |   ✓    |       —        |    ✓    |

**Peer strengths worth knowing:**

- **RAGAS** has deeper metric research and a large community. RAG-Forge's evaluator supports RAGAS as a backend — run `rag-forge audit --evaluator ragas` to use it directly.

- **LangChain Eval** has the broadest ecosystem of integrations if you're already invested in LangChain.

- **Giskard** has a strong general-purpose ML testing story beyond RAG.

Pick the tool that matches your stage. RAG-Forge's wedge is the full lifecycle — scaffold → evaluate → score → ship — in one CLI, with the RMM as the objective function.

---

## Architecture

RAG-Forge is a polyglot monorepo. The CLI and MCP server are TypeScript; all RAG logic is Python. The CLI delegates to Python via a subprocess bridge so the two halves can be developed and versioned independently.

```text

rag-forge/

├── packages/

│   ├── cli/              TypeScript — Commander.js CLI (rag-forge command)

│   ├── mcp/              TypeScript — MCP server (@modelcontextprotocol/sdk)

│   ├── core/             Python    — RAG pipeline primitives

│   ├── evaluator/        Python    — RAGAS + DeepEval + LLM-as-Judge

│   └── observability/    Python    — OpenTelemetry + Langfuse

├── templates/            Project templates (basic, hybrid, agentic, enterprise, n8n)

└── apps/site/            Docs and marketing site (Next.js, deployed to Vercel)

```

See [docs/architecture.md](./docs/architecture.md) for a deeper dive.

---

## Docs & Community

- 📚 **Docs:** https://rag-forge-docs.vercel.app/

- 🌐 **Website:** https://rag-forge-site.vercel.app/

- 💬 **Discussions:** https://github.com/hallengray/rag-forge/discussions

- 🔒 **Security:** see [SECURITY.md](./SECURITY.md)

- 📝 **Changelog:** [docs/release-notes](./docs/release-notes)

---

## Contributing

See [CONTRIBUTING.md](./CONTRIBUTING.md) for development setup and contribution guidelines. All contributors are expected to follow our [Code of Conduct](./CODE_OF_CONDUCT.md).

---

## License

MIT — see [LICENSE](./LICENSE)

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/hallengray/rag-forge

Awesome Lists containing this project

README