https://github.com/cuzfrog/docent-mcp

Semantic + BM25 Document + Git history search MCP server

ai bm25 decision-records embedding-models git-history hybrid-search indexing local-ai mcp mcp-server search-engine


Host: GitHub
URL: https://github.com/cuzfrog/docent-mcp
Owner: cuzfrog
License: mit
Created: 2026-05-05T02:15:17.000Z (3 months ago)
Default Branch: main
Last Pushed: 2026-05-19T04:17:00.000Z (2 months ago)
Last Synced: 2026-05-19T06:56:33.904Z (2 months ago)
Topics: ai, bm25, decision-records, embedding-models, git-history, hybrid-search, indexing, local-ai, mcp, mcp-server, search-engine
Language: Rust
Homepage:
Size: 691 KB
Stars: 1
Watchers: 0
Forks: 1
Open Issues: 2
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE

README

[![SafeSkill 92/100](https://img.shields.io/badge/SafeSkill-92%2F100_Verified%20Safe-brightgreen)](https://safeskill.dev/scan/cuzfrog-docent-mcp)

# docent

**Semantic + BM25 Document & Git history search for Design Decision Records** — an experimental MCP server written in Rust that indexes documents and git history, letting agents query *why* code looks the way it does.

```
  files/git ──▼── index ──▶  MCP server  ◀──── query
                 (cache)        (HTTP)
```

## Quick Start

```sh
docent init              # generate docent.toml config
docent index [path]      # index .md files + git history
docent serve             # start MCP server on port 7878
```

Open [http://localhost:7878](http://localhost:7878) for the built-in Web UI.
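Once `docent serve` is running, an agent reaches the `search_ddr` tool through MCP's JSON-RPC 2.0 `tools/call` method. The sketch below only builds the request body, using the standard library alone. It rests on assumptions: the argument name `query` is a guess at the tool's input schema, and the HTTP endpoint path is not documented here, so sending the request is left out. A real client would also complete the MCP `initialize` handshake before calling any tool.

```rust
/// Escape a string so it can sit inside a JSON string literal.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters use the \uXXXX form.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Build a JSON-RPC 2.0 `tools/call` request body for the `search_ddr` tool.
/// ASSUMPTION: the tool accepts one string argument named `query`;
/// check the schema returned by `tools/list` before relying on this.
fn search_ddr_request(id: u64, query: &str) -> String {
    format!(
        r#"{{"jsonrpc":"2.0","id":{},"method":"tools/call","params":{{"name":"search_ddr","arguments":{{"query":"{}"}}}}}}"#,
        id,
        json_escape(query)
    )
}

fn main() {
    // Print the body an MCP client would POST to the server.
    println!("{}", search_ddr_request(1, "why did we choose BM25?"));
}
```

In practice an MCP client library or the agent host builds this message for you; the sketch is only meant to show what travels over the wire.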

## Usage

| Command | Description |
|---|---|
| `docent init` | Generate a `docent.toml` config file |
| `docent index [dir]` | Index both file and git sources (default: current dir) |
| `docent index-file ` | Index specific files/directories |
| `docent index-git ` | Index git history from a repository |
| `docent serve` | Start the MCP server (streamable HTTP) |
| `docent list-models` | List supported embedding models |

Flags: `--config ` (default `./docent.toml`), `--rebuild` (full re-index), `--verbose`.

## How It Works

1. **Sources** — Reads markdown files and git commit history

2. **Section-aware chunking** — Splits documents into chunks, preserving heading structure

3. **Embedding** — Converts chunks to vectors via `fastembed` (configurable model)

4. **Index cache** — Persists vectors and metadata to disk

5. **Semantic + BM25 search** — Hybrid scoring with configurable algorithm

6. **MCP server** — Exposes `search_ddr` tool over streamable HTTP
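Step 5 combines a semantic ranking with a BM25 ranking. The README says the algorithm is configurable but does not name it, so the sketch below uses reciprocal rank fusion (RRF) purely as one common way to merge two ranked lists. It is not necessarily what docent does. The chunk ids and the constant `k = 60` are hypothetical illustration values.

```rust
use std::collections::HashMap;

/// Reciprocal rank fusion over several ranked lists of chunk ids.
/// Each list is ordered best-first; a larger `k` flattens the gap between ranks.
/// ASSUMPTION: RRF stands in for docent's unspecified hybrid scoring algorithm.
fn reciprocal_rank_fusion(rankings: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for ranking in rankings {
        for (i, id) in ranking.iter().enumerate() {
            // Ranks are 1-based, so the best hit contributes 1 / (k + 1).
            let rank = i as f64 + 1.0;
            *scores.entry((*id).to_string()).or_insert(0.0) += 1.0 / (k + rank);
        }
    }
    let mut fused: Vec<(String, f64)> = scores.into_iter().collect();
    // Sort by descending score; break ties by id so the output is deterministic.
    fused.sort_by(|a, b| {
        b.1.partial_cmp(&a.1)
            .unwrap()
            .then_with(|| a.0.cmp(&b.0))
    });
    fused
}

fn main() {
    // Hypothetical chunk ids: one list from vector similarity, one from BM25.
    let semantic = vec!["adr-007#context", "adr-003#decision", "commit-9f2c"];
    let bm25 = vec!["adr-003#decision", "commit-9f2c", "adr-011#status"];
    for (id, score) in reciprocal_rank_fusion(&[semantic, bm25], 60.0) {
        println!("{score:.4}  {id}");
    }
}
```

RRF needs only ranks, not raw scores, which sidesteps the problem that cosine similarity and BM25 live on different scales. A weighted sum of normalized scores would be another reasonable choice.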

## Install

TBC

## Documentation

- [Development Guide](doc/Development.md)
