https://github.com/hannasdev/mcp-document-reader

MCP server for reading large PDF documents — paginated reads, full-text search, and page-aware navigation for AI assistants.

# mcp-document-reader

MCP server for AI-assisted PDF reading with pagination, search, and page navigation.

It is designed to let AI assistants read large documents without reloading the context window from page 1 on every request. The model can work through a book in chunks, jump to any section by keyword, and pick up exactly where it left off.

## Tools

| Tool | Description |
|---|---|
| `list_documents` | List all documents in the configured directory |
| `get_document_info` | Get page count and file size without extracting text |
| `read_document` | Read a page range (default: 50-page chunks) |
| `search_document` | Keyword search — returns page numbers and context snippets |
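To make the "page numbers and context snippets" idea behind `search_document` concrete, here is a minimal sketch of a keyword search over per-page text. It is an illustration only, not the server's implementation: the function name `searchPages`, the `pages` array shape, and the `contextChars` parameter are all hypothetical.

```javascript
// Hypothetical helper, not part of mcp-document-reader.
// `pages` is assumed to be an array of page texts, where index 0 is page 1.
function searchPages(pages, keyword, contextChars = 40) {
  const needle = keyword.toLowerCase();
  const hits = [];
  if (needle === "") return hits; // an empty keyword matches nothing

  pages.forEach((text, index) => {
    // Case-insensitive match; assumes lowercasing keeps string length
    // unchanged, which holds for ASCII text.
    const haystack = text.toLowerCase();
    let pos = haystack.indexOf(needle);
    while (pos !== -1) {
      // Cut a window of surrounding characters around the match.
      const start = Math.max(0, pos - contextChars);
      const end = Math.min(text.length, pos + needle.length + contextChars);
      hits.push({ page: index + 1, snippet: text.slice(start, end).trim() });
      pos = haystack.indexOf(needle, pos + needle.length);
    }
  });
  return hits;
}
```

Each hit carries a 1-based page number, so a caller could feed it straight into a follow-up page-range read.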

Every `read_document` response includes a `[Pages X–Y of Z total]` header and a continuation hint so the model knows exactly where to call next.
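A client that wants to follow the header programmatically could parse it and compute the next range itself. The sketch below assumes the header appears literally in the form `[Pages X–Y of Z total]` (with either an en dash or a hyphen) and that the next chunk should again be 50 pages. `parsePageHeader` and `nextPageRange` are hypothetical names, not part of the server.

```javascript
// Matches the documented header shape; accepts an en dash or a plain hyphen.
const HEADER_PATTERN = /\[Pages (\d+)[–-](\d+) of (\d+) total\]/;

// Hypothetical helper: extract { start, end, total } from a response, or null.
function parsePageHeader(responseText) {
  const match = HEADER_PATTERN.exec(responseText);
  if (!match) return null;
  const [start, end, total] = match.slice(1).map(Number);
  return { start, end, total };
}

// Hypothetical helper: the next page range to request, or null when the
// previous response already reached the last page.
function nextPageRange(header, chunkSize = 50) {
  if (header.end >= header.total) return null;
  const start = header.end + 1;
  const end = Math.min(header.total, start + chunkSize - 1);
  return { start, end };
}
```

Returning `null` at the end of the document gives the caller an explicit stop condition instead of an out-of-range request.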

## Quick start (Claude Desktop / npx)

Add to your Claude Desktop config (`~/Library/Application Support/Claude/claude_desktop_config.json` on macOS):

```json
{
  "mcpServers": {
    "documents": {
      "command": "npx",
      "args": ["-y", "@hanna84/mcp-document-reader", "--stdio", "--dir", "/path/to/your/docs"]
    }
  }
}
```

No separate install step is needed: `npx` downloads the package on first run.

## Self-hosted (Docker / SSE)

```yaml
# docker-compose.yml
services:
  mcp-document-reader:
    image: node:22-alpine
    working_dir: /app
    command: node index.js
    environment:
      DOC_DIR: /docs
      HTTP_PORT: "3000"
    volumes:
      - ./docs:/docs:ro
    ports:
      - "3000:3000"
```

Register the SSE endpoint in your MCP gateway:
```json
{ "url": "http://mcp-document-reader:3000/sse" }
```

A `GET /healthz` endpoint is available for Docker healthchecks.
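Outside Docker, the same endpoint can be probed from a script. This is a minimal sketch that assumes any 2xx status means the server is healthy; the README does not specify the response status or body. `healthUrl` and `isHealthy` are hypothetical helpers, and the `fetchImpl` parameter exists only so the probe can be exercised without a running server.

```javascript
// Hypothetical helper: build the /healthz URL from any URL on the same host.
function healthUrl(baseUrl) {
  return new URL("/healthz", baseUrl).href;
}

// Hypothetical helper: resolves to true on a 2xx response (an assumption),
// and to false on any other status or on a network error.
// Node.js 18+ provides a global fetch, which is used by default.
async function isHealthy(baseUrl, fetchImpl = globalThis.fetch) {
  try {
    const response = await fetchImpl(healthUrl(baseUrl));
    return response.ok;
  } catch {
    return false;
  }
}
```

For example, `isHealthy("http://localhost:3000")` would probe the port published by the compose file above.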

## CLI reference

```
node index.js [options]

  --dir    Directory to serve documents from (default: $DOC_DIR or cwd)
  --port   HTTP port for SSE mode (default: $HTTP_PORT or 3000)
  --stdio  Use stdio transport instead of SSE
```

## Environment variables

| Variable | Default | Description |
|---|---|---|
| `DOC_DIR` | `.` | Document directory |
| `HTTP_PORT` | `3000` | SSE server port |
| `MCP_TRANSPORT` | — | Set to `stdio` to use stdio transport |
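Read together with the CLI reference, the table implies a precedence: a command-line flag wins, then the environment variable, then the built-in default. The sketch below spells that out under the assumption of simple hand-rolled argument parsing. `resolveConfig` is a hypothetical function and may not match how the server actually parses its options.

```javascript
// Hypothetical helper, not the server's code.
// argv: arguments after the script name; env: e.g. process.env; cwd: e.g. process.cwd().
function resolveConfig(argv, env, cwd) {
  // Return the token following a flag, or undefined if the flag is absent.
  const flagValue = (name) => {
    const i = argv.indexOf(name);
    return i !== -1 && i + 1 < argv.length ? argv[i + 1] : undefined;
  };

  // Precedence: flag, then environment variable, then default.
  const dir = flagValue("--dir") ?? env.DOC_DIR ?? cwd;
  const port = Number(flagValue("--port") ?? env.HTTP_PORT ?? 3000);

  // Either --stdio or MCP_TRANSPORT=stdio selects stdio; otherwise SSE.
  const useStdio = argv.includes("--stdio") || env.MCP_TRANSPORT === "stdio";
  return { dir, port, transport: useStdio ? "stdio" : "sse" };
}
```

A typical call would be `resolveConfig(process.argv.slice(2), process.env, process.cwd())`.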

## Supported formats

- PDF (`.pdf`)

## Requirements

Node.js >= 18