https://github.com/gmickel/gno
Local AI-powered document search and editing with first-in-class hybrid retrieval, LLM answers, WebUI, REST API and MCP support for AI clients.
ai-assistant bun cli code-search document-search embeddings knowledge-base llm local-first mcp offline pkm rag second-brain semantic-search typescript vector-search
- Host: GitHub
- URL: https://github.com/gmickel/gno
- Owner: gmickel
- License: mit
- Created: 2025-12-16T15:58:15.000Z (3 months ago)
- Default Branch: main
- Last Pushed: 2026-01-11T23:42:32.000Z (3 months ago)
- Last Synced: 2026-01-12T03:24:00.419Z (3 months ago)
- Topics: ai-assistant, bun, cli, code-search, document-search, embeddings, knowledge-base, llm, local-first, mcp, offline, pkm, rag, second-brain, semantic-search, typescript, vector-search
- Language: TypeScript
- Homepage: https://www.gno.sh
- Size: 14.7 MB
- Stars: 17
- Watchers: 0
- Forks: 0
- Open Issues: 1
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- Contributing: .github/CONTRIBUTING.md
- License: LICENSE
- Agents: AGENTS.md
# GNO
**Your Local Second Brain**: Index, search, and synthesize your entire digital life.
[npm](https://www.npmjs.com/package/@gmickel/gno) · [License](./LICENSE) · [Website](https://gno.sh) · [Twitter](https://twitter.com/gmickel) · [Discord](https://discord.gg/nHEmyJB5tg)
> **ClawdHub**: GNO skills bundled for Clawdbot — [clawdhub.com/gmickel/gno](https://clawdhub.com/gmickel/gno)

GNO is a local knowledge engine that turns your documents into a searchable, connected knowledge graph. Index notes, code, PDFs, and Office docs. Get hybrid search, AI answers with citations, and wiki-style note linking—all 100% offline.
---
## Contents
- [Quick Start](#quick-start)
- [Installation](#installation)
- [Search Modes](#search-modes)
- [Agent Integration](#agent-integration)
- [Web UI](#web-ui)
- [REST API](#rest-api)
- [How It Works](#how-it-works)
- [Features](#features)
- [Local Models](#local-models)
- [Architecture](#architecture)
- [Development](#development)
---
## What's New in v0.21
- **Ask CLI Query Modes**: `gno ask` now accepts repeatable `--query-mode term|intent|hyde` entries, matching the existing Ask API and Web controls
### v0.20
- **Improved Model Init Fallbacks**: upgraded `node-llama-cpp` to `3.17.1` and switched to `build: "autoAttempt"` for better backend selection/fallback behavior
### v0.19
- **Exclusion Filters**: explicit `exclude` controls across CLI, API, Web, and MCP to hard-prune unwanted docs by title/path/body text
- **Ask Query-Mode Parity**: Ask now supports structured `term` / `intent` / `hyde` controls in both API and Web UI
### v0.18
- **Intent Steering**: optional `intent` control for ambiguous queries across CLI, API, Web, and MCP query flows
- **Rerank Controls**: `candidateLimit` lets you tune rerank cost vs. recall on slower or memory-constrained machines
- **Stability**: query expansion now uses a bounded configurable context size (`models.expandContextSize`, default `2048`)
- **Rerank Efficiency**: identical chunk texts are deduplicated before scoring and expanded back out deterministically
### v0.17
- **Structured Query Modes**: `term`, `intent`, and `hyde` controls across CLI, API, MCP, and Web
- **Temporal Retrieval Upgrades**: `since`/`until`, date-range parsing, and recency sorting with frontmatter-date fallback
- **Web Retrieval UX Polish**: richer advanced controls in Search and Ask (collection/date/category/author/tags + query modes)
- **Metadata-Aware Retrieval**: ingestion now materializes document metadata/date fields for better filtering and ranking
- **Migration Reliability**: SQLite-compatible migration path for existing indexes (including older SQLite engines)
### v0.15
- **HTTP Backends**: Offload embedding, reranking, and generation to remote GPU servers
  - Simple URI config: `http://host:port/path#modelname`
  - Works with llama-server, Ollama, LocalAI, vLLM
  - Run GNO on lightweight machines while GPU inference runs on your network
### v0.13
- **Knowledge Graph**: Interactive force-directed visualization of document connections
- **Graph with Similarity**: See semantic similarity as golden edges (not just wiki/markdown links)
- **CLI**: `gno graph` command with collection filtering and similarity options
- **Web UI**: `/graph` page with zoom, pan, collection filter, similarity toggle
- **MCP**: `gno_graph` tool for AI agents to explore document relationships
- **REST API**: `/api/graph` endpoint with full query parameters
### v0.12
- **Note Linking**: Wiki-style `[[links]]`, backlinks, and AI-powered related notes
- **Tag System**: Filter searches by frontmatter tags with `--tags-any`/`--tags-all`
- **Web UI**: Outgoing links panel, backlinks panel, related notes sidebar
- **CLI**: `gno links`, `gno backlinks`, `gno similar` commands
- **MCP**: `gno_links`, `gno_backlinks`, `gno_similar` tools
---
## Quick Start
```bash
gno init ~/notes --name notes # Point at your docs
gno index # Build search index
gno query "auth best practices" # Hybrid search
gno ask "summarize the API" --answer # AI answer with citations
```

---
## Installation
### Install GNO
Requires [Bun](https://bun.sh/) >= 1.0.0.
```bash
bun install -g @gmickel/gno
```
**macOS**: Vector search requires Homebrew SQLite:
```bash
brew install sqlite3
```
Verify everything works:
```bash
gno doctor
```
### Connect to AI Agents
#### MCP Server (Claude Desktop, Cursor, Zed, etc.)
One command to add GNO to your AI assistant:
```bash
gno mcp install # Claude Desktop (default)
gno mcp install --target cursor # Cursor
gno mcp install --target claude-code # Claude Code CLI
gno mcp install --target zed # Zed
gno mcp install --target windsurf # Windsurf
gno mcp install --target codex # OpenAI Codex CLI
gno mcp install --target opencode # OpenCode
gno mcp install --target amp # Amp
gno mcp install --target lmstudio # LM Studio
gno mcp install --target librechat # LibreChat
```
Check status: `gno mcp status`
#### Skills (Claude Code, Codex, OpenCode)
Skills integrate via CLI with no MCP overhead:
```bash
gno skill install --scope user # User-wide
gno skill install --target codex # Codex
gno skill install --target all # Both Claude + Codex
```
> **Full setup guide**: [MCP Integration](https://gno.sh/docs/MCP/) · [CLI Reference](https://gno.sh/docs/CLI/)
---
## Search Modes
| Command | Mode | Best For |
| :----------------- | :------------------ | :---------------------------------------- |
| `gno search` | Document-level BM25 | Exact phrases, code identifiers |
| `gno vsearch` | Contextual Vector | Natural language, concepts |
| `gno query` | Hybrid | Best accuracy (BM25 + vector + reranking) |
| `gno ask --answer` | RAG | Direct answers with citations |
**BM25** indexes full documents (not chunks) with Snowball stemming, so "running" matches "run".
**Vector** embeds chunks with document titles for context awareness.
All retrieval modes also support metadata filters: `--since`, `--until`, `--category`, `--author`, `--tags-all`, `--tags-any`.
```bash
gno search "handleAuth" # Find exact matches
gno vsearch "error handling patterns" # Semantic similarity
gno query "database optimization" # Full pipeline
gno query "meeting decisions" --since "last month" --category "meeting,notes" --author "gordon"
gno query "performance" --intent "web performance and latency"
gno query "performance" --exclude "reviews,hiring"
gno ask "what did we decide" --answer # AI synthesis
```
Output formats: `--json`, `--files`, `--csv`, `--md`, `--xml`
### Retrieval V2 Controls
Existing query calls still work. Retrieval v2 adds optional structured intent control and deeper explain output.
```bash
# Existing call (unchanged)
gno query "auth flow" --thorough
# Structured retrieval intent
gno query "auth flow" \
  --intent "web authentication and token lifecycle" \
  --candidate-limit 12 \
  --query-mode term:"jwt refresh token -oauth1" \
  --query-mode intent:"how refresh token rotation works" \
  --query-mode hyde:"Refresh tokens rotate on each use and previous tokens are revoked." \
  --explain
```
- Modes: `term` (BM25-focused), `intent` (semantic-focused), `hyde` (single hypothetical passage)
- Explain includes stage timings, fallback/cache counters, and per-result score components
- `gno ask --json` includes `meta.answerContext` for adaptive source selection traces
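
To make the `--query-mode` syntax concrete, here is a minimal TypeScript sketch of how repeatable `mode:text` entries could be parsed. `parseQueryMode` and `QueryModeEntry` are hypothetical names used for illustration, not GNO's internal API. Only the three mode names and the `mode:text` shape come from the text above. The shell strips the surrounding quotes, so `term:"jwt refresh"` arrives as `term:jwt refresh`.

```typescript
// Hypothetical sketch: parse repeatable `--query-mode mode:text` values.
// Names and error handling are assumptions, not GNO's actual implementation.
type QueryMode = "term" | "intent" | "hyde";

interface QueryModeEntry {
  mode: QueryMode;
  text: string;
}

const QUERY_MODES: readonly QueryMode[] = ["term", "intent", "hyde"];

function parseQueryMode(raw: string): QueryModeEntry {
  // Split on the first colon only, so the text itself may contain colons.
  const idx = raw.indexOf(":");
  if (idx === -1) {
    throw new Error(`Expected mode:text, got "${raw}"`);
  }
  const mode = raw.slice(0, idx).trim();
  const text = raw.slice(idx + 1).trim();
  if (!QUERY_MODES.includes(mode as QueryMode)) {
    throw new Error(`Unknown query mode "${mode}"`);
  }
  if (text.length === 0) {
    throw new Error(`Query mode "${mode}" needs non-empty text`);
  }
  return { mode: mode as QueryMode, text };
}

// Values as they would arrive after the shell has removed the quotes.
const entries = [
  "term:jwt refresh token -oauth1",
  "intent:how refresh token rotation works",
].map(parseQueryMode);
console.log(entries);
```
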
---
## Agent Integration
Give your local LLM agents a long-term memory. GNO integrates as a Claude Code skill or MCP server, allowing agents to search, read, and cite your local files.
### Skills
Skills add GNO search to Claude Code/Codex without MCP protocol overhead:
```bash
gno skill install --scope user
```

Then ask your agent: _"Search my notes for the auth discussion"_
[Skill setup guide →](https://gno.sh/docs/integrations/skills/)
### MCP Server
Connect GNO to Claude Desktop, Cursor, Raycast, and more.

GNO exposes tools via [Model Context Protocol](https://modelcontextprotocol.io):
| Tool | Description |
| :-------------- | :------------------------------------ |
| `gno_search` | BM25 keyword search |
| `gno_vsearch` | Vector semantic search |
| `gno_query` | Hybrid search (recommended) |
| `gno_get` | Retrieve document by ID |
| `gno_multi_get` | Batch document retrieval |
| `gno_links` | Get outgoing links from document |
| `gno_backlinks` | Get documents linking TO document |
| `gno_similar` | Find semantically similar documents |
| `gno_graph` | Get knowledge graph (nodes and edges) |
| `gno_status` | Index health check |
**Design**: MCP tools are retrieval-only. Your AI assistant (Claude, GPT-4) synthesizes answers from retrieved context. Best retrieval (GNO) + best reasoning (your LLM).
[MCP setup guide →](https://gno.sh/docs/MCP/)
---
## Web UI
Visual dashboard for search, browsing, editing, and AI answers. Right in your browser.
```bash
gno serve # Start on port 3000
gno serve --port 8080 # Custom port
```

Open `http://localhost:3000` to:
- **Search**: BM25, vector, or hybrid modes with visual results
- **Browse**: Paginated document list, filter by collection
- **Edit**: Create, edit, and delete documents with live preview
- **Ask**: AI-powered Q&A with citations
- **Manage Collections**: Add, remove, and re-index collections
- **Switch presets**: Change models live without restart
### Search

Three retrieval modes: BM25 (keyword), Vector (semantic), or Hybrid (best of both). Adjust search depth for speed vs thoroughness.
### Document Editing

Full-featured markdown editor with:
| Feature | Description |
| :---------------------- | :----------------------------------- |
| **Split View** | Side-by-side editor and live preview |
| **Auto-save** | 2-second debounced saves |
| **Syntax Highlighting** | CodeMirror 6 with markdown support |
| **Keyboard Shortcuts** | ⌘S save, ⌘B bold, ⌘I italic, ⌘K link |
| **Quick Capture** | ⌘N creates new note from anywhere |
### Document Viewer

View documents with full context: outgoing links, backlinks, and AI-powered related notes sidebar.
### Knowledge Graph

Interactive visualization of document connections. Wiki links, markdown links, and optional similarity edges rendered as a navigable constellation.
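
As a rough illustration of where link edges can come from, the sketch below extracts wiki-style `[[links]]` from note bodies and derives backlinks from them. It assumes the `[[Target]]` and `[[Target|label]]` forms; the function names are hypothetical, and GNO's real parser and link-resolution rules may differ.

```typescript
// Sketch: extract wiki-style [[links]] and derive backlinks.
// Assumptions: `[[Target]]` and `[[Target|label]]` forms only; this is not
// GNO's actual parser.
function extractWikiLinks(markdown: string): string[] {
  const targets: string[] = [];
  const pattern = /\[\[([^\[\]|]+)(?:\|[^\[\]]*)?\]\]/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(markdown)) !== null) {
    const target = (match[1] ?? "").trim();
    // Record each target once per document.
    if (target.length > 0 && !targets.includes(target)) {
      targets.push(target);
    }
  }
  return targets;
}

// Map each link target to the list of documents that link to it.
function buildBacklinks(docs: Map<string, string>): Map<string, string[]> {
  const backlinks = new Map<string, string[]>();
  docs.forEach((body, name) => {
    for (const target of extractWikiLinks(body)) {
      const sources = backlinks.get(target) ?? [];
      sources.push(name);
      backlinks.set(target, sources);
    }
  });
  return backlinks;
}

const backlinks = buildBacklinks(
  new Map([
    ["auth.md", "See [[tokens]] and [[deploy|the deploy guide]]."],
    ["deploy.md", "Depends on [[tokens]]."],
  ]),
);
console.log(backlinks);
```

Here `tokens` ends up with two incoming edges (from `auth.md` and `deploy.md`) and `deploy` with one.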
### Collections Management

- Add collections with folder path input
- View document count, chunk count, embedding status
- Re-index individual collections
- Remove collections (documents preserved)
### AI Answers

Ask questions in natural language. GNO searches your documents and synthesizes answers with inline citations linking to sources.
Everything runs locally. No cloud, no accounts, no data leaving your machine.
> **Detailed docs**: [Web UI Guide](https://gno.sh/docs/WEB-UI/)
---
## REST API
Programmatic access to all GNO features via HTTP.
```bash
# Hybrid search
curl -X POST http://localhost:3000/api/query \
  -H "Content-Type: application/json" \
  -d '{"query": "authentication patterns", "limit": 10}'
# AI answer
curl -X POST http://localhost:3000/api/ask \
  -H "Content-Type: application/json" \
  -d '{"query": "What is our deployment process?"}'
# Index status
curl http://localhost:3000/api/status
```
| Endpoint | Method | Description |
| :------------------------- | :----- | :-------------------------- |
| `/api/query` | POST | Hybrid search (recommended) |
| `/api/search` | POST | BM25 keyword search |
| `/api/ask` | POST | AI-powered Q&A |
| `/api/docs` | GET | List documents |
| `/api/docs` | POST | Create document |
| `/api/docs/:id` | PUT | Update document content |
| `/api/docs/:id/deactivate` | POST | Remove from index |
| `/api/doc` | GET | Get document content |
| `/api/collections` | POST | Add collection |
| `/api/collections/:name` | DELETE | Remove collection |
| `/api/sync` | POST | Trigger re-index |
| `/api/status` | GET | Index statistics |
| `/api/presets` | GET | List model presets |
| `/api/presets` | POST | Switch preset |
| `/api/models/pull` | POST | Download models |
| `/api/models/status` | GET | Download progress |
No authentication. No rate limits. Build custom tools, automate workflows, integrate with any language.
> **Full reference**: [API Documentation](https://gno.sh/docs/API/)
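
For callers who prefer code over `curl`, the following TypeScript sketch wraps `POST /api/query`. Only the endpoint path and the `query`/`limit` fields are taken from the example above. The helper names and the default base URL are assumptions, and the response is typed as `unknown` because its schema is not shown in this README. With `gno serve` running, a call would look like `await queryGno({ query: "authentication patterns", limit: 10 })`.

```typescript
// Sketch of a small client for POST /api/query.
// Only the endpoint path and the `query`/`limit` fields come from the curl
// example; helper names and the base URL default are assumptions.
interface QueryBody {
  query: string;
  limit?: number;
}

interface PreparedRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Pure helper: builds the request without touching the network.
function buildQueryRequest(
  body: QueryBody,
  baseUrl = "http://localhost:3000",
): PreparedRequest {
  return {
    url: `${baseUrl}/api/query`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(body),
    },
  };
}

async function queryGno(body: QueryBody, baseUrl?: string): Promise<unknown> {
  const { url, init } = buildQueryRequest(body, baseUrl);
  const res = await fetch(url, init);
  if (!res.ok) {
    throw new Error(`GNO query failed: ${res.status} ${res.statusText}`);
  }
  // The response schema is not documented here, so it stays `unknown`.
  return res.json();
}
```
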
---
## How It Works
```mermaid
graph TD
A[User Query] --> B(Query Expansion)
B --> C{Lexical Variants}
B --> D{Semantic Variants}
B --> E{HyDE Passage}
C --> G(BM25 Search)
D --> H(Vector Search)
E --> H
A --> G
A --> H
G --> I(Ranked Results)
H --> J(Ranked Results)
I --> K{RRF Fusion}
J --> K
K --> L(Top 20 Candidates)
L --> M(Cross-Encoder Rerank)
M --> N[Final Results]
```
0. **Strong Signal Check**: Skip expansion if BM25 has confident match (saves 1-3s)
1. **Query Expansion**: LLM generates lexical variants, semantic rephrases, and a [HyDE](https://arxiv.org/abs/2212.10496) passage
2. **Parallel Retrieval**: Document-level BM25 + chunk-level vector search on all variants
3. **Fusion**: RRF with 2× weight for original query, tiered bonus for top ranks
4. **Reranking**: Qwen3-Reranker scores best chunk per document (4K), blended with fusion
> **Deep dive**: [How Search Works](https://gno.sh/docs/HOW-SEARCH-WORKS/)
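
The fusion step can be sketched as weighted Reciprocal Rank Fusion, where each document scores the sum of `weight / (k + rank)` over every ranked list it appears in. The version below is an illustration under stated assumptions, not GNO's implementation: `k = 60` is a conventional default rather than a value from this README, the 2× weight for the original query follows step 3, and the tiered top-rank bonus and rerank blending are left out because their exact values are not given here.

```typescript
// Sketch of weighted Reciprocal Rank Fusion (RRF).
// Assumptions: k = 60 (a common default), weight 2 for lists produced by the
// original query, weight 1 for expansion variants. The tiered top-rank bonus
// and rerank blending are intentionally omitted.
interface RankedList {
  docIds: string[];      // best match first
  fromOriginal: boolean; // true if produced by the unexpanded user query
}

function rrfFuse(lists: RankedList[], k = 60): Array<[string, number]> {
  const scores = new Map<string, number>();
  for (const list of lists) {
    const weight = list.fromOriginal ? 2 : 1;
    list.docIds.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + weight / (k + rank));
    });
  }
  // Highest fused score first; ties broken by id for deterministic output.
  return Array.from(scores.entries()).sort(
    (a, b) => b[1] - a[1] || a[0].localeCompare(b[0]),
  );
}

const fused = rrfFuse([
  { docIds: ["a", "b", "c"], fromOriginal: true }, // BM25, original query
  { docIds: ["b", "a", "d"], fromOriginal: true }, // vector, original query
  { docIds: ["d", "b"], fromOriginal: false },     // vector, HyDE passage
]);
console.log(fused.map(([id]) => id));
```
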
---
## Features
| Feature | Description |
| :------------------- | :----------------------------------------------------------------------------- |
| **Hybrid Search** | BM25 + vector + RRF fusion + cross-encoder reranking |
| **Document Editor** | Create, edit, delete docs with live markdown preview |
| **Web UI** | Visual dashboard for search, browse, edit, and AI Q&A |
| **REST API** | HTTP API for custom tools and integrations |
| **Multi-Format** | Markdown, PDF, DOCX, XLSX, PPTX, plain text |
| **Local LLM** | AI answers via llama.cpp, no API keys |
| **Remote Inference** | Offload to GPU servers via HTTP (llama-server, Ollama, LocalAI) |
| **Privacy First** | 100% offline, zero telemetry, your data stays yours |
| **MCP Server** | Works with Claude Desktop, Cursor, Zed, + 8 more |
| **Collections** | Organize sources with patterns, excludes, contexts |
| **Tag Filtering** | Frontmatter tags with hierarchical paths, filter via `--tags-any`/`--tags-all` |
| **Note Linking** | Wiki links, backlinks, related notes, cross-collection navigation |
| **Multilingual** | 30+ languages, auto-detection, cross-lingual search |
| **Incremental** | SHA-256 tracking, only changed files re-indexed |
| **Keyboard First** | ⌘N capture, ⌘K search, ⌘/ shortcuts, ⌘S save |
---
## Local Models
Models auto-download on first use to `~/.cache/gno/models/`. For deterministic startup, set `GNO_NO_AUTO_DOWNLOAD=1` and use `gno models pull` explicitly. Alternatively, offload to a GPU server on your network using HTTP backends.
| Model | Purpose | Size |
| :------------------ | :------------------------------------ | :----------- |
| bge-m3 | Embeddings (1024-dim, multilingual) | ~500MB |
| Qwen3-Reranker-0.6B | Cross-encoder reranking (32K context) | ~700MB |
| Qwen/SmolLM | Query expansion + AI answers | ~600MB-1.2GB |
### Model Presets
| Preset | Disk | Best For |
| :--------- | :----- | :--------------------------- |
| `slim` | ~1GB | Fast, good quality (default) |
| `balanced` | ~2GB | Slightly larger model |
| `quality` | ~2.5GB | Best answers |
```bash
gno models use slim
gno models pull --all # Optional: pre-download models (auto-downloads on first use)
```
### HTTP Backends (Remote GPU)
Offload inference to a GPU server on your network:
```yaml
# ~/.config/gno/config.yaml
models:
  activePreset: remote-gpu
  presets:
    - id: remote-gpu
      name: Remote GPU Server
      embed: "http://192.168.1.100:8081/v1/embeddings#bge-m3"
      rerank: "http://192.168.1.100:8082/v1/completions#reranker"
      gen: "http://192.168.1.100:8083/v1/chat/completions#qwen3-4b"
```
Works with llama-server, Ollama, LocalAI, vLLM, or any OpenAI-compatible server.
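
The URI format packs two things into one string: the endpoint to call and, after `#`, the model name. A minimal TypeScript sketch of splitting the two is shown below. `parseBackendUri` is a hypothetical helper rather than part of GNO, and accepting `https:` alongside `http:` is an assumption.

```typescript
// Sketch: split a backend URI of the form http://host:port/path#modelname
// into an endpoint URL and a model name. `parseBackendUri` is a hypothetical
// helper, not part of GNO's public API.
interface HttpBackend {
  endpoint: string; // URL without the fragment
  model: string;    // text after '#'
}

function parseBackendUri(uri: string): HttpBackend {
  const url = new URL(uri); // throws TypeError on malformed input
  // Assumption: https is accepted as well as http.
  if (url.protocol !== "http:" && url.protocol !== "https:") {
    throw new Error(`Unsupported protocol: ${url.protocol}`);
  }
  const model = decodeURIComponent(url.hash.slice(1));
  if (model.length === 0) {
    throw new Error(`Missing #modelname in "${uri}"`);
  }
  url.hash = ""; // drop the fragment so only the endpoint remains
  return { endpoint: url.toString(), model };
}

console.log(parseBackendUri("http://192.168.1.100:8081/v1/embeddings#bge-m3"));
```

Keeping the model name in the fragment means it is never sent as part of the request URL, since HTTP clients do not transmit fragments.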
> **Configuration**: [Model Setup](https://gno.sh/docs/CONFIGURATION/)
---
## Architecture
```
┌─────────────────────────────────────────────────┐
│ GNO CLI / MCP / Web UI / API │
├─────────────────────────────────────────────────┤
│ Ports: Converter, Store, Embedding, Rerank │
├─────────────────────────────────────────────────┤
│ Adapters: SQLite, FTS5, sqlite-vec, llama-cpp │
├─────────────────────────────────────────────────┤
│ Core: Identity, Mirrors, Chunking, Retrieval │
└─────────────────────────────────────────────────┘
```
> **Details**: [Architecture](https://gno.sh/docs/ARCHITECTURE/)
---
## Development
```bash
git clone https://github.com/gmickel/gno.git && cd gno
bun install
bun test
bun run lint && bun run typecheck
```
> **Contributing**: [CONTRIBUTING.md](.github/CONTRIBUTING.md)
### Evals and Benchmark Deltas
Use retrieval benchmark commands to track quality and latency over time:
```bash
bun run eval:hybrid
bun run eval:hybrid:baseline
bun run eval:hybrid:delta
```
- Benchmark guide: [evals/README.md](./evals/README.md)
- Latest baseline snapshot: [evals/fixtures/hybrid-baseline/latest.json](./evals/fixtures/hybrid-baseline/latest.json)
---
## License
[MIT](./LICENSE)
---
made with ❤️ by @gmickel