https://github.com/nduckmink/arkon
Arkon: Enterprise AI Knowledge Hub & MCP Server. Self-hosted knowledge base for teams to manage RAG contexts, access policies, and AI skills. Connect Claude and other LLMs via Model Context Protocol (MCP) for automated, secure organizational knowledge integration.
- Host: GitHub
- URL: https://github.com/nduckmink/arkon
- Owner: nduckmink
- License: other
- Created: 2026-04-30T15:11:39.000Z (about 1 month ago)
- Default Branch: main
- Last Pushed: 2026-05-14T05:41:10.000Z (about 1 month ago)
- Last Synced: 2026-05-14T06:36:03.833Z (about 1 month ago)
- Topics: enterprise-ai, knowledge-base-embeddings, knowledge-based-systems, knowledge-bases, llm-wiki-personal-knowledge-base, mcp, mcp-server, model-context-protocol, organizational-knowledge, rag, self-hosted, wiki
- Language: Python
- Homepage:
- Size: 6.5 MB
- Stars: 697
- Watchers: 7
- Forks: 132
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
README
# Arkon - The Open-Source Enterprise AI Knowledge Hub & MCP Server
**Arkon** is a self-hosted, enterprise-grade knowledge management layer that bridges organizational data and AI clients. It runs as a centralized **MCP Server** (Model Context Protocol), compiling your SOPs, policies, and internal docs into a structured, traceable knowledge wiki - then serving that wiki to Claude and other LLMs through a single permission-scoped endpoint.
## Why Arkon?
In most organizations, AI adoption is fragmented. Employees copy-paste documents into chatbots, producing inconsistent context, security risks, and duplicated work.
**Arkon treats AI as a managed organizational resource.** Every employee gets the right context, automatically and securely - filtered by their department, project membership, and role.
---
## Key Features
### Intelligent Knowledge Wiki - the MRP Pipeline
Unlike a vector database that only chunks and indexes, Arkon's **MRP pipeline** (**M**ap → **R**educe → **P**lan-review → **R**efine → **V**erify → Commit) compiles documents into a coherent wiki of interlinked pages.
- **Plan review before write:** every ingestion produces a human-reviewable plan listing which wiki pages will be created or updated. Editors can regenerate the plan with feedback before any page is written.
- **Page merge instead of overwrite:** when a new source touches an existing wiki page, content is LLM-merged so prior knowledge is never lost.
- **Traceable claims:** every page records the source documents it was compiled from.
- **Image-aware:** vision captions are baked into source text before compilation, so wiki pages reference the right images in the right places.
- **Resumable:** drafts persist mid-pipeline; a crashed run resumes without re-doing the expensive LLM work.
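The resumable behaviour described above can be pictured with a small checkpointing loop. This is a minimal sketch, not Arkon's implementation: the stage names come from the pipeline description, while `run_pipeline`, the JSON draft file, and the handler functions are hypothetical stand-ins for the real LLM-backed stages.

```python
import json
import tempfile
from pathlib import Path

# Stage names follow the pipeline description; everything else is hypothetical.
STAGES = ["map", "reduce", "plan_review", "refine", "verify", "commit"]


def run_pipeline(draft_path: Path, handlers: dict) -> dict:
    """Run each stage once, persisting the draft after every completed stage."""
    if draft_path.exists():
        draft = json.loads(draft_path.read_text())
    else:
        draft = {"done": [], "data": {}}
    for stage in STAGES:
        if stage in draft["done"]:
            continue  # completed in an earlier run, so the expensive work is skipped
        draft["data"][stage] = handlers[stage](draft["data"])
        draft["done"].append(stage)
        draft_path.write_text(json.dumps(draft))  # checkpoint after each stage
    return draft


calls = []  # records which stages actually executed


def make_handler(name: str, fail: bool = False):
    def handler(data: dict) -> str:
        if fail:
            raise RuntimeError(f"{name} crashed")
        calls.append(name)
        return f"{name}-output"
    return handler


draft_file = Path(tempfile.mkdtemp()) / "draft.json"
handlers = {stage: make_handler(stage) for stage in STAGES}

# First run: simulate a crash in the "refine" stage.
handlers["refine"] = make_handler("refine", fail=True)
try:
    run_pipeline(draft_file, handlers)
except RuntimeError:
    pass

# Second run: the first three stages are read from the draft, not re-executed.
handlers["refine"] = make_handler("refine")
result = run_pipeline(draft_file, handlers)
```

After the second run, `calls` lists every stage exactly once, which is the property a resumable pipeline is meant to guarantee.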
### Wiki Browser & Knowledge Graph
- Three-panel layout: page tree, content, backlinks & outlinks.
- Full-text + semantic (pgvector) search.
- Interactive knowledge graph visualization (per-scope or global).
- Wikilink-style cross-references between pages.
- Version history and rollback on every page.
- Draft proposal → editor review → approval workflow.
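To illustrate how backlinks and outlinks can be derived from wikilink-style references, here is a minimal sketch. The `[[Page Title]]` / `[[Page Title|label]]` syntax, the `outlinks` and `backlinks` helpers, and the sample pages are all assumptions made for the example; the README does not specify Arkon's actual link syntax or storage.

```python
import re
from collections import defaultdict

# Assumed link syntax: [[Page Title]] or [[Page Title|label]].
WIKILINK = re.compile(r"\[\[([^\]|]+)(?:\|[^\]]*)?\]\]")


def outlinks(text: str) -> list[str]:
    """Return link targets in order of first appearance, without duplicates."""
    seen = []
    for match in WIKILINK.finditer(text):
        target = match.group(1).strip()
        if target not in seen:
            seen.append(target)
    return seen


def backlinks(pages: dict[str, str]) -> dict[str, list[str]]:
    """Invert the outlink graph: target title -> sorted titles that link to it."""
    index = defaultdict(set)
    for title, text in pages.items():
        for target in outlinks(text):
            index[target].add(title)
    return {target: sorted(sources) for target, sources in index.items()}


pages = {
    "Onboarding": "Start with [[Leave Policy]] and [[Security Basics|security]].",
    "Leave Policy": "Managers approve leave; see [[Onboarding]].",
    "Security Basics": "Required reading before [[Onboarding]] ends.",
}
graph = backlinks(pages)
```

The same inverted index is what a knowledge-graph view would draw as edges, restricted to whichever scope is being displayed.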
### Workspaces (Department & Project Scopes)
Cross-functional contexts with their own scoped wiki, document set, and member roster.
- **Department-level isolation** for HR, Legal, Engineering, etc.
- **Project workspaces** for cross-functional initiatives or clients.
- **Hard scope enforcement:** members only see knowledge from their assigned scopes - at the API, MCP, and search layers.
### Fine-Grained RBAC
Role-based access control at department + workspace level.
- Built-in roles: **Viewer · Contributor · Editor · Admin** (and admin-defined custom roles).
- Granular permissions (`doc:read:own_dept`, `wiki:edit:all`, `org:settings:manage`, ...).
- **Audit log** for every privileged action - settings changes, plan approvals, role updates.
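A permission check over strings like those above might look roughly as follows. This is a sketch under stated assumptions: the role-to-permission grants, the `doc:read:all` permission, and the `User`, `can_read_doc`, and `has_permission` names are invented for illustration and are not taken from Arkon's code.

```python
from dataclasses import dataclass

# Hypothetical grants per role. The string format follows the README's examples;
# "doc:read:all" is an assumed counterpart to "doc:read:own_dept".
ROLE_PERMISSIONS = {
    "Viewer": {"doc:read:own_dept"},
    "Editor": {"doc:read:own_dept", "wiki:edit:all"},
    "Admin": {"doc:read:all", "wiki:edit:all", "org:settings:manage"},
}


@dataclass(frozen=True)
class User:
    name: str
    role: str
    department: str


def has_permission(user: User, permission: str) -> bool:
    """Exact-match check against the permissions granted to the user's role."""
    return permission in ROLE_PERMISSIONS.get(user.role, set())


def can_read_doc(user: User, doc_department: str) -> bool:
    """Org-wide read with doc:read:all, otherwise only inside the user's department."""
    if has_permission(user, "doc:read:all"):
        return True
    return has_permission(user, "doc:read:own_dept") and user.department == doc_department


viewer = User("lan", "Viewer", "HR")
admin = User("minh", "Admin", "Engineering")
```

Unknown roles resolve to an empty permission set, so the check fails closed.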
### MCP Server for Claude & Other AI Clients
Employees connect Claude Desktop (or any MCP-compatible client) to Arkon with a personal token. The MCP server exposes:
- **Wiki tools** - `search_wiki`, `read_wiki_page`, `list_wiki_pages`, `read_wiki_index`.
- **Source drill-down** - `get_source`, `get_source_outline`, `get_source_pages`, `list_sources`.
- **Edit workflow** - `propose_wiki_edit`, `edit_wiki_page`, `list_pending_drafts`, `review_draft`, `approve_draft`, `reject_draft`.
- **Discovery** - `list_knowledge_types`, `get_knowledge_type_docs`.
All tools enforce per-token scope (department, knowledge type, source list).
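Per-token scoping along the three axes named above (department, knowledge type, source list) can be sketched as a filter applied before any result leaves the server. The `TokenScope` and `Page` types and this toy `search_wiki` are hypothetical; it is a title-substring stand-in for the real tool, meant only to show where the scope check sits.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TokenScope:
    """Hypothetical scope bound to an MCP token; None means no restriction on that axis."""
    departments: frozenset
    knowledge_types: Optional[frozenset] = None
    source_ids: Optional[frozenset] = None


@dataclass(frozen=True)
class Page:
    title: str
    department: str
    knowledge_type: str
    source_id: str


def visible(page: Page, scope: TokenScope) -> bool:
    """A page is visible only if it passes every axis of the token's scope."""
    if page.department not in scope.departments:
        return False
    if scope.knowledge_types is not None and page.knowledge_type not in scope.knowledge_types:
        return False
    if scope.source_ids is not None and page.source_id not in scope.source_ids:
        return False
    return True


def search_wiki(pages: list, query: str, scope: TokenScope) -> list:
    """Toy search: the scope filter runs before matching, so hidden pages never surface."""
    needle = query.lower()
    return [p.title for p in pages if visible(p, scope) and needle in p.title.lower()]


pages = [
    Page("Leave Policy", "HR", "policy", "src-1"),
    Page("Payroll Policy", "Finance", "policy", "src-2"),
    Page("Leave Request SOP", "HR", "sop", "src-3"),
]
hr_policies = TokenScope(frozenset({"HR"}), knowledge_types=frozenset({"policy"}))
hr_all = TokenScope(frozenset({"HR"}))
```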
### AI Skills Distribution
Upload custom agent packages once and distribute them across the org.
- Versioned skill packages (.zip with `SKILL.md`).
- Department-scoped visibility.
- Contribution workflow for end-user updates.
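Before uploading a skill package, it can be useful to confirm that the archive actually contains a `SKILL.md`. The check below is a minimal sketch using only the standard library; where Arkon expects the file inside the archive is not stated here, so the helper accepts any depth and returns the shallowest match. `find_skill_manifest` and `build_package` are hypothetical names.

```python
import io
import zipfile
from typing import Optional


def find_skill_manifest(package: bytes) -> Optional[str]:
    """Return the archive path of the shallowest SKILL.md, or None if there is none."""
    with zipfile.ZipFile(io.BytesIO(package)) as archive:
        candidates = [
            name for name in archive.namelist()
            if name.rsplit("/", 1)[-1] == "SKILL.md"
        ]
    if not candidates:
        return None
    return min(candidates, key=lambda name: name.count("/"))


def build_package(files: dict) -> bytes:
    """Build an in-memory .zip from {path: text}; used here only to create test input."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, content in files.items():
            archive.writestr(name, content)
    return buffer.getvalue()


good = build_package({"SKILL.md": "# Demo skill", "scripts/run.py": "print('hi')"})
nested = build_package({"demo/SKILL.md": "# Demo skill"})
bad = build_package({"README.md": "no manifest here"})
```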
### Pluggable AI Providers
Catalog-driven selection - admins pick from a curated list with context window, cost, and capability metadata on display.
- **LLM:** Anthropic Claude (Opus / Sonnet / Haiku 4.x), Google Gemini (3.x Pro / Flash / Flash-Lite, 2.5), OpenAI (GPT-5.x, GPT-4.x).
- **Embedding:** Google `gemini-embedding-*`, OpenAI `text-embedding-3-*` - switchable with online re-embed migration (active model atomically flipped on completion, no zero-result search window).
- **Vision:** Google Gemini Flash, OpenAI GPT-4o family.
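The "atomic flip" idea behind online re-embedding can be shown with an in-memory toy: vectors for the new model are built in a shadow table while searches keep reading the old one, and the active-model pointer changes only once the shadow table is complete. This is a conceptual sketch, not Arkon's migration code; `EmbeddingStore`, the letter-count embedders, and the model names `v1`/`v2` are all made up.

```python
class EmbeddingStore:
    """Toy store: one vector table per model, plus a pointer to the active model."""

    def __init__(self, active_model, embedders):
        self.embedders = embedders          # model name -> function(text) -> vector
        self.active_model = active_model
        self.docs = {}                      # doc_id -> text
        self.vectors = {active_model: {}}   # model name -> {doc_id: vector}

    def add(self, doc_id, text):
        self.docs[doc_id] = text
        # Write to every table that exists, so a migration in progress stays complete.
        for model, table in self.vectors.items():
            table[doc_id] = self.embedders[model](text)

    def search(self, text):
        model = self.active_model           # read the pointer once per query
        query = self.embedders[model](text)
        table = self.vectors[model]

        def score(doc_id):
            return sum(a * b for a, b in zip(query, table[doc_id]))

        return sorted(table, key=score, reverse=True)

    def migrate(self, new_model, on_progress=None):
        if new_model == self.active_model:
            return
        shadow = self.vectors.setdefault(new_model, {})
        for doc_id, text in list(self.docs.items()):
            shadow[doc_id] = self.embedders[new_model](text)
            if on_progress:
                on_progress(doc_id)
        old_model = self.active_model
        self.active_model = new_model       # the flip: a single pointer update
        del self.vectors[old_model]


def embed_v1(text):
    return [text.count("a"), text.count("b")]


def embed_v2(text):
    return [text.count("a"), text.count("b"), text.count("c")]


store = EmbeddingStore("v1", {"v1": embed_v1, "v2": embed_v2})
store.add("d1", "aaa")
store.add("d2", "bbb")

observed = []  # (active model, number of results) sampled during the migration
store.migrate("v2", on_progress=lambda _: observed.append(
    (store.active_model, len(store.search("a")))))
```

Every search sampled during the migration still runs against the complete `v1` table, which is the "no zero-result window" property. In a real database the flip would need to be a transactional update rather than a Python assignment.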
### Privacy & Security First
- **Self-hosted.** Deploy on-prem or in your private cloud via Docker.
- **No telemetry.** Outbound traffic goes only to the AI provider you choose.
- **Encrypted at rest.** API keys stored with Fernet encryption in PostgreSQL.
---
## Tech Stack
- **Backend:** FastAPI · PostgreSQL + pgvector · Redis (arq workers) · MinIO
- **Frontend:** Next.js · Tailwind CSS
- **AI integration:** Model Context Protocol via FastMCP
- **Document parsing:** PDF, DOCX, DOC, plain text, URLs, embedded images
---
## Server Requirements
Arkon runs **7 Docker containers** (PostgreSQL + pgvector, Redis, MinIO, FastAPI API, 2 ARQ workers, Next.js frontend). The table below provides recommended configurations based on team size:
| | **Starter** | **Team** | **Enterprise** |
|---|:---:|:---:|:---:|
| **Team size** | 1–20 | 20–100 | 100+ |
| **vCPU** | 2 cores | 4 cores | 8+ cores |
| **RAM** | 4 GB | 8 GB | 16+ GB |
| **Storage** | 40 GB SSD | 100 GB SSD | 250+ GB NVMe SSD |
| **OS** | Ubuntu 22.04+ / Debian 12+ | Ubuntu 22.04+ / Debian 12+ | Ubuntu 22.04+ / Debian 12+ |
| **Use case** | Evaluation / small teams | Departmental deployment | Organization-wide rollout |
> [!NOTE]
> - **RAM** is the primary bottleneck: the MRP pipeline workers load large LLM context windows into memory during wiki compilation.
> - **Storage** scales with your document corpus: pgvector indexes, MinIO file storage, and PostgreSQL WAL logs are the main consumers.
> - All AI inference happens externally (Anthropic / Google / OpenAI APIs), so **GPU is not required**.
> - A reverse proxy (Nginx / Caddy) with SSL is recommended for production. See [Setup Guide](docs/SETUP.md).
---
## Quick Start (Docker)
**Prerequisites:** Docker & Docker Compose, plus an API key from your preferred AI provider (Anthropic, Google, or OpenAI).
1. **Clone the repository:**
```shell
git clone https://github.com/nduckmink/arkon.git
cd arkon
```
2. **Configure environment:**
```shell
cp .env.docker.example .env.docker
# Edit .env.docker - set SECRET_KEY, admin credentials, and Postgres/MinIO secrets
```
3. **Launch:**
```shell
docker compose --env-file .env.docker up -d --build
```
4. Access the portal at `http://localhost:3119`, sign in as admin, then go to **Settings** to pick your embedding / LLM / vision models and paste API keys.
→ See [Setup Guide](docs/SETUP.md) for development mode and the full env reference.
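For the `SECRET_KEY` value in step 2, one way to produce a random string is Python's standard `secrets` module. This is a generic sketch: whether Arkon requires a particular key length or encoding is not stated here, so check the Setup Guide before relying on this format. `generate_secret` is a hypothetical helper name.

```python
import secrets


def generate_secret(num_bytes: int = 32) -> str:
    """Return a URL-safe random string built from num_bytes of OS randomness."""
    return secrets.token_urlsafe(num_bytes)


# Prints a line that can be pasted into .env.docker (format assumed, see lead-in).
print(f"SECRET_KEY={generate_secret()}")
```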
---
## Connecting Claude
After creating an employee account and generating an MCP token:
```json
{
  "mcpServers": {
    "arkon": {
      "url": "https://your-arkon-server/mcp",
      "headers": { "Authorization": "Bearer ark_xxxxxxxxxxxx" }
    }
  }
}
```
Drop it into `claude_desktop_config.json` and restart Claude Desktop. The employee's scoped knowledge is immediately available.
→ See [MCP & Claude](docs/MCP.md) for the complete tool reference.
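If `claude_desktop_config.json` already lists other MCP servers, the Arkon entry should be merged in rather than pasted over them. The helper below is a minimal sketch of that merge; `add_arkon_server` is a hypothetical name, the config path is passed in because its location differs by operating system, and the demo runs against a temporary file instead of a real Claude Desktop install.

```python
import json
import tempfile
from pathlib import Path


def add_arkon_server(config_path: Path, url: str, token: str) -> dict:
    """Add or replace the "arkon" entry while keeping any other configured servers."""
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    servers = config.setdefault("mcpServers", {})
    servers["arkon"] = {
        "url": url,
        "headers": {"Authorization": f"Bearer {token}"},
    }
    config_path.write_text(json.dumps(config, indent=2))
    return config


# Demo: a temporary config that already contains an unrelated, made-up server.
demo_path = Path(tempfile.mkdtemp()) / "claude_desktop_config.json"
demo_path.write_text(json.dumps({"mcpServers": {"other": {"command": "other-server"}}}))

updated = add_arkon_server(
    demo_path, "https://your-arkon-server/mcp", "ark_xxxxxxxxxxxx"
)
```

The placeholder URL and token are the ones from the snippet above; substitute the real server address and the employee's own MCP token.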
---
## Roadmap
- [x] **MRP Pipeline** - deterministic compilation with plan review, page merge, and resume-on-crash.
- [x] **MCP Server** - scoped wiki + source + draft tools.
- [x] **Workspaces** - department isolation + project scopes with RBAC.
- [x] **Wiki draft & revision workflow** - propose, review, approve, rollback.
- [x] **AI Skills** - versioned, department-scoped agent packages.
- [x] **Catalog-driven model selection** - LLM, embedding, and vision picked from a curated list with cost/context-window metadata.
- [x] **Online embedding migration** - atomic re-embed with no search downtime.
- [x] **Audit log** - privileged actions tracked.
- [ ] **Arkon CLI** - one-command setup for employees.
- [ ] **Notification system** - for draft reviews and plan approvals.
- [ ] **Usage analytics dashboard** - cost + adoption per department.
---
## License
Arkon is licensed under the [PolyForm Internal Use License 1.0.0](LICENSE). Free for internal business operations; you may not offer Arkon as a service to third parties.
For enterprise support or custom integrations, please contact the maintainers.
---
**Keywords:** *Enterprise AI, Model Context Protocol, MCP Server, Knowledge Management System, Self-hosted RAG, AI Knowledge Base, Claude MCP, LLM Context Management, Open Source Wiki.*