https://github.com/zebbern/knowledge-assistant
A lightweight, no-cost chat agent that answers questions from custom knowledge files. No vector databases, no embeddings, no complex setup: drop in your Markdown files and start chatting.
agent chat chat-bot free-ai-tools llm project-ai-chatbot
- Host: GitHub
- URL: https://github.com/zebbern/knowledge-assistant
- Owner: zebbern
- Created: 2026-01-20T16:51:37.000Z (5 months ago)
- Default Branch: main
- Last Pushed: 2026-01-20T19:45:18.000Z (5 months ago)
- Last Synced: 2026-01-21T03:58:05.515Z (5 months ago)
- Topics: agent, chat, chat-bot, free-ai-tools, llm, project-ai-chatbot
- Language: TypeScript
- Homepage: https://tryassistant.vercel.app/
- Size: 150 KB
- Stars: 1
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
## Knowledge-Assistant
#### A lightweight, free, file-based chat agent that answers questions from your custom knowledge files. No vector databases, no embeddings, no complex setup - just drop in your markdown files and start chatting.
Required environment variables: `OPENROUTER_API_KEY` and `NEXT_PUBLIC_SITE_URL`.


## Why Lightweight?
Unlike heavyweight RAG solutions that require:
- Vector databases (Pinecone, Weaviate, Chroma)
- Embedding models and preprocessing
- Complex chunking strategies
- Database hosting and maintenance
Knowledge-Assistant takes a simpler approach:
- **Just files** - Drop `.md` or `.txt` files in a folder
- **No preprocessing** - Files are read at request time
- **No database** - Context goes directly to the LLM
- **Free models** - Uses OpenRouter's free tier (Llama, Mimo, DeepSeek)
Perfect for: documentation sites, GitHub repos, personal knowledge bases, and projects where simplicity beats complexity.
### Key Features
- **Multiple Free Models** - Choose from Llama 3.3, Mimo, DeepSeek R1, Devstral, GLM
- **Temperature Control** - Adjust creativity (0 = precise, 2 = creative)
- **Custom System Prompts** - Add your own AI personality and instructions
- **Streaming Responses** - Real-time token-by-token output
- **Markdown Rendering** - Full markdown with syntax highlighting
- **Mermaid Diagrams** - Live diagram rendering in chat
- **Code Blocks** - Syntax-highlighted with copy functionality
- **Persistent Sessions** - Conversations stored in localStorage
- **Export Chat** - Download conversations as markdown
---
Knowledge-Assistant loads the `.md` and `.txt` files in the `content/` directory and passes them to the model as context with every request. Answers stay grounded in your own material without the overhead of vector databases or embeddings.
```mermaid
flowchart LR
subgraph Client
A[Chat Interface]
end
subgraph Server
B[Next.js API]
C[Knowledge Loader]
D[OpenRouter API]
end
subgraph Knowledge
E[content/*.md]
end
A -->|User Message| B
B --> C
C -->|Read Files| E
C -->|Context + Message| D
D -->|Streaming Response| B
B -->|SSE Stream| A
```
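The flow above amounts to placing the knowledge text in front of the conversation before the model is called. A minimal sketch, assuming the knowledge goes into a system message; the `buildMessages` helper and its prompt wording are hypothetical and not taken from the repository:

```typescript
// Hypothetical helper: names and prompt wording are illustrative assumptions.
type Role = "system" | "user" | "assistant";

interface ChatMessage {
  role: Role;
  content: string;
}

// Combine knowledge text, prior turns, and the new user message into one
// messages array in the OpenAI-style chat format.
function buildMessages(
  knowledge: string,
  history: ChatMessage[],
  userMessage: string,
): ChatMessage[] {
  const system: ChatMessage = {
    role: "system",
    content: `Answer using the knowledge below.\n\n${knowledge}`,
  };
  return [system, ...history, { role: "user", content: userMessage }];
}
```

Because the full knowledge text travels with every request, this approach suits knowledge bases that fit comfortably inside the model's context window.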
## Quick Start
### Prerequisites
- Node.js 18+
- OpenRouter API key ([get one free](https://openrouter.ai))
### Installation
```bash
git clone https://github.com/zebbern/knowledge-assistant.git
cd knowledge-assistant
npm install
cp .env.example .env.local
```
Edit `.env.local` with your API key:
```env
OPENROUTER_API_KEY=your_api_key_here
NEXT_PUBLIC_SITE_URL=http://localhost:3000
```
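A missing variable is easier to diagnose at startup than in the middle of a request. The sketch below shows one way to check both variables up front; `readRequiredEnv` is a hypothetical helper, and the repository may read these values differently:

```typescript
// Hypothetical startup check; not taken from the repository.
const REQUIRED_ENV = ["OPENROUTER_API_KEY", "NEXT_PUBLIC_SITE_URL"] as const;

type RequiredEnv = Record<(typeof REQUIRED_ENV)[number], string>;

// Return the required variables, or throw an error naming every missing one.
function readRequiredEnv(env: Record<string, string | undefined>): RequiredEnv {
  const missing = REQUIRED_ENV.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  return {
    OPENROUTER_API_KEY: env["OPENROUTER_API_KEY"] as string,
    NEXT_PUBLIC_SITE_URL: env["NEXT_PUBLIC_SITE_URL"] as string,
  };
}
```

On the server this would typically be called as `readRequiredEnv(process.env)`.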
### Running
```bash
npm run dev # Development
npm run build # Production build
npm start # Start production server
```
Access the application at `http://localhost:3000`
## Architecture
```mermaid
graph TB
subgraph Frontend
UI[React Chat UI]
MD[Markdown Renderer]
MM[Mermaid Component]
CB[Code Block Component]
end
subgraph API_Layer[API Layer]
STREAM[Stream Endpoint]
CHAT[Chat Endpoint]
end
subgraph Knowledge_System[Knowledge System]
LOADER[File Loader]
CONTENT[Content Files]
end
subgraph External
OR[OpenRouter API]
end
UI --> STREAM
STREAM --> LOADER
LOADER --> CONTENT
STREAM --> OR
OR --> UI
UI --> MD
MD --> MM
MD --> CB
```
## Knowledge System
Add knowledge by placing markdown files in the `content/` directory:
```
content/
  knowledge.md            # General domain knowledge
  n8n-workflow-guide.md   # n8n workflow JSON reference
  n8n-ai-nodes.md         # AI/LangChain node examples
  n8n-patterns.md         # Common workflow patterns
  mermaid-syntax.md       # Mermaid diagram reference
```
The AI reads all `.md` and `.txt` files at request time and uses them as context for responses.
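Reading the files at request time can be sketched as follows. This is an illustration under assumptions, not the repository's own loader: `loadKnowledge`, the file separator format, and the alphabetical ordering are all hypothetical choices.

```typescript
import { readdir, readFile } from "node:fs/promises";
import { extname, join } from "node:path";

// Hypothetical loader: the repository's own implementation may differ.
const KNOWLEDGE_EXTENSIONS = new Set([".md", ".txt"]);

// Read every .md and .txt file directly inside `dir` and join them into
// one context string, labelling each section with its file name.
async function loadKnowledge(dir: string): Promise<string> {
  const entries = await readdir(dir, { withFileTypes: true });
  const names = entries
    .filter((e) => e.isFile() && KNOWLEDGE_EXTENSIONS.has(extname(e.name).toLowerCase()))
    .map((e) => e.name)
    .sort(); // stable order keeps the prompt deterministic

  const sections = await Promise.all(
    names.map(async (name) => {
      const text = await readFile(join(dir, name), "utf8");
      return `--- ${name} ---\n${text.trim()}`;
    }),
  );
  return sections.join("\n\n");
}
```

Since nothing is cached, edits to a knowledge file take effect on the next request. The trade-off is a disk read per request.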
### Knowledge File Structure
Each knowledge file should be focused on a specific topic:
```markdown
# Topic Title
Brief overview of the topic.
## Section 1
Detailed information with examples.
## Section 2
Code examples in fenced blocks.
```
## Project Structure
```
knowledge-assistant/
  app/
    api/
      chat/
        route.ts          # Non-streaming endpoint
        stream/
          route.ts        # Streaming endpoint
    layout.tsx
    page.tsx
    globals.css
  components/
    chat.tsx              # Main chat interface
    mermaid.tsx           # Mermaid and code block rendering
    ui/                   # Shadcn UI components
  content/                # Knowledge files
  lib/
    utils.ts
```
## API Endpoints
### POST /api/chat/stream
Streaming chat endpoint using Server-Sent Events.
**Request:**
```json
{
"message": "User message",
"messages": [
{ "role": "user", "content": "Previous message" },
{ "role": "assistant", "content": "Previous response" }
]
}
```
**Response:** SSE stream with chunked content
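The payload format of each event is not specified here, so the sketch below relies only on standard SSE framing (events separated by a blank line, payload on `data:` lines) and hands each raw payload to a callback. `extractSseData` and `streamChat` are hypothetical names, and the code assumes `\n` line endings.

```typescript
// Split buffered SSE text into complete `data:` payloads plus the unfinished tail.
function extractSseData(buffer: string): { payloads: string[]; rest: string } {
  const events = buffer.split("\n\n");
  const rest = events.pop() ?? ""; // the last piece may be an incomplete event
  const payloads = events
    .map((event) =>
      event
        .split("\n")
        .filter((line) => line.startsWith("data:"))
        .map((line) => line.slice(5).replace(/^ /, "")) // drop one optional space
        .join("\n"),
    )
    .filter((data) => data.length > 0);
  return { payloads, rest };
}

// POST a message and invoke `onData` for every SSE payload as it arrives.
async function streamChat(
  message: string,
  onData: (payload: string) => void,
): Promise<void> {
  const res = await fetch("/api/chat/stream", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ message, messages: [] }),
  });
  if (!res.ok || !res.body) throw new Error(`Request failed: ${res.status}`);

  const reader = res.body.getReader();
  const decoder = new TextDecoder();
  let buffer = "";
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });
    const { payloads, rest } = extractSseData(buffer);
    buffer = rest; // keep the partial event for the next network chunk
    payloads.forEach(onData);
  }
}
```

`fetch` is used instead of the browser's `EventSource` because `EventSource` can only issue GET requests, while this endpoint expects a POST body.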
### POST /api/chat
Non-streaming chat endpoint.
**Request:** Same as streaming endpoint
**Response:**
```json
{
"response": "Complete AI response"
}
```
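Calling this endpoint is a plain JSON round trip. A minimal client sketch, using the request and response shapes shown above; the `askOnce` name and the injectable `fetchImpl` parameter are assumptions added to make the function easy to test:

```typescript
interface ChatMessage {
  role: "user" | "assistant";
  content: string;
}

// Send one message plus prior history and return the complete reply text.
async function askOnce(
  message: string,
  history: ChatMessage[] = [],
  fetchImpl: typeof fetch = fetch,
): Promise<string> {
  const res = await fetchImpl("/api/chat", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ message, messages: history }),
  });
  if (!res.ok) throw new Error(`Chat request failed with status ${res.status}`);

  const data = (await res.json()) as { response?: unknown };
  if (typeof data.response !== "string") {
    throw new Error("Unexpected response shape");
  }
  return data.response;
}
```

The relative URL works in the browser; a script running outside the app would need the full origin, for example `http://localhost:3000/api/chat`.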
## Configuration
### Model Selection
The default model can be changed in `app/api/chat/stream/route.ts`:
```typescript
model: "xiaomi/mimo-v2-flash:free", // or any OpenRouter model
```
### Token Limits
Adjust the maximum response length with `max_tokens`:
```typescript
max_tokens: 4096,
```
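Both settings are fields of the JSON body sent to OpenRouter's OpenAI-compatible chat completions endpoint. The sketch below shows where they fit. `buildOpenRouterRequest` is a hypothetical helper, and the `temperature` default of 0.7 is an assumption rather than the project's actual value.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ModelOptions {
  model?: string;
  maxTokens?: number;
  temperature?: number;
}

// Build the URL, headers, and body for a streaming chat completion request.
function buildOpenRouterRequest(
  apiKey: string,
  siteUrl: string,
  messages: ChatMessage[],
  options: ModelOptions = {},
): { url: string; headers: Record<string, string>; body: string } {
  const {
    model = "xiaomi/mimo-v2-flash:free",
    maxTokens = 4096,
    temperature = 0.7, // assumed default, not confirmed by the repository
  } = options;

  return {
    url: "https://openrouter.ai/api/v1/chat/completions",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
      "HTTP-Referer": siteUrl, // optional header identifying the calling site
    },
    body: JSON.stringify({
      model,
      messages,
      max_tokens: maxTokens,
      temperature,
      stream: true,
    }),
  };
}
```

The result can be passed to `fetch(url, { method: "POST", headers, body })`.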
## Data Flow
```mermaid
sequenceDiagram
participant User
participant Chat UI
participant API Route
participant Knowledge Loader
participant OpenRouter
User->>Chat UI: Send message
Chat UI->>API Route: POST /api/chat/stream
API Route->>Knowledge Loader: getKnowledge()
Knowledge Loader->>Knowledge Loader: Read content/*.md
Knowledge Loader-->>API Route: Knowledge context
API Route->>OpenRouter: Stream request
loop Streaming
OpenRouter-->>API Route: Token chunk
API Route-->>Chat UI: SSE event
Chat UI-->>User: Render token
end
```
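Inside the streaming loop, the API route has to pull the text out of each upstream chunk and re-emit it to the browser. A sketch of those two steps, assuming OpenRouter's OpenAI-compatible streaming chunks (`choices[0].delta.content`, ending with `[DONE]`); the helper names and the `{ content }` event shape sent to the client are hypothetical:

```typescript
// Assumed upstream chunk shape: OpenAI-compatible streaming deltas.
interface StreamChunk {
  choices?: { delta?: { content?: string | null } }[];
}

// Return the text carried by one upstream `data:` payload, or null when the
// payload is the end marker, is not valid JSON, or carries no content.
function extractToken(payload: string): string | null {
  if (payload === "[DONE]") return null;
  try {
    const chunk = JSON.parse(payload) as StreamChunk;
    return chunk.choices?.[0]?.delta?.content ?? null;
  } catch {
    return null; // ignore partial or non-JSON payloads
  }
}

// Wrap one token as an SSE event for the browser. The `{ content }` shape is
// an assumption, not the repository's confirmed wire format.
function toSseEvent(token: string): string {
  return `data: ${JSON.stringify({ content: token })}\n\n`;
}
```

JSON-encoding the token keeps embedded newlines from breaking the SSE framing, since a raw blank line would otherwise end the event early.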