{"id":49639383,"url":"https://github.com/obnoxiousmods/obbyai","last_synced_at":"2026-05-05T18:31:23.197Z","repository":{"id":349701242,"uuid":"1203512294","full_name":"obnoxiousmods/obbyai","owner":"obnoxiousmods","description":"Self-hosted local AI chat with multi-GPU routing, RAG, web search, and file intelligence. Powered by Ollama.","archived":false,"fork":false,"pushed_at":"2026-04-07T07:09:31.000Z","size":136,"stargazers_count":0,"open_issues_count":2,"forks_count":0,"subscribers_count":0,"default_branch":"master","last_synced_at":"2026-04-07T07:24:42.903Z","etag":null,"topics":["ai-chat","amd-gpu","arangodb","local-llm","ollama","python","rag","self-hosted","starlette","vulkan"],"latest_commit_sha":null,"homepage":"https://ai.obby.ca","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/obnoxiousmods.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":".github/CODEOWNERS","security":".github/SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-04-07T05:27:07.000Z","updated_at":"2026-04-07T07:09:37.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/obnoxiousmods/obbyai","commit_stats":null,"previous_names":["obnoxiousmods/obbyai"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/obnoxiousmods/obbyai","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/obnoxiousmods%2Fobbyai","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/obnoxiousmods%2Fobbyai/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/obnoxiousmods%2Fobbyai/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/obnoxiousmods%2Fobbyai/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/obnoxiousmods","download_url":"https://codeload.github.com/obnoxiousmods/obbyai/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/obnoxiousmods%2Fobbyai/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32662835,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-05T11:29:49.557Z","status":"ssl_error","status_checked_at":"2026-05-05T11:29:48.587Z","response_time":54,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai-chat","amd-gpu","arangodb","local-llm","ollama","python","rag","self-hosted","starlette","vulkan"],"created_at":"2026-05-05T18:31:22.127Z","updated_at":"2026-05-05T18:31:23.189Z","avatar_url":"https://github.com/obnoxiousmods.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cdiv align=\"center\"\u003e\n\n# ObbyAI\n\n**A fully local, self-hosted AI chat platform with multi-GPU routing, RAG, web search, and comprehensive file intelligence.**\n\n[![CI](https://github.com/obnoxiousmods/obbyai/actions/workflows/ci.yml/badge.svg)](https://github.com/obnoxiousmods/obbyai/actions/workflows/ci.yml)\n[![License: MIT](https://img.shields.io/badge/License-MIT-7c5cfc.svg)](LICENSE)\n[![Python](https://img.shields.io/badge/Python-3.12%2B-blue)](https://python.org)\n[![Ollama](https://img.shields.io/badge/Ollama-Vulkan%20%7C%20CUDA-orange)](https://ollama.com)\n[![ArangoDB](https://img.shields.io/badge/ArangoDB-3.12-green)](https://arangodb.com)\n\n[Features](#features) · [Quick Start](#quick-start) · [Architecture](#architecture) · [Models](#models) · [Configuration](#configuration) · [Contributing](#contributing)\n\n\u003c/div\u003e\n\n---\n\n## What is ObbyAI?\n\nObbyAI is a **production-quality, self-hosted AI chat interface** built on top of [Ollama](https://ollama.com). It runs entirely on your local hardware with no external API calls — except when you explicitly use the web search tool.\n\nKey design goals:\n- **Privacy-first**: all inference runs on your hardware, nothing leaves your network\n- **Multi-GPU**: route requests to different machines based on the task (e.g. vision to RTX 2080 Super, fast chat to RX 580)\n- **Tool-augmented**: real web search, persistent vector memory, exact math, and file intelligence built in\n- **Single-file frontend**: the entire UI is one `index.html` — zero build steps, zero Node.js in production\n\n---\n\n## Features\n\n### 🖥️ Multi-GPU Server Routing\nSwitch between GPU backends from the header:\n- **Local**: AMD RX 580 8GB (Vulkan) — fast 7–8B models\n- **Remote**: NVIDIA RTX 2080 Super 8GB (CUDA) — Gemma4 vision, larger models\n\nEach server has its own model list. Requests are proxied through the Python backend — no CORS, no direct browser-to-Ollama.\n\n### 🤖 10+ Curated Models\n\n| Model | Size | Tag | Best For |\n|---|---|---|---|\n| llama3.1:8b | 4.9 GB | General | Default — reasoning, coding, Q\u0026A |\n| qwen2.5:7b | 4.7 GB | General | Multilingual, strong reasoning |\n| qwen2.5-coder:7b | 4.7 GB | Coding | Code generation and debugging |\n| qwen3:8b | 5.2 GB | Thinking | Hybrid thinking/reasoning mode |\n| deepseek-r1:8b | 5.2 GB | Reasoning | Chain-of-thought problems |\n| mistral:7b-instruct | 4.4 GB | General | Structured tasks, fast responses |\n| mistral-nemo:12b | 7.1 GB | General | More capable, ~7GB VRAM |\n| gemma3:4b | 3.3 GB | Fast | Quick responses, efficient |\n| phi4-mini:3.8b | 2.5 GB | Fast | Microsoft Phi-4, punches above weight |\n| gemma4:latest | 9.6 GB | Vision | Multimodal image+text (RTX 2080S) |\n\n### 🛠️ Tool Calling\nAutomatic tool use for capable models (llama3.1, qwen2.5, qwen3, mistral, phi4-mini):\n\n| Tool | Trigger | Capability |\n|---|---|---|\n| 🔍 `web_search` | Current events, lookups, docs | DuckDuckGo, no API key |\n| 📚 `rag_search` | \"in my documents\", context recall | ArangoDB vector search |\n| 🧮 `calculator` | Any arithmetic | Safe AST eval, trig, log |\n| 🕐 `get_datetime` | Time/date queries | Timezone-aware |\n\nTool calls are streamed live — you see exactly what the model is searching for and what results it gets.\n\n### 📚 RAG — Retrieval-Augmented Generation\nPersistent knowledge base powered by [ArangoDB](https://arangodb.com) + [nomic-embed-text](https://ollama.com/library/nomic-embed-text) embeddings:\n- Automatic context injection: relevant chunks from your documents are silently prepended to every query\n- Conversation memory: past messages are embedded and retrievable across sessions\n- Document library: upload files and they stay in the knowledge base permanently\n\n### 📁 File Intelligence — 40+ Formats\n\n**Images** → Gemma4 vision model performs deep analysis:\n- Full OCR — extracts all visible text\n- Image description and context\n- Table/chart/diagram interpretation\n- Code and URL extraction\n\n**Documents:**\n\n| Format | Extraction Method |\n|---|---|\n| PDF | [pymupdf4llm](https://github.com/pymupdf/RAG) — layout-aware markdown |\n| DOCX | python-docx — headings, paragraphs, tables |\n| PPTX | python-pptx — per-slide text |\n| XLSX/ODS | openpyxl — all sheets as structured text |\n\n**Text \u0026 Code** (30+ extensions): direct UTF-8 extraction with encoding detection. Python, JavaScript, TypeScript, Rust, Go, Java, C/C++, SQL, YAML, TOML, JSON, Markdown, and more.\n\nAll extracted content is chunked, embedded, and stored in ArangoDB for future retrieval.\n\n### 🎭 Persona Library\n7 curated system prompts selectable from the toolbar:\n\n| Persona | Best For |\n|---|---|\n| ✨ Default | General knowledge, balanced |\n| ⚙️ Senior Developer | Architecture, code reviews, production-quality code |\n| 💻 Coding Assistant | Write/debug/review code |\n| 🔬 Research Assistant | Analysis, citations, web-sourced facts |\n| 📊 Data Analyst | Numbers, insights, business intelligence |\n| 🎨 Creative Partner | Writing, brainstorming, storytelling |\n| ⚡ Concise Mode | Shortest accurate answer, zero fluff |\n\n### 💬 Chat Features\n- **Multi-conversation sidebar** with search, rename, delete\n- **Persistent history** in localStorage with per-conversation session IDs\n- **Markdown rendering** with syntax-highlighted code (highlight.js, 100+ languages)\n- **Streaming tokens** with real-time render\n- **Regenerate / edit** user messages\n- **Export conversation** as Markdown\n- **Token stats** per message (tokens generated, tok/s)\n- **Drag \u0026 drop** files directly into chat\n- **Image attachment** for vision-capable models\n- **System prompt bar** — per-session custom instructions\n- **Parameter controls** — temperature, top-p, max tokens, context length\n- **Keyboard shortcuts**: `Ctrl+Enter` send, `Ctrl+K` new chat, `Esc` close modals\n\n---\n\n## Quick Start\n\n### Prerequisites\n\n| Requirement | Notes |\n|---|---|\n| Python 3.12+ | |\n| [uv](https://docs.astral.sh/uv/) | `curl -LsSf https://astral.sh/uv/install.sh \\| sh` |\n| [Ollama](https://ollama.com) | AMD: use `ollama-vulkan`; NVIDIA: standard install |\n| [ArangoDB](https://arangodb.com) 3.12 | For RAG features |\n\n### 1. Clone and install\n\n```bash\ngit clone https://github.com/obnoxiousmods/obbyai.git\ncd obbyai\nuv sync\n```\n\n### 2. Configure\n\n```bash\ncp .env.example .env\n# Edit .env: set ARANGO_PASS and OLLAMA_REMOTE_URL\n```\n\n### 3. Pull models\n\n```bash\n# Primary model\nollama pull llama3.1:8b\n\n# Required for RAG\nollama pull nomic-embed-text\n\n# Optional extras\nollama pull qwen2.5:7b qwen2.5-coder:7b deepseek-r1:8b\n```\n\n### 4. Run\n\n```bash\nuv run uvicorn main:app --host 0.0.0.0 --port 8091\n```\n\nOpen **http://localhost:8091**\n\n### Production (systemd)\n\n```ini\n[Unit]\nDescription=ObbyAI Chat Web UI\nAfter=network.target ollama.service arangodb3.service\n\n[Service]\nWorkingDirectory=/opt/ai-chat\nEnvironmentFile=/opt/ai-chat/.env\nExecStart=/home/user/.local/bin/uv run uvicorn main:app --host 127.0.0.1 --port 8091\nRestart=on-failure\nUser=youruser\n\n[Install]\nWantedBy=multi-user.target\n```\n\n### HTTPS with nginx\n\n```nginx\nserver {\n    listen 443 ssl;\n    server_name ai.yourdomain.com;\n\n    ssl_certificate     /path/to/cert.pem;\n    ssl_certificate_key /path/to/key.pem;\n\n    location / {\n        proxy_pass http://127.0.0.1:8091;\n        proxy_set_header Host $host;\n        proxy_buffering off;\n    }\n\n    location ~ ^/(api|v1)/ {\n        proxy_pass http://127.0.0.1:11434;\n    }\n}\n```\n\n---\n\n## Architecture\n\n```\nBrowser\n  │  HTTPS (nginx SSL termination)\n  ▼\nStarlette (port 8091)\n  ├── GET  /              → index.html (self-contained SPA)\n  ├── GET  /servers       → [{id, name, gpu, default_model}]\n  ├── GET  /models        → Ollama /api/tags (proxied per server)\n  ├── GET  /prompts       → persona list\n  ├── POST /upload        → file_processor → RAG ingest\n  ├── POST /ingest        → direct text → RAG ingest\n  ├── GET  /rag/stats     → document + session message counts\n  └── POST /chat          → tool loop → Ollama SSE → frontend SSE\n        │\n        ├── tools/web_search.py     DuckDuckGo\n        ├── tools/rag.py            ArangoDB + nomic-embed-text\n        ├── tools/calculator.py     AST eval\n        ├── tools/datetime_tool.py  zoneinfo\n        └── tools/file_processor.py\n              ├── Images   → Gemma4 @ remote:11434\n              ├── PDF      → pymupdf4llm\n              ├── DOCX     → python-docx\n              ├── PPTX     → python-pptx\n              ├── XLSX     → openpyxl\n              └── Text/Code → chardet + direct decode\n\nArangoDB (port 8529)\n  ├── rag_documents   (chunked text + nomic embeddings)\n  └── rag_sessions    (per-conversation message embeddings)\n\nOllama instances\n  ├── Local  127.0.0.1:11434   (RX 580 / Vulkan)\n  └── Remote 192.168.1.x:11434 (RTX 2080S / CUDA)\n```\n\n### Tool Call Flow\n\n```\nUser message\n    │\n    ├─ RAG search (if use_rag=true) → inject top-K chunks into context\n    │\n    └─ Ollama /api/chat (with tools=[web_search, rag_search, calculator, get_datetime])\n            │\n            ├─ Model streams tokens → forwarded to browser via SSE\n            │\n            └─ Model calls tool?\n                    │\n                    ├─ Execute tool\n                    ├─ Stream tool_call + tool_result events to browser\n                    └─ Re-submit with tool results → continue streaming\n```\n\n---\n\n## Configuration\n\nAll configuration is in `.env` (copy from `.env.example`):\n\n```env\nARANGO_URL=http://localhost:8529\nARANGO_USER=root\nARANGO_PASS=your_password\nARANGO_DB=ai_chat_rag\n\nOLLAMA_LOCAL_URL=http://127.0.0.1:11434\nOLLAMA_REMOTE_URL=http://192.168.1.x:11434\n\nEMBED_MODEL=nomic-embed-text\nVISION_MODEL=gemma4:latest\n```\n\n### AMD GPU Notes (RX 580 / Polaris)\n\nThe RX 580 2048SP (PCI ID `0x6fdf`) is a Chinese market variant not in ROCm's whitelist. Use Vulkan:\n\n```bash\n# Arch Linux\npacman -S ollama-vulkan   # NOT ollama (ROCm-only)\n\n# Verify GPU layers\nollama run llama3.1:8b \"hi\" 2\u003e\u00261 | grep -i gpu\n# Expected: \"offloaded 33/33 layers to GPU\"\n```\n\n---\n\n## Development\n\n```bash\nuv sync\nuv run uvicorn main:app --reload --port 8091\n\n# Lint\nuv run ruff check .\nuv run ruff format .\n\n# Check JS\npython3 -c \"\nimport re\nhtml = open('index.html').read()\nscripts = re.findall(r'\u003cscript\u003e(.*?)\u003c/script\u003e', html, re.DOTALL)\nopen('/tmp/check.js','w').write('\\n'.join(scripts))\n\"\nnode --check /tmp/check.js\n```\n\n---\n\n## Roadmap\n\n- [ ] Authentication (nginx basic auth / OAuth)\n- [ ] Docker Compose deployment\n- [ ] Code execution sandbox (Python REPL tool)\n- [ ] Voice input (Whisper via Ollama)\n- [ ] Image generation (Stable Diffusion / ComfyUI)\n- [ ] Conversation export to HTML\n- [ ] Mobile layout improvements\n- [ ] Scheduled automation tools\n\n---\n\n## Contributing\n\nSee [CONTRIBUTING.md](CONTRIBUTING.md). PRs welcome.\n\n---\n\n## License\n\n[MIT](LICENSE) © 2026 obnoxiousmods\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fobnoxiousmods%2Fobbyai","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fobnoxiousmods%2Fobbyai","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fobnoxiousmods%2Fobbyai/lists"}