{"id":36398357,"url":"https://github.com/HKUDS/DeepTutor","last_synced_at":"2026-01-18T09:01:23.600Z","repository":{"id":330863615,"uuid":"1124219907","full_name":"HKUDS/DeepTutor","owner":"HKUDS","description":"\"DeepTutor: AI-Powered Personalized Learning Assistant\"","archived":false,"fork":false,"pushed_at":"2026-01-13T16:33:34.000Z","size":60284,"stargazers_count":8301,"open_issues_count":24,"forks_count":1076,"subscribers_count":73,"default_branch":"main","last_synced_at":"2026-01-13T19:59:03.302Z","etag":null,"topics":["ai-agents","ai-tutor","deepresearch","idea-generation","interactive-learning","knowledge-graph","large-language-models","multi-agent-systems","rag"],"latest_commit_sha":null,"homepage":"https://hkuds.github.io/DeepTutor","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"agpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/HKUDS.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":"docs/roadmap.md","authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-12-28T15:35:54.000Z","updated_at":"2026-01-13T19:37:22.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/HKUDS/DeepTutor","commit_stats":null,"previous_names":["hkuds/opentutor"],"tags_count":4,"template":false,"template_full_name":null,"purl":"pkg:github/HKUDS/DeepTutor","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/HKUDS%2FDeepTutor","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/HKUDS%2FDeepTutor/tags","releases_url":"https://repos.ecosyste.
ms/api/v1/hosts/GitHub/repositories/HKUDS%2FDeepTutor/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/HKUDS%2FDeepTutor/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/HKUDS","download_url":"https://codeload.github.com/HKUDS/DeepTutor/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/HKUDS%2FDeepTutor/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28534154,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-18T00:39:45.795Z","status":"online","status_checked_at":"2026-01-18T02:00:07.578Z","response_time":98,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai-agents","ai-tutor","deepresearch","idea-generation","interactive-learning","knowledge-graph","large-language-models","multi-agent-systems","rag"],"created_at":"2026-01-11T16:00:32.183Z","updated_at":"2026-01-18T09:01:23.581Z","avatar_url":"https://github.com/HKUDS.png","language":"Python","readme":"\u003cdiv align=\"center\"\u003e\n\n\u003cimg src=\"assets/logo-ver2.png\" alt=\"DeepTutor Logo\" width=\"150\" style=\"border-radius: 15px;\"\u003e\n\n# DeepTutor: AI-Powered Personalized Learning 
Assistant\n\n[![Python](https://img.shields.io/badge/Python-3.10%2B-3776AB?style=flat-square\u0026logo=python\u0026logoColor=white)](https://www.python.org/downloads/)\n[![FastAPI](https://img.shields.io/badge/FastAPI-0.100%2B-009688?style=flat-square\u0026logo=fastapi\u0026logoColor=white)](https://fastapi.tiangolo.com/)\n[![React](https://img.shields.io/badge/React-19-61DAFB?style=flat-square\u0026logo=react\u0026logoColor=black)](https://react.dev/)\n[![Next.js](https://img.shields.io/badge/Next.js-16-000000?style=flat-square\u0026logo=next.js\u0026logoColor=white)](https://nextjs.org/)\n[![TailwindCSS](https://img.shields.io/badge/Tailwind-3.4-06B6D4?style=flat-square\u0026logo=tailwindcss\u0026logoColor=white)](https://tailwindcss.com/)\n[![License](https://img.shields.io/badge/License-AGPL--3.0-blue?style=flat-square)](LICENSE)\n\n\u003cp align=\"center\"\u003e\n  \u003ca href=\"https://discord.gg/zpP9cssj\"\u003e\u003cimg src=\"https://img.shields.io/badge/Discord-Join_Community-5865F2?style=for-the-badge\u0026logo=discord\u0026logoColor=white\" alt=\"Discord\"\u003e\u003c/a\u003e\n  \u0026nbsp;\u0026nbsp;\n  \u003ca href=\"./Communication.md\"\u003e\u003cimg src=\"https://img.shields.io/badge/Feishu-Join_Group-00D4AA?style=for-the-badge\u0026logo=feishu\u0026logoColor=white\" alt=\"Feishu\"\u003e\u003c/a\u003e\n  \u0026nbsp;\u0026nbsp;\n  \u003ca href=\"https://github.com/HKUDS/DeepTutor/issues/78\"\u003e\u003cimg src=\"https://img.shields.io/badge/WeChat-Join_Group-07C160?style=for-the-badge\u0026logo=wechat\u0026logoColor=white\" alt=\"WeChat\"\u003e\u003c/a\u003e\n\u003c/p\u003e\n\n\n\n[**Quick Start**](#quick-start) · [**Core Modules**](#core-modules) · [**FAQ**](#faq)\n\n[🇨🇳 中文](assets/README/README_CN.md) · [🇯🇵 日本語](assets/README/README_JA.md) · [🇪🇸 Español](assets/README/README_ES.md) · [🇫🇷 Français](assets/README/README_FR.md) · [🇸🇦 العربية](assets/README/README_AR.md) · [🇷🇺 Русский](assets/README/README_RU.md) · [🇮🇳 
हिन्दी](assets/README/README_HI.md) · [🇵🇹 Português](assets/README/README_PT.md)\n\n\u003c/div\u003e\n\n\u003cdiv align=\"center\"\u003e\n\n📚 **Massive Document Knowledge Q\u0026A** \u0026nbsp;•\u0026nbsp; 🎨 **Interactive Learning Visualization**\u003cbr\u003e\n🎯 **Knowledge Reinforcement** \u0026nbsp;•\u0026nbsp; 🔍 **Deep Research \u0026 Idea Generation**\n\n\u003c/div\u003e\n\n---\n### 📰 News\n\n\u003e **[2026.1.1]** Join our [Discord Community](https://discord.gg/zpP9cssj) and [GitHub Discussions](https://github.com/HKUDS/DeepTutor/discussions) - shape the future of DeepTutor! 💬\n\n\u003e **[2025.12.30]** Visit our [Official Website](https://hkuds.github.io/DeepTutor/) for more details!\n\n\u003e **[2025.12.29]** DeepTutor is now live! ✨\n\n### 📦 Releases\n\n\u003e **[2026.1.9]** Release [v0.4.1](https://github.com/HKUDS/DeepTutor/releases/tag/v0.4.1) with LLM Provider system overhaul, Question Generation robustness improvements, and codebase cleanup - Thanks to all the contributors!\n\u003cdetails\u003e\n\u003csummary\u003eRelease history\u003c/summary\u003e\n\n\u003e **[2026.1.9]** Release [v0.4.0](https://github.com/HKUDS/DeepTutor/releases/tag/v0.4.0) with a new code structure and support for multiple LLMs \u0026 embedding models - Thanks to all the contributors!\n\n\u003e **[2026.1.5]** [v0.3.0](https://github.com/HKUDS/DeepTutor/releases/tag/v0.3.0) - Unified PromptManager architecture, CI/CD automation \u0026 pre-built Docker images on GHCR\n\n\u003e **[2026.1.2]** [v0.2.0](https://github.com/HKUDS/DeepTutor/releases/tag/v0.2.0) - Docker deployment, Next.js 16 \u0026 React 19 upgrade, WebSocket security \u0026 critical vulnerability fixes\n\n\u003c/details\u003e\n\n---\n\n## Key Features of DeepTutor\n\n### 📚 Massive Document Knowledge Q\u0026A\n• **Smart Knowledge Base**: Upload textbooks, research papers, technical manuals, and domain-specific documents. 
Build a comprehensive AI-powered knowledge repository for instant access.\u003cbr\u003e\n• **Multi-Agent Problem Solving**: Dual-loop reasoning architecture with RAG, web search, and code execution -- delivering step-by-step solutions with precise citations.\n\n### 🎨 Interactive Learning Visualization\n• **Knowledge Simplification \u0026 Explanations**: Transform complex concepts, knowledge, and algorithms into easy-to-understand visual aids, detailed step-by-step breakdowns, and engaging interactive demonstrations.\u003cbr\u003e\n• **Personalized Q\u0026A**: Context-aware conversations that adapt to your learning progress, with interactive pages and session-based knowledge tracking.\n\n### 🎯 Knowledge Reinforcement with Practice Exercise Generator\n• **Intelligent Exercise Creation**: Generate targeted quizzes, practice problems, and customized assessments tailored to your current knowledge level and specific learning objectives.\u003cbr\u003e\n• **Authentic Exam Simulation**: Upload reference exams to generate practice questions that perfectly match the original style, format, and difficulty—giving you realistic preparation for the actual test.\n\n### 🔍 Deep Research \u0026 Idea Generation\n• **Comprehensive Research \u0026 Literature Review**: Conduct in-depth topic exploration with systematic analysis. Identify patterns, connect related concepts across disciplines, and synthesize existing research findings.\u003cbr\u003e\n• **Novel Insight Discovery**: Generate structured learning materials and uncover knowledge gaps. 
Identify promising new research directions through intelligent cross-domain knowledge synthesis.\n\n---\n\n\u003cdiv align=\"center\"\u003e\n  \u003cimg src=\"assets/figs/title_gradient.svg\" alt=\"All-in-One Tutoring System\" width=\"70%\"\u003e\n\u003c/div\u003e\n\n\u003c!-- ━━━━━━━━━━━━━━━━ Core Learning Experience ━━━━━━━━━━━━━━━━ --\u003e\n\n\u003ctable\u003e\n\u003ctr\u003e\n\u003ctd width=\"50%\" align=\"center\" valign=\"top\"\u003e\n\n\u003ch3\u003e📚 Massive Document Knowledge Q\u0026A\u003c/h3\u003e\n\u003ca href=\"#problem-solving-agent\"\u003e\n\u003cimg src=\"assets/gifs/solve.gif\" width=\"100%\"\u003e\n\u003c/a\u003e\n\u003cbr\u003e\n\u003csub\u003eMulti-agent Problem Solving with Exact Citations\u003c/sub\u003e\n\n\u003c/td\u003e\n\u003ctd width=\"50%\" align=\"center\" valign=\"top\"\u003e\n\n\u003ch3\u003e🎨 Interactive Learning Visualization\u003c/h3\u003e\n\u003ca href=\"#guided-learning\"\u003e\n\u003cimg src=\"assets/gifs/guided-learning.gif\" width=\"100%\"\u003e\n\u003c/a\u003e\n\u003cbr\u003e\n\u003csub\u003eStep-by-step Visual Explanations with Personalized Q\u0026A\u003c/sub\u003e\n\n\u003c/td\u003e\n\u003c/tr\u003e\n\u003c/table\u003e\n\n\u003c!-- ━━━━━━━━━━━━━━━━ Practice \u0026 Reinforcement ━━━━━━━━━━━━━━━━ --\u003e\n\n\u003ch3 align=\"center\"\u003e🎯 Knowledge Reinforcement\u003c/h3\u003e\n\n\u003ctable\u003e\n\u003ctr\u003e\n\u003ctd width=\"50%\" valign=\"top\" align=\"center\"\u003e\n\n\u003ca href=\"#question-generator\"\u003e\n\u003cimg src=\"assets/gifs/question-1.gif\" width=\"100%\"\u003e\n\u003c/a\u003e\n\n**Custom Questions**  \n\u003csub\u003eAuto-Validated Practice Question Generation\u003c/sub\u003e\n\n\u003c/td\u003e\n\u003ctd width=\"50%\" valign=\"top\" align=\"center\"\u003e\n\n\u003ca href=\"#question-generator\"\u003e\n\u003cimg src=\"assets/gifs/question-2.gif\" width=\"100%\"\u003e\n\u003c/a\u003e\n\n**Mimic Questions**  \n\u003csub\u003eClone Exam Style for Authentic 
Practice\u003c/sub\u003e\n\n\u003c/td\u003e\n\u003c/tr\u003e\n\u003c/table\u003e\n\n\u003c!-- ━━━━━━━━━━━━━━━━ Research \u0026 Creation ━━━━━━━━━━━━━━━━ --\u003e\n\n\u003ch3 align=\"center\"\u003e🔍 Deep Research \u0026 Idea Generation\u003c/h3\u003e\n\n\u003ctable\u003e\n\u003ctr\u003e\n\u003ctd width=\"33%\" align=\"center\"\u003e\n\n\u003ca href=\"#deep-research\"\u003e\n\u003cimg src=\"assets/gifs/deepresearch.gif\" width=\"100%\"\u003e\n\u003c/a\u003e\n\n**Deep Research**  \n\u003csub\u003eKnowledge Extension from Textbooks with RAG, Web, and Paper Search\u003c/sub\u003e\n\n\u003c/td\u003e\n\u003ctd width=\"33%\" align=\"center\"\u003e\n\n\u003ca href=\"#idea-generation\"\u003e\n\u003cimg src=\"assets/gifs/ideagen.gif\" width=\"100%\"\u003e\n\u003c/a\u003e\n\n**Automated IdeaGen**  \n\u003csub\u003eBrainstorming and Concept Synthesis with Dual-filter Workflow\u003c/sub\u003e\n\n\u003c/td\u003e\n\u003ctd width=\"33%\" align=\"center\"\u003e\n\n\u003ca href=\"#co-writer\"\u003e\n\u003cimg src=\"assets/gifs/co-writer.gif\" width=\"100%\"\u003e\n\u003c/a\u003e\n\n**Interactive IdeaGen**  \n\u003csub\u003eRAG and Web-search Powered Co-writer with Podcast Generation\u003c/sub\u003e\n\n\u003c/td\u003e\n\u003c/tr\u003e\n\u003c/table\u003e\n\n\u003c!-- ━━━━━━━━━━━━━━━━ Knowledge Infrastructure ━━━━━━━━━━━━━━━━ --\u003e\n\n\u003ch3 align=\"center\"\u003e🏗️ All-in-One Knowledge System\u003c/h3\u003e\n\n\u003ctable\u003e\n\u003ctr\u003e\n\u003ctd width=\"50%\" align=\"center\"\u003e\n\n\u003ca href=\"#dashboard--knowledge-base-management\"\u003e\n\u003cimg src=\"assets/gifs/knowledge_bases.png\" width=\"100%\"\u003e\n\u003c/a\u003e\n\n**Personal Knowledge Base**  \n\u003csub\u003eBuild and Organize Your Own Knowledge Repository\u003c/sub\u003e\n\n\u003c/td\u003e\n\u003ctd width=\"50%\" align=\"center\"\u003e\n\n\u003ca href=\"#notebook\"\u003e\n\u003cimg src=\"assets/gifs/notebooks.png\" width=\"100%\"\u003e\n\u003c/a\u003e\n\n**Personal Notebook**  \n\u003csub\u003eYour 
Contextual Memory for Learning Sessions\u003c/sub\u003e\n\n\u003c/td\u003e\n\u003c/tr\u003e\n\u003c/table\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003csub\u003e🌙 Use DeepTutor in \u003cb\u003eDark Mode\u003c/b\u003e!\u003c/sub\u003e\n\u003c/p\u003e\n\n---\n\n## 🏛️ DeepTutor's Framework\n\n\u003cdiv align=\"center\"\u003e\n\u003cimg src=\"assets/figs/full-pipe.png\" alt=\"DeepTutor Full-Stack Workflow\" width=\"100%\"\u003e\n\u003c/div\u003e\n\n### 💬 User Interface Layer\n• **Intuitive Interaction**: Simple bidirectional query-response flow for natural user interaction.\u003cbr\u003e\n• **Structured Output**: Structured response generation that organizes complex information into actionable outputs.\n\n### 🤖 Intelligent Agent Modules\n• **Problem Solving \u0026 Assessment**: Step-by-step problem solving and custom assessment generation.\u003cbr\u003e\n• **Research \u0026 Learning**: Deep Research for topic exploration and Guided Learning with visualization.\u003cbr\u003e\n• **Idea Generation**: Automated and interactive concept development with multi-source insights.\n\n### 🔧 Tool Integration Layer\n• **Information Retrieval**: RAG hybrid retrieval, real-time web search, and academic paper databases.\u003cbr\u003e\n• **Processing \u0026 Analysis**: Python code execution, query item lookup, and PDF parsing for document analysis.\n\n### 🧠 Knowledge \u0026 Memory Foundation\n• **Knowledge Graph**: Entity-relation mapping for semantic connections and knowledge discovery.\u003cbr\u003e\n• **Vector Store**: Embedding-based semantic search for intelligent content retrieval.\u003cbr\u003e\n• **Memory System**: Session state management and citation tracking for contextual continuity.\n\n## 📋 Todo\n\u003e 🌟 Star to follow our future updates!\n- [x] Support More RAG Pipelines\n- [x] Database Robustness and Visualization\n- [ ] Personalized Interaction with Notebook\n\n## 🚀 Getting Started\n\n### Step 1: Pre-Configuration\n\n**① Clone Repository**\n\n```bash\ngit clone 
https://github.com/HKUDS/DeepTutor.git\ncd DeepTutor\n```\n\n**② Set Up Environment Variables**\n\n```bash\ncp .env.example .env\n# Edit .env file with your API keys\n```\n\n\u003cdetails\u003e\n\u003csummary\u003e📋 \u003cb\u003eEnvironment Variables Reference\u003c/b\u003e\u003c/summary\u003e\n\n| Variable | Required | Description |\n|:---|:---:|:---|\n| `LLM_MODEL` | **Yes** | Model name (e.g., `gpt-4o`) |\n| `LLM_API_KEY` | **Yes** | Your LLM API key |\n| `LLM_HOST` | **Yes** | API endpoint URL |\n| `EMBEDDING_MODEL` | **Yes** | Embedding model name |\n| `EMBEDDING_API_KEY` | **Yes** | Embedding API key |\n| `EMBEDDING_HOST` | **Yes** | Embedding API endpoint |\n| `BACKEND_PORT` | No | Backend port (default: `8001`) |\n| `FRONTEND_PORT` | No | Frontend port (default: `3782`) |\n| `NEXT_PUBLIC_API_BASE` | No | **Frontend API URL** - Set this for remote/LAN access (e.g., `http://192.168.1.100:8001`) |\n| `TTS_*` | No | Text-to-Speech settings |\n| `SEARCH_PROVIDER` | No | Search provider (options: `perplexity`, `baidu`, default: `perplexity`) |\n| `PERPLEXITY_API_KEY` | No | For Perplexity web search |\n| `BAIDU_API_KEY` | No | For Baidu AI search |\n\n\u003e 💡 **Remote Access**: If accessing from another device (e.g., `192.168.31.66:3782`), add to `.env`:\n\u003e ```bash\n\u003e NEXT_PUBLIC_API_BASE=http://192.168.31.66:8001\n\u003e ```\n\n\u003c/details\u003e\n\n**③ Configure Ports \u0026 LLM** *(Optional)*\n\n- **Ports**: Set in `.env` file → `BACKEND_PORT` / `FRONTEND_PORT` (defaults: 8001/3782)\n- **LLM**: Edit `config/agents.yaml` → `temperature` / `max_tokens` per module\n- See [Configuration Docs](config/README.md) for details\n\n**④ Try Demo Knowledge Bases** *(Optional)*\n\n\u003cdetails\u003e\n\u003csummary\u003e📚 \u003cb\u003eAvailable Demos\u003c/b\u003e\u003c/summary\u003e\n\n- **Research Papers** — 5 papers from our lab ([AI-Researcher](https://github.com/HKUDS/AI-Researcher), [LightRAG](https://github.com/HKUDS/LightRAG), etc.)\n- **Data Science 
Textbook** — 8 chapters, 296 pages ([Book Link](https://ma-lab-berkeley.github.io/deep-representation-learning-book/))\n\n\u003c/details\u003e\n\n1. Download from [Google Drive](https://drive.google.com/drive/folders/1iWwfZXiTuQKQqUYb5fGDZjLCeTUP6DA6?usp=sharing)\n2. Extract into `data/` directory\n\n\u003e Demo KBs use `text-embedding-3-large` with `dimensions = 3072`\n\n**⑤ Create Your Own Knowledge Base** *(After Launch)*\n\n1. Go to http://localhost:3782/knowledge\n2. Click \"New Knowledge Base\" → Enter name → Upload PDF/TXT/MD files\n3. Monitor progress in terminal\n\n---\n\n### Step 2: Choose Your Installation Method\n\n#### 🐳 Option A: Docker Deployment\n\n\u003e No Python/Node.js setup required\n\n**Prerequisites**: [Docker](https://docs.docker.com/get-docker/) \u0026 [Docker Compose](https://docs.docker.com/compose/install/)\n\n**Quick Start** — Build from source:\n\n```bash\ndocker compose up --build -d    # Build and start (~5-10 min first run)\ndocker compose logs -f          # View logs\n```\n\n**Or use pre-built image** (faster):\n\n```bash\n# Linux/macOS (AMD64)\ndocker run -d --name deeptutor \\\n  -p 8001:8001 -p 3782:3782 \\\n  --env-file .env \\\n  -v $(pwd)/data:/app/data \\\n  -v $(pwd)/config:/app/config:ro \\\n  ghcr.io/hkuds/deeptutor:latest\n\n# Apple Silicon (ARM64): use ghcr.io/hkuds/deeptutor:latest-arm64\n# Windows PowerShell: use ${PWD} instead of $(pwd)\n```\n\n**Common Commands**:\n\n```bash\ndocker compose up -d      # Start\ndocker compose down       # Stop\ndocker compose logs -f    # View logs\ndocker compose up --build # Rebuild after changes\n```\n\n\u003cdetails\u003e\n\u003csummary\u003e📋 \u003cb\u003eMore Docker Options\u003c/b\u003e (Pre-built images, Cloud deployment, Custom ports)\u003c/summary\u003e\n\n**Pre-built Image Architecture Reference:**\n\n| Architecture | Image Tag | Use Case |\n|:-------------|:----------|:---------|\n| **AMD64** | `ghcr.io/hkuds/deeptutor:latest` | Intel/AMD (most servers, Windows/Linux PCs) 
|\n| **ARM64** | `ghcr.io/hkuds/deeptutor:latest-arm64` | Apple Silicon, AWS Graviton, Raspberry Pi |\n\n\u003e 💡 Run `uname -m` to check: `x86_64` = AMD64, `arm64`/`aarch64` = ARM64\n\n**Cloud Deployment** — Must set external API URL:\n\n```bash\ndocker run -d --name deeptutor \\\n  -p 8001:8001 -p 3782:3782 \\\n  -e NEXT_PUBLIC_API_BASE_EXTERNAL=https://your-server.com:8001 \\\n  --env-file .env \\\n  -v $(pwd)/data:/app/data \\\n  ghcr.io/hkuds/deeptutor:latest\n```\n\n**Custom Ports Example:**\n\n```bash\ndocker run -d --name deeptutor \\\n  -p 9001:9001 -p 3000:3000 \\\n  -e BACKEND_PORT=9001 \\\n  -e FRONTEND_PORT=3000 \\\n  -e NEXT_PUBLIC_API_BASE_EXTERNAL=https://your-server.com:9001 \\\n  --env-file .env \\\n  -v $(pwd)/data:/app/data \\\n  ghcr.io/hkuds/deeptutor:latest\n```\n\n\u003c/details\u003e\n\n---\n\n#### 💻 Option B: Manual Installation\n\n\u003e For development or non-Docker environments\n\n**Prerequisites**: Python 3.10+, Node.js 18+\n\n**1. Set Up Environment**:\n\n```bash\n# Using conda (Recommended)\nconda create -n deeptutor python=3.10 \u0026\u0026 conda activate deeptutor\n\n# Or using venv\npython -m venv venv \u0026\u0026 source venv/bin/activate  # Windows: venv\\Scripts\\activate\n```\n\n**2. Install Dependencies**:\n\n```bash\npip install -r requirements.txt\nnpm install --prefix web\n```\n\n**3. 
Launch**:\n\n```bash\npython scripts/start_web.py    # Start frontend + backend\n# Or: python scripts/start.py  # CLI only\n# Stop: Ctrl+C\n```\n\n\u003cdetails\u003e\n\u003csummary\u003e🔧 \u003cb\u003eStart Frontend \u0026 Backend Separately\u003c/b\u003e\u003c/summary\u003e\n\n**Backend** (FastAPI):\n```bash\npython src/api/run_server.py\n# Or: uvicorn src.api.main:app --host 0.0.0.0 --port 8001 --reload\n```\n\n**Frontend** (Next.js):\n```bash\ncd web \u0026\u0026 npm install \u0026\u0026 npm run dev -- -p 3782\n```\n\n**Note**: Create `web/.env.local`:\n```\nNEXT_PUBLIC_API_BASE=http://localhost:8001\n```\n\n| Service | Default Port |\n|:---:|:---:|\n| Backend | `8001` |\n| Frontend | `3782` |\n\n\u003c/details\u003e\n\n### Access URLs\n\n| Service | URL | Description |\n|:---:|:---|:---|\n| **Frontend** | http://localhost:3782 | Main web interface |\n| **API Docs** | http://localhost:8001/docs | Interactive API documentation |\n\n---\n\n## 📂 Data Storage\n\nAll user content and system data are stored in the `data/` directory:\n\n```\ndata/\n├── knowledge_bases/              # Knowledge base storage\n└── user/                         # User activity data\n    ├── solve/                    # Problem solving results and artifacts\n    ├── question/                 # Generated questions\n    ├── research/                 # Research reports and cache\n    ├── co-writer/                # Interactive IdeaGen documents and audio files\n    ├── notebook/                 # Notebook records and metadata\n    ├── guide/                    # Guided learning sessions\n    ├── logs/                     # System logs\n    └── run_code_workspace/       # Code execution workspace\n```\n\nResults are automatically saved during all activities. 
Directories are created automatically as needed.\n\n## 📦 Core Modules\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003e🧠 Smart Solver\u003c/b\u003e\u003c/summary\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003eArchitecture Diagram\u003c/b\u003e\u003c/summary\u003e\n\n![Smart Solver Architecture](assets/figs/solve.png)\n\n\u003c/details\u003e\n\n\u003e **Intelligent problem-solving system** based on **Analysis Loop + Solve Loop** dual-loop architecture, supporting multi-mode reasoning and dynamic knowledge retrieval.\n\n**Core Features**\n\n| Feature | Description |\n|:---:|:---|\n| Dual-Loop Architecture | **Analysis Loop**: InvestigateAgent → NoteAgent\u003cbr\u003e**Solve Loop**: PlanAgent → ManagerAgent → SolveAgent → CheckAgent → Format |\n| Multi-Agent Collaboration | Specialized agents: InvestigateAgent, NoteAgent, PlanAgent, ManagerAgent, SolveAgent, CheckAgent |\n| Real-time Streaming | WebSocket transmission with live reasoning process display |\n| Tool Integration | RAG (naive/hybrid), Web Search, Query Item, Code Execution |\n| Persistent Memory | JSON-based memory files for context preservation |\n| Citation Management | Structured citations with reference tracking |\n\n**Usage**\n\n1. Visit http://localhost:{frontend_port}/solver\n2. Select a knowledge base\n3. Enter your question, click \"Solve\"\n4. 
Watch the real-time reasoning process and final answer\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003ePython API\u003c/b\u003e\u003c/summary\u003e\n\n```python\nimport asyncio\nfrom src.agents.solve import MainSolver\n\nasync def main():\n    solver = MainSolver(kb_name=\"ai_textbook\")\n    result = await solver.solve(\n        question=\"Calculate the linear convolution of x=[1,2,3] and h=[4,5]\",\n        mode=\"auto\"\n    )\n    print(result['formatted_solution'])\n\nasyncio.run(main())\n```\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003eOutput Location\u003c/b\u003e\u003c/summary\u003e\n\n```\ndata/user/solve/solve_YYYYMMDD_HHMMSS/\n├── investigate_memory.json    # Analysis Loop memory\n├── solve_chain.json           # Solve Loop steps \u0026 tool records\n├── citation_memory.json       # Citation management\n├── final_answer.md            # Final solution (Markdown)\n├── performance_report.json    # Performance monitoring\n└── artifacts/                 # Code execution outputs\n```\n\n\u003c/details\u003e\n\n\u003c/details\u003e\n\n---\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003e📝 Question Generator\u003c/b\u003e\u003c/summary\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003eArchitecture Diagram\u003c/b\u003e\u003c/summary\u003e\n\n![Question Generator Architecture](assets/figs/question-gen.png)\n\n\u003c/details\u003e\n\n\u003e **Dual-mode question generation system** supporting **custom knowledge-based generation** and **reference exam paper mimicking** with automatic validation.\n\n**Core Features**\n\n| Feature | Description |\n|:---:|:---|\n| Custom Mode | **Background Knowledge** → **Question Planning** → **Generation** → **Single-Pass Validation**\u003cbr\u003eAnalyzes question relevance without rejection logic |\n| Mimic Mode | **PDF Upload** → **MinerU Parsing** → **Question Extraction** → **Style Mimicking**\u003cbr\u003eGenerates questions based on reference exam structure |\n| 
ReAct Engine | QuestionGenerationAgent with autonomous decision-making (think → act → observe) |\n| Validation Analysis | Single-pass relevance analysis with `kb_coverage` and `extension_points` |\n| Question Types | Multiple choice, fill-in-the-blank, calculation, written response, etc. |\n| Batch Generation | Parallel processing with progress tracking |\n| Complete Persistence | All intermediate files saved (background knowledge, plan, individual results) |\n| Timestamped Output | Mimic mode creates batch folders: `mimic_YYYYMMDD_HHMMSS_{pdf_name}/` |\n\n**Usage**\n\n**Custom Mode:**\n1. Visit http://localhost:{frontend_port}/question\n2. Fill in requirements (topic, difficulty, question type, count)\n3. Click \"Generate Questions\"\n4. View generated questions with validation reports\n\n**Mimic Mode:**\n1. Visit http://localhost:{frontend_port}/question\n2. Switch to \"Mimic Exam\" tab\n3. Upload PDF or provide parsed exam directory\n4. Wait for parsing → extraction → generation\n5. View generated questions alongside original references\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003ePython API\u003c/b\u003e\u003c/summary\u003e\n\n**Custom Mode - Full Pipeline:**\n```python\nimport asyncio\nfrom src.agents.question import AgentCoordinator\n\nasync def main():\n    coordinator = AgentCoordinator(\n        kb_name=\"ai_textbook\",\n        output_dir=\"data/user/question\"\n    )\n\n    # Generate multiple questions from text requirement\n    result = await coordinator.generate_questions_custom(\n        requirement_text=\"Generate 3 medium-difficulty questions about deep learning basics\",\n        difficulty=\"medium\",\n        question_type=\"choice\",\n        count=3\n    )\n\n    print(f\"✅ Generated {result['completed']}/{result['requested']} questions\")\n    for q in result['results']:\n        print(f\"- Relevance: {q['validation']['relevance']}\")\n\nasyncio.run(main())\n```\n\n**Mimic Mode - PDF Upload:**\n```python\nfrom 
src.agents.question.tools.exam_mimic import mimic_exam_questions\nimport asyncio\n\nasync def main():\n    result = await mimic_exam_questions(\n        pdf_path=\"exams/midterm.pdf\",\n        kb_name=\"calculus\",\n        output_dir=\"data/user/question/mimic_papers\",\n        max_questions=5\n    )\n\n    print(f\"✅ Generated {result['successful_generations']} questions\")\n    print(f\"Output: {result['output_file']}\")\n\nasyncio.run(main())\n```\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003eOutput Location\u003c/b\u003e\u003c/summary\u003e\n\n**Custom Mode:**\n```\ndata/user/question/custom_YYYYMMDD_HHMMSS/\n├── background_knowledge.json      # RAG retrieval results\n├── question_plan.json              # Question planning\n├── question_1_result.json          # Individual question results\n├── question_2_result.json\n└── ...\n```\n\n**Mimic Mode:**\n```\ndata/user/question/mimic_papers/\n└── mimic_YYYYMMDD_HHMMSS_{pdf_name}/\n    ├── {pdf_name}.pdf                              # Original PDF\n    ├── auto/{pdf_name}.md                          # MinerU parsed markdown\n    ├── {pdf_name}_YYYYMMDD_HHMMSS_questions.json  # Extracted questions\n    └── {pdf_name}_YYYYMMDD_HHMMSS_generated_questions.json  # Generated questions\n```\n\n\u003c/details\u003e\n\n\u003c/details\u003e\n\n---\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003e🎓 Guided Learning\u003c/b\u003e\u003c/summary\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003eArchitecture Diagram\u003c/b\u003e\u003c/summary\u003e\n\n![Guided Learning Architecture](assets/figs/guide.png)\n\n\u003c/details\u003e\n\n\u003e **Personalized learning system** based on notebook content, automatically generating progressive learning paths through interactive pages and smart Q\u0026A.\n\n**Core Features**\n\n| Feature | Description |\n|:---:|:---|\n| Multi-Agent Architecture | **LocateAgent**: Identifies 3-5 progressive knowledge points\u003cbr\u003e**InteractiveAgent**: Converts to visual HTML pages\u003cbr\u003e**ChatAgent**: Provides contextual 
Q&A<br>**SummaryAgent**: Generates learning summaries |
| Smart Knowledge Location | Automatic analysis of notebook content |
| Interactive Pages | HTML page generation with bug fixing |
| Smart Q&A | Context-aware answers with explanations |
| Progress Tracking | Real-time status with session persistence |
| Cross-Notebook Support | Select records from multiple notebooks |

**Usage Flow**

1. **Select Notebook(s)** — Choose one or multiple notebooks (cross-notebook selection supported)
2. **Generate Learning Plan** — LocateAgent identifies 3-5 core knowledge points
3. **Start Learning** — InteractiveAgent generates HTML visualization
4. **Learning Interaction** — Ask questions, click "Next" to proceed
5. **Complete Learning** — SummaryAgent generates a learning summary

<details>
<summary><b>Output Location</b></summary>

```
data/user/guide/
└── session_{session_id}.json    # Complete session state, knowledge points, chat history
```

</details>

</details>

---

<details>
<summary><b>✏️ Interactive IdeaGen (Co-Writer)</b></summary>

<details>
<summary><b>Architecture Diagram</b></summary>

![Interactive IdeaGen Architecture](assets/figs/co-writer.png)

</details>

> **Intelligent Markdown editor** supporting AI-assisted writing, auto-annotation, and TTS narration.

**Core Features**

| Feature | Description |
|:---:|:---|
| Rich Text Editing | Full Markdown syntax support with live preview |
| EditAgent | **Rewrite**: Custom instructions with optional RAG/web context<br>**Shorten**: Compress while preserving key information<br>**Expand**: Add details and context |
| Auto-Annotation | Automatic key content identification and marking |
| NarratorAgent | Script generation, TTS audio, multiple voices (Cherry, Stella, Annie, Cally, Eva, Bella) |
| Context Enhancement | Optional RAG or web search for additional context |
| Multi-Format Export | Markdown, PDF, etc. |

**Usage**

1. Visit http://localhost:{frontend_port}/co_writer
2. Enter or paste text in the editor
3. Use AI features: Rewrite, Shorten, Expand, Auto Mark, Narrate
4. Export to Markdown or PDF

<details>
<summary><b>Output Location</b></summary>

```
data/user/co-writer/
├── audio/                    # TTS audio files
│   └── {operation_id}.mp3
├── tool_calls/               # Tool call history
│   └── {operation_id}_{tool_type}.json
└── history.json              # Edit history
```

</details>

</details>

---

<details>
<summary><b>🔬 Deep Research</b></summary>

<details>
<summary><b>Architecture Diagram</b></summary>

![Deep Research Architecture](assets/figs/deepresearch.png)

</details>

> **DR-in-KG** (Deep Research in Knowledge Graph) — A systematic deep research system based on a **Dynamic Topic Queue** architecture, enabling multi-agent collaboration across three phases: **Planning → Researching → Reporting**.

**Core Features**

| Feature | Description |
|:---:|:---|
| Three-Phase Architecture | **Phase 1 (Planning)**: RephraseAgent (topic optimization) + DecomposeAgent (subtopic decomposition)<br>**Phase 2 (Researching)**: ManagerAgent (queue scheduling) + ResearchAgent (research decisions) + NoteAgent (info compression)<br>**Phase 3 (Reporting)**: Deduplication → Three-level outline generation → Report writing with citations |
| Dynamic Topic Queue | Core scheduling system with TopicBlock state management: `PENDING → RESEARCHING → COMPLETED/FAILED`. Supports dynamic topic discovery during research |
| Execution Modes | **Series Mode**: Sequential topic processing<br>**Parallel Mode**: Concurrent multi-topic processing with `AsyncCitationManagerWrapper` for thread-safe operations |
| Multi-Tool Integration | **RAG** (hybrid/naive), **Query Item** (entity lookup), **Paper Search**, **Web Search**, **Code Execution** — dynamically selected by ResearchAgent |
| Unified Citation System | Centralized CitationManager as the single source of truth for citation ID generation, ref_number mapping, and deduplication |
| Preset Configurations | **quick**: Fast research (1-2 subtopics, 1-2 iterations)<br>**medium/standard**: Balanced depth (5 subtopics, 4 iterations)<br>**deep**: Thorough research (8 subtopics, 7 iterations)<br>**auto**: Agent autonomously decides depth |

**Citation System Architecture**

The citation system follows a centralized design with CitationManager as the single source of truth:

```
┌─────────────────────────────────────────────────────────────────┐
│                      CitationManager                            │
│  ┌─────────────────┐  ┌─────────────────┐  ┌─────────────────┐  │
│  │  ID Generation  │  │  ref_number Map │  │   Deduplication │  │
│  │  PLAN-XX        │  │  citation_id →  │  │   (papers only) │  │
│  │  CIT-X-XX       │  │  ref_number     │  │                 │  │
│  └────────┬────────┘  └────────┬────────┘  └────────┬────────┘  │
└───────────┼────────────────────┼────────────────────┼───────────┘
            │                    │                    │
     ┌──────┴───────┐     ┌──────┴───────┐     ┌──────┴─────┐
     │DecomposeAgent│     │ReportingAgent│     │ References │
     │ResearchAgent │     │ (inline [N]) │     │  Section   │
     │  NoteAgent   │     └──────────────┘     └────────────┘
     └──────────────┘
```
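As a rough sketch of this scheme (the class and helper below are illustrative, not the repository's actual API), the two ID counters, the sorted 1-based ref_number map, and the `[N]` post-processing step might look like:

```python
import re

class CitationManagerSketch:
    """Illustrative counters for PLAN-XX / CIT-X-XX identifiers (hypothetical)."""

    def __init__(self):
        self.plan_counter = 0
        self.block_counters = {}   # block number -> counter
        self.citations = {}        # citation_id -> citation info

    def next_plan_id(self):
        # Planning-stage RAG queries get PLAN-XX identifiers
        self.plan_counter += 1
        return f"PLAN-{self.plan_counter:02d}"

    def next_block_id(self, block):
        # Research-stage tool calls get CIT-X-XX, X = topic block number
        n = self.block_counters.get(block, 0) + 1
        self.block_counters[block] = n
        return f"CIT-{block}-{n:02d}"

    def add_citation(self, citation_id, info):
        self.citations[citation_id] = info

    def build_ref_number_map(self):
        # Sequential 1-based ref_numbers built from sorted citation IDs
        return {cid: i + 1 for i, cid in enumerate(sorted(self.citations))}

def link_inline_citations(text, ref_numbers):
    """Post-process plain [N] citations into clickable [[N]](#ref-N) links,
    dropping any [N] that does not map to a known reference."""
    valid = set(ref_numbers.values())

    def repl(match):
        n = int(match.group(1))
        return f"[[{n}]](#ref-{n})" if n in valid else ""

    return re.sub(r"\[(\d+)\]", repl, text)
```

The paper-level deduplication step is omitted here; the point is only that a single object owns both ID generation and the ID-to-ref_number mapping.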
| Component | Description |
|:---:|:---|
| ID Format | **PLAN-XX** (planning stage RAG queries) + **CIT-X-XX** (research stage, X=block number) |
| ref_number Mapping | Sequential 1-based numbers built from sorted citation IDs, with paper deduplication |
| Inline Citations | Simple `[N]` format in LLM output, post-processed to clickable `[[N]](#ref-N)` links |
| Citation Table | Clear reference table provided to the LLM: `Cite as [1] → (RAG) query preview...` |
| Post-processing | Automatic format conversion + validation to remove invalid citation references |
| Parallel Safety | Thread-safe async methods (`get_next_citation_id_async`, `add_citation_async`) for concurrent execution |

**Parallel Execution Architecture**

When `execution_mode: "parallel"` is enabled, multiple topic blocks are researched concurrently:

```
┌─────────────────────────────────────────────────────────────────────────┐
│                    Parallel Research Execution                          │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                         │
│   DynamicTopicQueue                    AsyncCitationManagerWrapper      │
│   ┌─────────────────┐                  ┌─────────────────────────┐      │
│   │Topic 1 (PENDING)│ ──┐              │  Thread-safe wrapper    │      │
│   │Topic 2 (PENDING)│ ──┼──→ asyncio   │  for CitationManager    │      │
│   │Topic 3 (PENDING)│ ──┤   Semaphore  │                         │      │
│   │Topic 4 (PENDING)│ ──┤   (max=5)    │  • get_next_citation_   │      │
│   │Topic 5 (PENDING)│ ──┘              │    id_async()           │      │
│   └─────────────────┘                  │  • add_citation_async() │      │
│            │                           └───────────┬─────────────┘      │
│            ▼                                       │                    │
│   ┌──────────────────────────────────────────────────────────────┐      │
│   │                Concurrent ResearchAgent Tasks                │      │
│   │  ┌─────────┐  ┌─────────┐  ┌─────────┐  ┌─────────┐          │      │
│   │  │ Task 1  │  │ Task 2  │  │ Task 3  │  │ Task 4  │  ...     │      │
│   │  │(Topic 1)│  │(Topic 2)│  │(Topic 3)│  │(Topic 4)│          │      │
│   │  └────┬────┘  └────┬────┘  └────┬────┘  └────┬────┘          │      │
│   │       │            │            │            │               │      │
│   │       └────────────┴────────────┴────────────┘               │      │
│   │                          │                                   │      │
│   │                          ▼                                   │      │
│   │              AsyncManagerAgentWrapper                        │      │
│   │              (Thread-safe queue updates)                     │      │
│   └──────────────────────────────────────────────────────────────┘      │
│                                                                         │
└─────────────────────────────────────────────────────────────────────────┘
```

| Component | Description |
|:---:|:---|
| `asyncio.Semaphore` | Limits concurrent tasks to `max_parallel_topics` (default: 5) |
| `AsyncCitationManagerWrapper` | Wraps CitationManager with `asyncio.Lock()` for thread-safe ID generation |
| `AsyncManagerAgentWrapper` | Ensures queue state updates are atomic across parallel tasks |
| Real-time Progress | Live display of all active research tasks with status indicators |
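The semaphore-plus-lock pattern above can be sketched in a few lines. This is a minimal illustration, not the repository's implementation; the function names and the dict-based counter are stand-ins for the real queue and citation objects:

```python
import asyncio

async def research_topic(topic, sem, lock, counter):
    """Stand-in for one topic block's research task."""
    async with sem:                        # at most max_parallel_topics run at once
        async with lock:                   # citation IDs must be handed out atomically
            counter["value"] += 1
            citation_id = f"CIT-{topic}-{counter['value']:02d}"
        await asyncio.sleep(0)             # stand-in for the actual tool calls
        return citation_id

async def run_parallel(topics, max_parallel_topics=5):
    sem = asyncio.Semaphore(max_parallel_topics)
    lock = asyncio.Lock()                  # the role AsyncCitationManagerWrapper plays
    counter = {"value": 0}
    tasks = [research_topic(t, sem, lock, counter) for t in topics]
    return await asyncio.gather(*tasks)

results = asyncio.run(run_parallel([1, 2, 3]))
```

The semaphore bounds concurrency while the lock serializes only the tiny critical section that mutates shared state, so tool calls themselves still overlap.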
**Agent Responsibilities**

| Agent | Phase | Responsibility |
|:---:|:---:|:---|
| RephraseAgent | Planning | Optimizes the user's input topic, supports multi-turn user interaction for refinement |
| DecomposeAgent | Planning | Decomposes topic into subtopics with RAG context, obtains citation IDs from CitationManager |
| ManagerAgent | Researching | Queue state management, task scheduling, dynamic topic addition |
| ResearchAgent | Researching | Knowledge sufficiency check, query planning, tool selection, requests citation IDs before each tool call |
| NoteAgent | Researching | Compresses raw tool outputs into summaries, creates ToolTraces with pre-assigned citation IDs |
| ReportingAgent | Reporting | Builds citation map, generates three-level outline, writes report sections with citation tables, post-processes citations |

**Report Generation Pipeline**

```
1. Build Citation Map     →  CitationManager.build_ref_number_map()
2. Generate Outline       →  Three-level headings (H1 → H2 → H3)
3. Write Sections         →  LLM uses [N] citations with provided citation table
4. Post-process           →  Convert [N] → [[N]](#ref-N), validate references
5. Generate References    →  Academic-style entries with collapsible source details
```

**Usage**

1. Visit http://localhost:{frontend_port}/research
2. Enter a research topic
3. Select a research mode (quick/medium/deep/auto)
4. Watch real-time progress with parallel/series execution
5. View the structured report with clickable inline citations
6. Export as Markdown or PDF (with proper page splitting and Mermaid diagram support)

<details>
<summary><b>CLI</b></summary>

```bash
# Quick mode (fast research)
python src/agents/research/main.py --topic "Deep Learning Basics" --preset quick

# Medium mode (balanced)
python src/agents/research/main.py --topic "Transformer Architecture" --preset medium

# Deep mode (thorough research)
python src/agents/research/main.py --topic "Graph Neural Networks" --preset deep

# Auto mode (agent decides depth)
python src/agents/research/main.py --topic "Reinforcement Learning" --preset auto
```

</details>

<details>
<summary><b>Python API</b></summary>

```python
import asyncio
from src.agents.research import ResearchPipeline
from src.core.core import get_llm_config, load_config_with_main

async def main():
    # Load configuration (main.yaml merged with any module-specific overrides)
    config = load_config_with_main("research_config.yaml")
    llm_config = get_llm_config()

    # Create pipeline (agent parameters loaded from agents.yaml automatically)
    pipeline = ResearchPipeline(
        config=config,
        api_key=llm_config["api_key"],
        base_url=llm_config["base_url"],
        kb_name="ai_textbook"  # Optional: override knowledge base
    )

    # Run research
    result = await pipeline.run(topic="Attention Mechanisms in Deep Learning")
    print(f"Report saved to: {result['final_report_path']}")

asyncio.run(main())
```

</details>

<details>
<summary><b>Output Location</b></summary>

```
data/user/research/
├── reports/                          # Final research reports
│   ├── research_YYYYMMDD_HHMMSS.md   # Markdown report with clickable citations [[N]](#ref-N)
│   └── research_*_metadata.json      # Research metadata and statistics
└── cache/                            # Research process cache
    └── research_YYYYMMDD_HHMMSS/
        ├── queue.json                # DynamicTopicQueue state (TopicBlocks + ToolTraces)
        ├── citations.json            # Citation registry with ID counters and ref_number mapping
        │                             #   - citations: {citation_id: citation_info}
        │                             #   - counters: {plan_counter, block_counters}
        ├── step1_planning.json       # Planning phase results (subtopics + PLAN-XX citations)
        ├── planning_progress.json    # Planning progress events
        ├── researching_progress.json # Researching progress events
        ├── reporting_progress.json   # Reporting progress events
        ├── outline.json              # Three-level report outline structure
        └── token_cost_summary.json   # Token usage statistics
```

**Citation File Structure** (`citations.json`):
```json
{
  "research_id": "research_20241209_120000",
  "citations": {
    "PLAN-01": {"citation_id": "PLAN-01", "tool_type": "rag_hybrid", "query": "...", "summary": "..."},
    "CIT-1-01": {"citation_id": "CIT-1-01", "tool_type": "paper_search", "papers": [...], ...}
  },
  "counters": {
    "plan_counter": 2,
    "block_counters": {"1": 3, "2": 2}
  }
}
```

</details>

<details>
<summary><b>Configuration Options</b></summary>

Key configuration in `config/main.yaml` (research section) and `config/agents.yaml`:

```yaml
# config/agents.yaml - Agent LLM parameters
research:
  temperature: 0.5
  max_tokens: 12000

# config/main.yaml - Research settings
research:
  # Execution Mode
  researching:
    execution_mode: "parallel"    # "series" or "parallel"
    max_parallel_topics: 5        # Max concurrent topics
    max_iterations: 5             # Max iterations per topic

    # Tool Switches
    enable_rag_hybrid: true       # Hybrid RAG retrieval
    enable_rag_naive: true        # Basic RAG retrieval
    enable_paper_search: true     # Academic paper search
    enable_web_search: true       # Web search (also controlled by tools.web_search.enabled)
    enable_run_code: true         # Code execution

  # Queue Limits
  queue:
    max_length: 5                 # Maximum topics in queue

  # Reporting
  reporting:
    enable_inline_citations: true # Enable clickable [N] citations in report

  # Presets: quick, medium, deep, auto

# Global tool switches in tools section
tools:
  web_search:
    enabled: true                 # Global web search switch (higher priority)
```
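One plausible reading of "higher priority" above is that the global switch and the per-module flag combine as a logical AND: web search runs only when both are on. A minimal sketch under that assumption (the helper below is illustrative, not a function in the repository):

```python
# Illustrative only: how the two web-search switches could combine.
# Assumes the global tools.web_search.enabled switch gates the
# per-module research flag, so both must be true for web search to run.
def web_search_enabled(config: dict) -> bool:
    globally_on = config.get("tools", {}).get("web_search", {}).get("enabled", False)
    module_on = (config.get("research", {})
                 .get("researching", {})
                 .get("enable_web_search", False))
    return globally_on and module_on
```

In particular, setting `tools.web_search.enabled: false` would disable web search everywhere regardless of the research-level flag.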

</details>

</details>

---

<details>
<summary><b>💡 Automated IdeaGen</b></summary>

<details>
<summary><b>Architecture Diagram</b></summary>

![Automated IdeaGen Architecture](assets/figs/ideagen.png)

</details>

> **Research idea generation system** that extracts knowledge points from notebook records and generates research ideas through multi-stage filtering.

**Core Features**

| Feature | Description |
|:---:|:---|
| MaterialOrganizerAgent | Extracts knowledge points from notebook records |
| Multi-Stage Filtering | **Loose Filter** → **Explore Ideas** (5+ per point) → **Strict Filter** → **Generate Markdown** |
| Idea Exploration | Innovative thinking from multiple dimensions |
| Structured Output | Organized markdown with knowledge points and ideas |
| Progress Callbacks | Real-time updates for each stage |

**Usage**

1. Visit http://localhost:{frontend_port}/ideagen
2. Select a notebook with records
3. Optionally provide user thoughts/preferences
4. Click "Generate Ideas"
5. View generated research ideas organized by knowledge points

<details>
<summary><b>Python API</b></summary>

```python
import asyncio
from src.agents.ideagen import IdeaGenerationWorkflow, MaterialOrganizerAgent
from src.core.core import get_llm_config

async def main():
    llm_config = get_llm_config()

    # Step 1: Extract knowledge points from materials
    organizer = MaterialOrganizerAgent(
        api_key=llm_config["api_key"],
        base_url=llm_config["base_url"]
    )
    knowledge_points = await organizer.extract_knowledge_points(
        "Your learning materials or notebook content here"
    )

    # Step 2: Generate research ideas
    workflow = IdeaGenerationWorkflow(
        api_key=llm_config["api_key"],
        base_url=llm_config["base_url"]
    )
    result = await workflow.process(knowledge_points)
    print(result)  # Markdown formatted research ideas

asyncio.run(main())
```

</details>

</details>

---

<details>
<summary><b>📊 Dashboard + Knowledge Base Management</b></summary>

> **Unified system entry** providing activity tracking, knowledge base management, and system status monitoring.

**Key Features**

| Feature | Description |
|:---:|:---|
| Activity Statistics | Recent solving/generation/research records |
| Knowledge Base Overview | KB list, statistics, incremental updates |
| Notebook Statistics | Notebook counts, record distribution |
| Quick Actions | One-click access to all modules |

**Usage**

- **Web Interface**: Visit http://localhost:{frontend_port} to view the system overview
- **Create KB**: Click "New Knowledge Base", upload PDF/Markdown documents
- **View Activity**: Check recent learning activities on the Dashboard

</details>

---

<details>
<summary><b>📓 Notebook</b></summary>

> **Unified learning record management**, connecting outputs from all modules to create a personalized learning knowledge base.

**Core Features**

| Feature | Description |
|:---:|:---|
| Multi-Notebook Management | Create, edit, delete notebooks |
| Unified Record Storage | Integrate solving/generation/research/Interactive IdeaGen records |
| Categorization Tags | Auto-categorize by type, knowledge base |
| Custom Appearance | Color, icon personalization |

**Usage**

1. Visit http://localhost:{frontend_port}/notebook
2. Create a new notebook (set name, description, color, icon)
3. After completing tasks in other modules, click "Add to Notebook"
4. View and manage all records on the notebook page

</details>

---

### 📖 Module Documentation

<table>
<tr>
<td align="center"><a href="config/README.md">Configuration</a></td>
<td align="center"><a href="data/README.md">Data Directory</a></td>
<td align="center"><a href="src/api/README.md">API Backend</a></td>
<td align="center"><a href="src/core/README.md">Core Utilities</a></td>
</tr>
<tr>
<td align="center"><a href="src/knowledge/README.md">Knowledge Base</a></td>
<td align="center"><a href="src/tools/README.md">Tools</a></td>
<td align="center"><a href="web/README.md">Web Frontend</a></td>
<td align="center"><a href="src/agents/solve/README.md">Solve Module</a></td>
</tr>
<tr>
<td align="center"><a href="src/agents/question/README.md">Question Module</a></td>
<td align="center"><a href="src/agents/research/README.md">Research Module</a></td>
<td align="center"><a href="src/agents/co_writer/README.md">Interactive IdeaGen Module</a></td>
<td align="center"><a href="src/agents/guide/README.md">Guide Module</a></td>
</tr>
<tr>
<td align="center" colspan="4"><a href="src/agents/ideagen/README.md">Automated IdeaGen Module</a></td>
</tr>
</table>

## ❓ FAQ

<details>
<summary><b>Backend fails to start?</b></summary>

**Checklist**
- Confirm Python version >= 3.10
- Confirm all dependencies are installed: `pip install -r requirements.txt`
- Check whether port 8001 is in use
- Check the `.env` file configuration

**Solutions**
- **Change port**: Set `BACKEND_PORT=9001` in the `.env` file
- **Check logs**: Review terminal error messages

</details>

<details>
<summary><b>Port occupied after Ctrl+C?</b></summary>

**Problem**

After pressing Ctrl+C during a running task (e.g., deep research), restarting shows a "port already in use" error.

**Cause**

Ctrl+C sometimes only terminates the frontend process while the backend continues running in the background.

**Solution**

```bash
# macOS/Linux: Find and kill the process
lsof -i :8001
kill -9 <PID>

# Windows: Find and kill the process
netstat -ano | findstr :8001
taskkill /PID <PID> /F
```

Then restart the service with `python scripts/start_web.py`.

</details>

<details>
<summary><b>npm: command not found error?</b></summary>

**Problem**

Running `scripts/start_web.py` shows `npm: command not found` or exit status 127.

**Checklist**
- Check if npm is installed: `npm --version`
- Check if Node.js is installed: `node --version`
- Confirm the conda environment is activated (if using conda)

**Solutions**
```bash
# Option A: Using Conda (Recommended)
conda install -c conda-forge nodejs

# Option B: Using the Official Installer
# Download from https://nodejs.org/

# Option C: Using nvm
nvm install 18
nvm use 18
```

**Verify Installation**
```bash
node --version  # Should show v18.x.x or higher
npm --version   # Should show a version number
```

</details>

<details>
<summary><b>Frontend cannot connect to backend?</b></summary>

**Checklist**
- Confirm the backend is running (visit http://localhost:8001/docs)
- Check the browser console for error messages

**Solution**

Create `.env.local` in the `web` directory:

```bash
NEXT_PUBLIC_API_BASE=http://localhost:8001
```

</details>

<details>
<summary><b>Docker: Frontend cannot connect in cloud deployment?</b></summary>

**Problem**

When deploying to a cloud server, the frontend shows connection errors like "Failed to fetch" or "NEXT_PUBLIC_API_BASE is not configured".

**Cause**

The default API URL is `localhost:8001`, which points to the user's local machine in the browser, not your server.

**Solution**

Set the `NEXT_PUBLIC_API_BASE_EXTERNAL` environment variable to your server's public URL:

```bash
# Using docker run
docker run -d --name deeptutor \
  -e NEXT_PUBLIC_API_BASE_EXTERNAL=https://your-server.com:8001 \
  ... other options ...
  ghcr.io/hkuds/deeptutor:latest

# Or in .env file
NEXT_PUBLIC_API_BASE_EXTERNAL=https://your-server.com:8001
```

**Custom Port Example:**
```bash
# If using backend port 9001
-e BACKEND_PORT=9001 \
-e NEXT_PUBLIC_API_BASE_EXTERNAL=https://your-server.com:9001
```

</details>

<details>
<summary><b>Docker: How to use custom ports?</b></summary>

**Solution**

Set both the port environment variables AND the port mappings:

```bash
docker run -d --name deeptutor \
  -p 9001:9001 -p 4000:4000 \
  -e BACKEND_PORT=9001 \
  -e FRONTEND_PORT=4000 \
  -e NEXT_PUBLIC_API_BASE_EXTERNAL=http://localhost:9001 \
  ... other env vars ...
  ghcr.io/hkuds/deeptutor:latest
```

**Important**: The `-p` port mappings must match the `BACKEND_PORT`/`FRONTEND_PORT` values.

</details>

<details>
<summary><b>WebSocket connection fails?</b></summary>

**Checklist**
- Confirm the backend is running
- Check firewall settings
- Confirm the WebSocket URL is correct

**Solution**
- **Check backend logs**
- **Confirm URL format**: `ws://localhost:8001/api/v1/...`

</details>

<details>
<summary><b>Where are module outputs stored?</b></summary>

| Module | Output Path |
|:---:|:---|
| Solve | `data/user/solve/solve_YYYYMMDD_HHMMSS/` |
| Question | `data/user/question/question_YYYYMMDD_HHMMSS/` |
| Research | `data/user/research/reports/` |
| Interactive IdeaGen | `data/user/co-writer/` |
| Notebook | `data/user/notebook/` |
| Guide | `data/user/guide/session_{session_id}.json` |
| Logs | `data/user/logs/` |

</details>

<details>
<summary><b>How to add a new knowledge base?</b></summary>

**Web Interface**
1. Visit http://localhost:{frontend_port}/knowledge
2. Click "New Knowledge Base"
3. Enter a knowledge base name
4. Upload PDF/TXT/MD documents
5. The system will process the documents in the background

**CLI**
```bash
python -m src.knowledge.start_kb init <kb_name> --docs <pdf_path>
```

</details>

<details>
<summary><b>How to incrementally add documents to an existing KB?</b></summary>

**CLI (Recommended)**
```bash
python -m src.knowledge.add_documents <kb_name> --docs <new_document.pdf>
```

**Benefits**
- Only processes new documents, saving time and API costs
- Automatically merges with the existing knowledge graph
- Preserves all existing data

</details>

<details>
<summary><b>Numbered items extraction failed with a uvloop.Loop error?</b></summary>

**Problem**

When initializing a knowledge base, you may encounter this error:
```
ValueError: Can't patch loop of type <class 'uvloop.Loop'>
```

This occurs because Uvicorn uses the `uvloop` event loop by default, which is incompatible with `nest_asyncio`.

**Solution**

Use one of the following methods to extract numbered items:

```bash
# Option 1: Using the shell script (recommended)
./scripts/extract_numbered_items.sh <kb_name>

# Option 2: Direct Python command
python src/knowledge/extract_numbered_items.py --kb <kb_name> --base-dir ./data/knowledge_bases
```

This will extract numbered items (Definitions, Theorems, Equations, etc.) from your knowledge base without reinitializing it.

</details>

</div>

## ⭐ Star History

<div align="center">

<p>
  <a href="https://github.com/HKUDS/DeepTutor/stargazers"><img src="assets/roster/stargazers.svg" alt="Stargazers"/></a>
  &nbsp;&nbsp;
  <a href="https://github.com/HKUDS/DeepTutor/network/members"><img src="assets/roster/forkers.svg" alt="Forkers"/></a>
</p>

<a href="https://www.star-history.com/#HKUDS/DeepTutor&type=timeline&legend=top-left">
  <picture>
    <source media="(prefers-color-scheme: dark)" srcset="https://api.star-history.com/svg?repos=HKUDS/DeepTutor&type=timeline&theme=dark&legend=top-left" />
    <source media="(prefers-color-scheme: light)" srcset="https://api.star-history.com/svg?repos=HKUDS/DeepTutor&type=timeline&legend=top-left" />
    <img alt="Star History Chart" src="https://api.star-history.com/svg?repos=HKUDS/DeepTutor&type=timeline&legend=top-left" />
  </picture>
</a>

</div>

## 🤝 Contribution

<div align="center">

We hope DeepTutor becomes a gift to the community. 🎁

<a href="https://github.com/HKUDS/DeepTutor/graphs/contributors">
  <img src="https://contrib.rocks/image?repo=HKUDS/DeepTutor&max=999" />
</a>

</div>

## 🔗 Related Projects

<div align="center">

| [⚡ LightRAG](https://github.com/HKUDS/LightRAG) | [🎨 RAG-Anything](https://github.com/HKUDS/RAG-Anything) | [💻 DeepCode](https://github.com/HKUDS/DeepCode) | [🔬 AI-Researcher](https://github.com/HKUDS/AI-Researcher) |
|:---:|:---:|:---:|:---:|
| Simple and Fast RAG | Multimodal RAG | AI Code Assistant | Research Automation |

**[Data Intelligence Lab @ HKU](https://github.com/HKUDS)**

[⭐ Star us](https://github.com/HKUDS/DeepTutor/stargazers) · [🐛 Report a bug](https://github.com/HKUDS/DeepTutor/issues) · [💬 Discussions](https://github.com/HKUDS/DeepTutor/discussions)

---

This project is licensed under the ***[AGPL-3.0 License](LICENSE)***.

*✨ Thanks for visiting **DeepTutor**!*

<img src="https://visitor-badge.laobi.icu/badge?page_id=HKUDS.DeepTutor&style=for-the-badge&color=00d4ff" alt="Views">

</div>