{"id":50125523,"url":"https://github.com/cycleuser/GangDan","last_synced_at":"2026-06-09T11:00:56.573Z","repository":{"id":339950376,"uuid":"1163085047","full_name":"cycleuser/GangDan","owner":"cycleuser","description":"A tool to use local LLM to help to Code.","archived":false,"fork":false,"pushed_at":"2026-06-07T00:37:03.000Z","size":25489,"stargazers_count":7,"open_issues_count":0,"forks_count":1,"subscribers_count":1,"default_branch":"main","last_synced_at":"2026-06-07T01:12:20.559Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/cycleuser.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":"AGENTS.md","dco":null,"cla":null}},"created_at":"2026-02-21T04:23:59.000Z","updated_at":"2026-06-07T00:37:06.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/cycleuser/GangDan","commit_stats":null,"previous_names":["cycleuser/gangdan"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/cycleuser/GangDan","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cycleuser%2FGangDan","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cycleuser%2FGangDan/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cycleuser%2FGangDan/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cycleuser%2FGangDan/manifests","owner_url":"https:
//repos.ecosyste.ms/api/v1/hosts/GitHub/owners/cycleuser","download_url":"https://codeload.github.com/cycleuser/GangDan/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cycleuser%2FGangDan/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34103357,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-09T02:00:06.510Z","response_time":63,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-05-23T20:00:20.646Z","updated_at":"2026-06-09T11:00:56.568Z","avatar_url":"https://github.com/cycleuser.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"# GangDan (纲担)\n\nLLM-powered knowledge management and teaching assistant with offline support.\n\n\u003e **GangDan (纲担)** — Principled and Accountable.\n\n![Chat Panel](images/chat.png)\n\n## Overview\n\nGangDan is a **local-first, offline programming assistant** powered by [Ollama](https://ollama.ai/) and [ChromaDB](https://www.trychroma.com/). 
It combines RAG-based knowledge management with teaching assistance tools, all running entirely on your machine — no cloud APIs required.\n\n![System Architecture](diagrams/architecture.svg)\n\n## Features\n\n### Knowledge Management\n\n- **Unified Literature Search** — Search arXiv, bioRxiv, medRxiv, Semantic Scholar, CrossRef, OpenAlex, DBLP, PubMed, and GitHub in one interface. AI-powered query refinement with automatic translation and synonym expansion.\n- **Batch Operations** — Multi-select, select-all, batch convert (PDF/HTML/TeX to Markdown with image and formula preservation), batch add to knowledge base. Sort by relevance, date, or title.\n- **Smart Renaming** — Downloaded papers automatically renamed to citation format: `Author et al. (Year) - Title.pdf`\n- **LLM-Generated Wiki** — Build structured wiki pages from knowledge base content with cross-KB concept linking. Like Wikipedia for your documents.\n- **Image Gallery** — Browse and search images stored in knowledge bases with context and source attribution.\n- **Document Manager** — One-click download and indexing of 30+ library docs (Python, Rust, Go, JS, CUDA, Docker, etc.). 
Upload custom docs, batch operations, GitHub repo search, web search to KB.\n- **Custom Knowledge Base Upload** — Upload your own Markdown (.md) and plain text (.txt) documents to create named knowledge bases with automatic indexing.\n\n### Teaching Assistant\n\n- **Question Generator** — MCQ, short answer, fill-in-the-blank, true/false from KB content.\n- **Guided Learning** — Auto-extract knowledge points, generate interactive lessons with Q\u0026A.\n- **Deep Research** — Multi-phase research pipeline: topic decomposition → RAG research → comprehensive reports.\n- **Lecture Maker** — Generate structured lecture content from KB materials.\n- **Exam Generator** — Create complete exam papers with answer keys from KB content.\n- **Literature Review \u0026 Paper Writer** — Generate academic reviews and papers from KB content.\n\n### Core Features\n\n- **RAG Chat** — Streaming chat with knowledge base retrieval and web search. Strict KB mode ensures grounded answers.\n- **Cross-Lingual Search** — Automatically detects query and document languages, enabling cross-lingual RAG (e.g., query English documents in Chinese).\n- **Citation References** — Each response automatically includes source document references for verification.\n- **AI Command Assistant** — Natural language → shell commands, draggable to terminal.\n- **Built-in Terminal** — Run commands with stdout/stderr display directly in the browser.\n- **Conversation Save/Load** — JSON export/import for session continuity.\n- **10-Language UI** — Chinese, English, Japanese, French, Russian, German, Italian, Spanish, Portuguese, Korean.\n- **Dark/Light Theme** — Full theme support with CSS variables.\n- **Offline by Design** — Runs entirely on your machine. 
No cloud APIs required.\n\n![Feature Map](diagrams/feature_map.svg)\n\n### Multi-Provider LLM Support\n\nGangDan supports a **separated mode**: local Ollama for chat/embedding/reranking, with optional external LLM providers for deep research and paper writing.\n\n![Provider System](diagrams/provider_system.svg)\n\n| Provider | API Type | Use Case |\n|----------|----------|----------|\n| **Ollama** (local) | ollama | Chat, Embedding, Reranking |\n| **DashScope** | OpenAI-compatible | Deep Research, Paper Writing |\n| **MiniMax** | OpenAI-compatible | Deep Research |\n| **Bailian Coding** | Anthropic-compatible | Deep Research |\n| **OpenAI / DeepSeek / Moonshot** | OpenAI-compatible | Deep Research |\n| **Custom** | OpenAI-compatible | Any compatible API |\n\n### CLI\n\n- Streaming chat (`gangdan chat \"question\"`), interactive REPL (`gangdan cli`)\n- KB operations, doc management, config, conversation persistence\n- AI command generation, shell execution with safety checks\n- Rich terminal output with formatted tables and syntax highlighting\n\n## Screenshots\n\n| Chat | Terminal |\n|:----:|:--------:|\n| ![Chat](images/chat.png) | ![Terminal](images/terminal.png) |\n\n| Documentation | Settings |\n|:-------------:|:--------:|\n| ![Docs](images/documents.png) | ![Settings](images/setting.png) |\n\n| Upload Documents | KB Scope Selection |\n|:----------------:|:------------------:|\n| ![Upload](images/upload.png) | ![Knowledge](images/knowledge.png) |\n\n| Strict KB Chat with Citations |\n|:-----------------------------:|\n| ![Strict Chat](images/specificated_knowledge_chat.png) |\n\nThe above screenshot demonstrates Strict KB Mode in action: after selecting a specific knowledge base, the system retrieves content only from that KB and automatically appends a reference list at the end of each response, citing the source documents.\n\n| Load Conversation | Conversation Loaded |\n|:-----------------:|:-------------------:|\n| ![Load](images/load_history.png) | 
![Loaded](images/history_loaded.png) |\n\nSave your chat as a JSON file and load it anytime to continue the conversation.\n\n## RAG Pipeline\n\n![RAG Pipeline](diagrams/rag_pipeline.svg)\n\nThe complete pipeline from document ingestion to retrieval:\n\n1. **Document Ingestion** — Download from GitHub repositories or upload custom files (.rst, .py, .html, .cpp, .md)\n2. **Format Conversion** — Automatic conversion to unified Markdown format\n3. **Sliding Window Chunking** — Fixed-size segmentation with configurable overlap (default: 800 chars, 150 overlap)\n4. **Vector Embedding** — nomic-embed-text model via Ollama API (768-dim vectors, 500-char truncation)\n5. **Vector Storage** — ChromaDB with HNSW indexing and cosine similarity\n6. **Query Retrieval** — Top-K search with distance filtering (threshold 1.5), deduplication, and context construction\n\n### Chunking Strategy\n\n![Chunking Strategy](diagrams/chunking_strategy.svg)\n\nThe sliding window approach ensures contextual continuity across chunk boundaries. Key parameters:\n\n| Parameter | Default | Range | Description |\n|-----------|---------|-------|-------------|\n| CHUNK_SIZE | 800 chars | 100-2000 | Characters per chunk |\n| CHUNK_OVERLAP | 150 chars | N/A | Overlap between consecutive chunks |\n| MIN_CHUNK | 50 chars | N/A | Minimum chunk length threshold |\n\n## Requirements\n\n- Python 3.10+\n- [Ollama](https://ollama.ai/) running locally (default `http://localhost:11434`)\n- Chat model (e.g. `ollama pull qwen3`)\n- Embedding model (e.g. 
`ollama pull nomic-embed-text`)\n\n## Installation\n\n### Method 1: Install from PyPI (Recommended)\n\n```bash\npip install gangdan\ngangdan                    # Web GUI\ngangdan cli                # Interactive CLI\ngangdan --port 8080        # Custom port\n```\n\n### Method 2: Install from Source\n\n```bash\ngit clone https://github.com/cycleuser/GangDan.git\ncd GangDan\npip install -e .\ngangdan\n```\n\nOpen [http://127.0.0.1:5000](http://127.0.0.1:5000) in your browser.\n\n## Ollama Setup\n\n```bash\nollama serve\nollama pull qwen3\nollama pull nomic-embed-text\n```\n\n## Project Structure\n\n```\nGangDan/\n├── pyproject.toml\n├── README.md / README_CN.md\n├── gangdan/\n│   ├── __init__.py / __main__.py\n│   ├── cli.py / cli_app.py          # CLI entry + REPL\n│   ├── app.py                       # Flask backend\n│   ├── learning_routes.py           # Learning module blueprint\n│   ├── preprint_routes.py           # Preprint search + convert\n│   ├── research_routes.py           # Paper search\n│   ├── kb_routes.py                 # Custom KB management\n│   ├── export_routes.py             # Export API\n│   ├── core/                        # Shared modules\n│   │   ├── config.py                # Config, i18n, translations\n│   │   ├── ollama_client.py         # Ollama API\n│   │   ├── chroma_manager.py        # ChromaDB\n│   │   ├── vector_db.py             # Multi-backend vector DB\n│   │   ├── kb_manager.py            # Custom KB CRUD\n│   │   ├── conversation.py          # Chat history\n│   │   ├── doc_manager.py           # Doc download/index\n│   │   ├── wiki_builder.py          # LLM wiki generation\n│   │   ├── preprint_fetcher.py      # Preprint search\n│   │   ├── preprint_converter.py    # HTML/TeX/PDF → MD\n│   │   ├── pdf_converter.py         # PDF → MD (marker/mineru/docling)\n│   │   ├── export_manager.py        # Batch convert/export\n│   │   ├── web_searcher.py          # Web search\n│   │   └── ...\n│   ├── templates/index.html         # Main 
SPA template\n│   └── static/{css,js}/             # Frontend assets\n├── tests/                           # Test suite\n├── images/                          # Screenshots\n└── diagrams/                        # Architecture diagrams (SVG)\n```\n\n## Configuration\n\nAll settings are managed through the **Settings** tab: Ollama URL, chat/embedding/reranker models, proxy, context length, output language, vector DB type, LLM provider selection, and API keys.\n\n## Testing\n\n```bash\npip install pytest pytest-cov\npytest tests/ -v\npytest tests/ --cov=gangdan\n```\n\n## Academic Paper\n\nFor a detailed empirical study of the RAG pipeline and chunking strategies, see [Article.md](Article.md) / [Article_CN.md](Article_CN.md).\n\n## License\n\nGPL-3.0-or-later. See [LICENSE](LICENSE) for details.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcycleuser%2FGangDan","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcycleuser%2FGangDan","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcycleuser%2FGangDan/lists"}