{"id":26197197,"url":"https://github.com/ai-mindset/docuverse","last_synced_at":"2026-02-28T23:02:58.847Z","repository":{"id":281852339,"uuid":"943884075","full_name":"ai-mindset/docuverse","owner":"ai-mindset","description":"Q\u0026A app for easy information retrieval from documents of interest","archived":false,"fork":false,"pushed_at":"2025-04-02T13:17:54.000Z","size":123,"stargazers_count":1,"open_issues_count":5,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-06-18T07:47:52.698Z","etag":null,"topics":["customtkinter","langchain","llms","ollama","on-prem","pydantic","python","ruff","self-hosted","sqlite","uv"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ai-mindset.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-03-06T12:27:30.000Z","updated_at":"2025-04-02T13:13:53.000Z","dependencies_parsed_at":"2025-03-11T14:20:37.673Z","dependency_job_id":"77712c7c-a562-4701-9f10-a2aa0d166e4e","html_url":"https://github.com/ai-mindset/docuverse","commit_stats":null,"previous_names":["ai-mindset/docuverse"],"tags_count":6,"template":false,"template_full_name":null,"purl":"pkg:github/ai-mindset/docuverse","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ai-mindset%2Fdocuverse","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ai-mindset%2Fdocuverse/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ai-mindset%2Fdocuverse/releases","manifests_url":"h
ttps://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ai-mindset%2Fdocuverse/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ai-mindset","download_url":"https://codeload.github.com/ai-mindset/docuverse/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ai-mindset%2Fdocuverse/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":263427302,"owners_count":23464842,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["customtkinter","langchain","llms","ollama","on-prem","pydantic","python","ruff","self-hosted","sqlite","uv"],"created_at":"2025-03-12T02:24:39.479Z","updated_at":"2026-02-28T23:02:53.805Z","avatar_url":"https://github.com/ai-mindset.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# DocuVerse\n\n[![Python Linting](https://github.com/ai-mindset/docuverse/actions/workflows/py-lint-format.yml/badge.svg)](https://github.com/ai-mindset/docuverse/actions/workflows/py-lint-format.yml) [![Python Type Checking](https://github.com/ai-mindset/docuverse/actions/workflows/py-type-check.yml/badge.svg)](https://github.com/ai-mindset/docuverse/actions/workflows/py-type-check.yml) [![Build Linux AppImage](https://github.com/ai-mindset/docuverse/actions/workflows/build-linux.yml/badge.svg)](https://github.com/ai-mindset/docuverse/actions/workflows/build-linux.yml) [![Build Windows 
Executable](https://github.com/ai-mindset/docuverse/actions/workflows/build-windows.yml/badge.svg)](https://github.com/ai-mindset/docuverse/actions/workflows/build-windows.yml) [![Create Release](https://github.com/ai-mindset/docuverse/actions/workflows/release.yml/badge.svg)](https://github.com/ai-mindset/docuverse/actions/workflows/release.yml) [![Dependabot Updates](https://github.com/ai-mindset/docuverse/actions/workflows/dependabot/dependabot-updates/badge.svg)](https://github.com/ai-mindset/docuverse/actions/workflows/dependabot/dependabot-updates)\n\n\n\nA self-hosted, privacy-preserving Question \u0026 Answer application for easy information retrieval from your personal document collection. DocuVerse (a blend of \"Document\" and \"Converse\") helps you interact with your documents through natural language queries, providing contextually relevant answers powered by language models—all while keeping your data completely private on your local machine.\n\n## System Architecture\n\n### Component Structure\n```mermaid\n%%{init: { 'themeVariables': { 'darkMode': true }, 'theme': 'base' }}%%\nflowchart TB\n    User([User]) --- UI[/GUI/CLI Interface\\]\n    \n    subgraph DocuVerse [\"DocuVerse System\"]\n        UI --- Document_Processing\n        UI --- Retrieval_QA\n        Document_Processing --- Storage\n        Retrieval_QA --- Storage\n        Retrieval_QA --- External\n        \n        subgraph Document_Processing [\"Document Processing\"]\n            TextSplitter[(Text Splitter)]\n            Embeddings[\"Nomic Embeddings\"]\n            TextSplitter --\u003e Embeddings\n        end\n        \n        subgraph Storage [\"Data Storage\"]\n            SQLite[(SQLite Database)]\n        end\n        \n        subgraph Retrieval_QA [\"Retrieval \u0026 QA\"]\n            VectorSearch{{Vector Search}}\n            QAChain{{QA Chain}}\n            VectorSearch --\u003e QAChain\n        end\n        \n        subgraph External [\"External Services\"]\n            
Ollama[\"Ollama LLM (Mistral)\"]\n        end\n    end\n    \n    %% Dark Mode Colors\n    classDef interfaceDark fill:#6E40C9,stroke:#d6d8db,stroke-width:2px,stroke-dasharray:5 5,color:#FFFFFF\n    classDef storageDark fill:#0366D6,stroke:#d6d8db,stroke-width:2px,color:#FFFFFF\n    classDef processingDark fill:#28A745,stroke:#d6d8db,stroke-width:2px,color:#000000\n    classDef externalDark fill:#D73A49,stroke:#d6d8db,stroke-width:3px,color:#FFFFFF\n    \n    %% Light Mode Colors\n    classDef interfaceLight fill:#8A63D2,stroke:#24292e,stroke-width:2px,stroke-dasharray:5 5,color:#FFFFFF\n    classDef storageLight fill:#2188FF,stroke:#24292e,stroke-width:2px,color:#FFFFFF\n    classDef processingLight fill:#22863A,stroke:#24292e,stroke-width:2px,color:#FFFFFF\n    classDef externalLight fill:#CB2431,stroke:#24292e,stroke-width:3px,color:#FFFFFF\n    \n    %% Apply classes conditionally based on theme\n    class UI interfaceDark\n    class SQLite storageDark\n    class TextSplitter,Embeddings,VectorSearch,QAChain processingDark\n    class Ollama externalDark\n```\n\n### Data flow \n```mermaid\n%%{init: { 'themeVariables': { 'darkMode': false }, 'theme': 'base' }}%%\nflowchart TB\n    User([User]) --- UI[/GUI/CLI Interface\\]\n    \n    subgraph DocuVerse [\"DocuVerse System\"]\n        UI --- Document_Processing\n        UI --- Retrieval_QA\n        Document_Processing --- Storage\n        Retrieval_QA --- Storage\n        Retrieval_QA --- External\n        \n        subgraph Document_Processing [\"Document Processing\"]\n            TextSplitter[(Text Splitter)]\n            Embeddings[\"Nomic Embeddings\"]\n            TextSplitter --\u003e Embeddings\n        end\n        \n        subgraph Storage [\"Data Storage\"]\n            SQLite[(SQLite Database)]\n        end\n        \n        subgraph Retrieval_QA [\"Retrieval \u0026 QA\"]\n            VectorSearch{{Vector Search}}\n            QAChain{{QA Chain}}\n            VectorSearch --\u003e QAChain\n        
end\n        \n        subgraph External [\"External Services\"]\n            Ollama[\"Ollama LLM (Mistral)\"]\n        end\n    end\n    \n    %% Apply Light Mode Classes\n    classDef interfaceLight fill:#8A63D2,stroke:#24292e,stroke-width:2px,stroke-dasharray:5 5,color:#FFFFFF\n    classDef storageLight fill:#2188FF,stroke:#24292e,stroke-width:2px,color:#FFFFFF\n    classDef processingLight fill:#22863A,stroke:#24292e,stroke-width:2px,color:#FFFFFF\n    classDef externalLight fill:#CB2431,stroke:#24292e,stroke-width:3px,color:#FFFFFF\n    \n    class UI interfaceLight\n    class SQLite storageLight\n    class TextSplitter,Embeddings,VectorSearch,QAChain processingLight\n    class Ollama externalLight\n```\n\n## Features\n\n- **Interactive Q\u0026A**: Ask questions about your documents in natural language\n- **Document Management**: Add and index text and markdown documents\n- **Modern GUI**: Clean, responsive interface with dark and light mode support\n- **Conversation History**: Track the full context of your document exploration\n- **Customisable Retrieval**: Adjust parameters to optimise search relevance\n\n## Installation\n\n### Prerequisites\n- Python 3.13 or higher (for development)\n- [Ollama](https://ollama.com/download) installed and running locally (for language model support)\n- Download the [mistral 7b](https://ollama.com/library/mistral) LLM with `ollama pull mistral` if you have a mainstream computer. Opt for [mistral-small 24b](https://ollama.com/library/mistral-small:24b) if you have a higher-end setup (run `ollama run mistral-small:24b`)\n\n### Method 1: Using AppImage (Linux)\n1. Download the latest AppImage from the [Releases](https://github.com/ai-mindset/docuverse/releases) page\n2. Make it executable: `chmod +x dv-*-x86_64.AppImage`\n3. 
Run the application: `./dv-*-x86_64.AppImage`\n\n### Method 2: From Source\n```bash\n# Clone the repository\ngit clone https://github.com/ai-mindset/docuverse.git\ncd docuverse\n\n# Install using uv (recommended)\ncurl -LsSf https://astral.sh/uv/install.sh | sh\nuv venv\nsource .venv/bin/activate\nuv pip install -e .\n\n# Or install using pip\npip install -e .\n\n# Run the application\npython -m dv.main\n```\n\n## Usage\n\n### Adding Documents\n1. Start the application\n2. Click \"Add Document\" and select your text (.txt) or markdown (.md) files\n3. Click \"Reindex Documents\" to process and prepare them for queries\n\n### Querying Your Documents\n1. Type your question in the input box\n2. Press Enter or click \"Send\"\n3. View the AI's response, which will include information from relevant documents\n\n### Command Line Options\nDocuVerse can be run with various options:\n\n```bash\npython -m dv.main [OPTIONS]\n\nOptions:\n  --cli                  Use command-line interface instead of GUI\n  --model MODEL          Specify which Ollama model to use\n  --temperature TEMP     Set the temperature (0-1) for LLM responses\n  --results NUM          Number of documents to retrieve for context\n  --reindex              Force reindexing of all documents\n  --light-mode           Use light mode for GUI\n```\n\n## Configuration\n\nDocuVerse can be customised by modifying settings in the config.py file:\n\n- **LLM_MODEL**: The default Ollama model (default: \"mistral-small:24b-instruct-2501-q4_K_M\")\n- **CHUNK_SIZE**: Size of document chunks for processing (default: 1000)\n- **DOCS_DIR**: Directory for document storage\n- **GUI_FONT**: Font settings for the UI\n\n## Development\n\n### Setting Up the Development Environment\n```bash\n# Install development dependencies\nuv pip install -e \".[dev]\"\n\n# Run code quality checks\nruff check .\nruff format .\npyright .\n```\n\n## License\n\n[MIT License](./LICENSE)\n\n## Acknowledgements\n\nDocuVerse is built using several 
open-source technologies:\n- [LangChain](https://www.langchain.com/) for document processing and retrieval\n- [Ollama](https://ollama.ai/) for local language model inference\n- [CustomTkinter](https://github.com/TomSchimansky/CustomTkinter) for the modern UI\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fai-mindset%2Fdocuverse","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fai-mindset%2Fdocuverse","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fai-mindset%2Fdocuverse/lists"}