{"id":50766245,"url":"https://github.com/kenken64/misoto-indexer","last_synced_at":"2026-06-11T14:01:24.786Z","repository":{"id":301877351,"uuid":"1007231618","full_name":"kenken64/misoto-indexer","owner":"kenken64","description":"An AI-powered terminal application for intelligent code search and indexing using Spring AI and Qdrant vector databases.","archived":false,"fork":false,"pushed_at":"2025-06-29T10:27:55.000Z","size":383,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-06-29T11:29:27.277Z","etag":null,"topics":["embedded","ollama","qdrant-vector-database","spring","spring-ai"],"latest_commit_sha":null,"homepage":"","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/kenken64.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-06-23T17:01:28.000Z","updated_at":"2025-06-29T10:27:58.000Z","dependencies_parsed_at":"2025-06-29T11:39:44.537Z","dependency_job_id":null,"html_url":"https://github.com/kenken64/misoto-indexer","commit_stats":null,"previous_names":["kenken64/misoto-indexer"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/kenken64/misoto-indexer","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kenken64%2Fmisoto-indexer","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kenken64%2Fmisoto-indexer/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kenken64%2Fmisoto-indexer/releases","manifests_url":"https://repos.e
cosyste.ms/api/v1/hosts/GitHub/repositories/kenken64%2Fmisoto-indexer/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/kenken64","download_url":"https://codeload.github.com/kenken64/misoto-indexer/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kenken64%2Fmisoto-indexer/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34201842,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-11T02:00:06.485Z","response_time":57,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["embedded","ollama","qdrant-vector-database","spring","spring-ai"],"created_at":"2026-06-11T14:01:13.289Z","updated_at":"2026-06-11T14:01:24.761Z","avatar_url":"https://github.com/kenken64.png","language":"Java","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Misoto Codebase Indexer\n\nAn AI-powered terminal application for intelligent code search and indexing using Spring AI and vector databases.\n\n## Features\n\n- 🔍 **Natural Language Search**: Search code using plain English queries\n- 🧠 **Semantic Search**: Find conceptually similar code using AI embeddings\n- 📝 **Text Search**: Traditional keyword-based search\n- ⚙️ **Advanced Search**: Filter by file type, language, repository\n- 📚 **Intelligent Indexing**: AI-powered code analysis and indexing\n- 📊 
**Detailed Status Tracking**: Real-time indexing progress and file type statistics\n- 💾 **Persistent Caching**: Avoids re-indexing unchanged files\n- 🔄 **Background Processing**: Non-blocking indexing with immediate search availability\n\n## 🔄 Application Logic Flow\n\n### **Hybrid Indexing Pipeline**\n\n```mermaid\ngraph TD\n    A[🚀 Application Start] --\u003e B[📋 Initialize Qdrant Collection]\n    B --\u003e C[🔍 Set Indexing Directory]\n    C --\u003e D[📂 Load File Cache]\n    D --\u003e E[🔍 Scan Directory Structure]\n    E --\u003e F{📄 File Validation}\n    \n    F --\u003e|Supported Extension| G[✅ Check Cache Status]\n    F --\u003e|Unsupported Extension| H[📊 Track Skipped Extensions]\n    \n    G --\u003e|New/Modified| I[🚀 Queue for Indexing]\n    G --\u003e|Unchanged| J[⏭️ Skip Processing]\n    \n    I --\u003e K[📋 Phase 1: Priority Files]\n    K --\u003e L[⚡ Virtual Thread Processing]\n    L --\u003e M[📄 Raw Text Extraction]\n    M --\u003e N[🤖 nomic-embed-text Embedding]\n    N --\u003e O[📊 768D Vector Generation]\n    O --\u003e P[☁️ Qdrant Vector Storage]\n    P --\u003e Q[💾 Update Cache]\n    \n    Q --\u003e R[📋 Phase 2: Remaining Files]\n    R --\u003e S[🔄 Background Batch Processing]\n    S --\u003e T[✅ Indexing Complete]\n    \n    H --\u003e U[📊 Status Reporting]\n    J --\u003e U\n    T --\u003e U\n```\n\n### **Embedding Flow Architecture**\n\n```\n📄 Raw Text (from source files)\n    ↓\n🤖 nomic-embed-text (Ollama embedding model - 768 dimensions)  \n    ↓\n📊 Vector Representation (768-dimensional float array)\n    ↓\n☁️ Qdrant Cloud (vector database storage with metadata)\n```\n\n### **File Processing Strategy**\n\n#### **Priority-Based Indexing**\n1. 
**Phase 1 - Critical Files (Priority 1-5):**\n   - Controllers (`*Controller.java`) - Priority 1\n   - Services (`*Service.java`) - Priority 2  \n   - Repositories (`*Repository.java`) - Priority 3\n   - Configuration (`*Config.java`) - Priority 4\n   - Applications (`*Application.java`) - Priority 5\n\n2. **Phase 2 - Background Processing:**\n   - All remaining supported files\n   - Processed in batches using virtual threads\n   - Non-blocking execution\n\n#### **Supported File Extensions**\n\n| Category | Extensions | Purpose |\n|----------|------------|---------|\n| **Java Ecosystem** | `.java`, `.xml`, `.properties`, `.yml`, `.yaml`, `.json` | Core application files |\n| **Documentation** | `.md`, `.txt`, `.st`, `.adoc` | Project documentation |\n| **JVM Languages** | `.kt`, `.scala` | Kotlin and Scala source |\n| **Database** | `.sql`, `.cql` | Database schemas and queries |\n| **Web Technologies** | `.html`, `.css`, `.js`, `.ts`, `.jsp`, `.asp`, `.aspx`, `.php` | Frontend and web components |\n| **System Scripts** | `.conf`, `.cmd`, `.sh`, `.ps1` | Configuration and automation |\n| **Programming Languages** | `.py`, `.c`, `.cpp`, `.cs`, `.rb`, `.vb`, `.go`, `.swift`, `.lua`, `.pl`, `.r` | Multi-language support |\n| **Documents** | `.pdf` | Documentation and specs |\n\n### **Search Execution Flow**\n\n```mermaid\ngraph LR\n    A[🔍 Search Query] --\u003e B{Search Type}\n    \n    B --\u003e|Natural Language| C[🤖 Process with LLM]\n    B --\u003e|Semantic| D[🧠 Direct Vector Search]\n    B --\u003e|Text| E[📝 Keyword Search]\n    \n    C --\u003e F[🔍 Generate Search Context]\n    F --\u003e G[📊 Vector Similarity Search]\n    \n    D --\u003e G\n    E --\u003e H[📂 File Content Search]\n    \n    G --\u003e I[📋 Rank Results by Relevance]\n    H --\u003e I\n    \n    I --\u003e J[📊 Format and Display Results]\n```\n\n### **Performance Optimizations**\n\n- **Virtual Threads**: Concurrent processing for I/O-intensive operations\n- **Persistent Cache**: Tracks file 
modification times to avoid re-indexing\n- **Batch Processing**: Groups files for efficient processing\n- **Priority Queuing**: Critical files indexed first for immediate search availability\n- **Smart Chunking**: Large files split into manageable 3KB chunks with 500-character overlap\n- **Background Execution**: Indexing runs asynchronously without blocking the CLI\n\n### **Status Tracking \u0026 Metrics**\n\nThe application provides comprehensive real-time metrics:\n\n- **📊 Progress**: Indexed vs. total files percentage\n- **⏱️ Timing**: Current duration, estimated completion time\n- **🚀 Performance**: Files per second processing speed\n- **🧵 Threading**: Active and peak virtual thread usage\n- **📄 File Types**: Breakdown by extension and count\n- **⚠️ Issues**: Failed and skipped file counts\n- **🚫 Skipped Extensions**: Non-supported file types encountered\n\n## Prerequisites\n\n- Java 21+\n- Maven 3.8+\n- Ollama (for local AI models)\n- Qdrant Cloud cluster (for vector search)\n\n## 🤖 Ollama Model Setup\n\nThis application uses specialized AI models for embeddings and chat:\n\n### Required Models:\n- **nomic-embed-text**: High-quality embedding model (768 dimensions)\n- **codellama:7b**: Code-aware chat model for intelligent analysis\n\n### Quick Setup:\n```bash\n# Run the setup script (Windows)\nsetup-models.bat\n\n# Or run manually:\nollama pull nomic-embed-text\nollama pull codellama:7b\n```\n\n### Linux/Mac Setup:\n```bash\n# Make script executable and run\nchmod +x setup-models.sh\n./setup-models.sh\n```\n\n### Why nomic-embed-text?\n- **Optimized for text**: Better semantic understanding than code-specific models for embeddings\n- **Efficient**: 768-dimensional vectors (vs 4096 for CodeLlama)\n- **Fast**: Quicker indexing and search operations\n- **Quality**: High-quality embeddings for code and documentation\n\n## ☁️ Qdrant Cloud Setup\n\n1. 
**Create Qdrant Cloud Account:**\n   - Go to [https://cloud.qdrant.io/](https://cloud.qdrant.io/)\n   - Sign up for a free account (includes 1GB storage)\n\n2. **Create a Cluster:**\n   - Click \"Create Cluster\"\n   - Choose your preferred region\n   - Select the free tier\n   - Wait for cluster deployment\n\n3. **Get Connection Details:**\n   - Copy your cluster URL (e.g., `https://xyz-123.qdrant.tech`)\n   - Generate an API key from the dashboard\n\n4. **Update Configuration:**\n   ```bash\n   # Copy the environment template\n   cp .env.example .env\n   \n   # Edit .env file with your Qdrant details:\n   QDRANT_HOST=xyz-123.qdrant.tech\n   QDRANT_API_KEY=your-generated-api-key\n   ```\n\n5. **Dynamic Collection Naming:**\n   - **Codebase directory**: Creates collection `codebase-index-ollama`\n   - **Other directories**: Creates collection `codebase-index`\n   - Collection names are set automatically based on the directory being indexed\n\n## 🚀 Quick Start Summary\n\n1. **Install Ollama**\n   ```bash\n   # Download and install Ollama from https://ollama.ai\n   # Or use curl (Linux/macOS):\n   curl -fsSL https://ollama.ai/install.sh | sh\n   ```\n\n2. **Pull the Required Models**\n   ```bash\n   ollama pull nomic-embed-text\n   ollama pull codellama:7b\n   ```\n\n3. **Clone and Build**\n   ```bash\n   git clone \u003crepository-url\u003e\n   cd misoto-indexer\n   mvn clean compile\n   ```\n\n4. **Configure Environment Variables**\n   ```bash\n   # Copy the environment template\n   cp .env.example .env\n   \n   # Edit .env with your configuration:\n   # - Qdrant Cloud details (QDRANT_HOST, QDRANT_API_KEY)\n   # - Ollama configuration (OLLAMA_BASE_URL, models)\n   ```\n\n5. **Run the Application**\n   ```bash\n   mvn spring-boot:run\n   \n   # OR use the clean run script (recommended)\n   run-clean.bat\n   ```\n\n6. 
**Access Interactive CLI Menu**\n   The application will start with a clean interface directly to the menu:   \n   \n   ```\n   ╔══════════════════════════════════════════════════════════════╗\n   ║                    MISOTO CODEBASE INDEXER                   ║\n   ║                   Intelligent Code Search                    ║\n   ╚══════════════════════════════════════════════════════════════╝\n   \n   ┌─────────────────── SEARCH MENU ───────────────────┐\n   │ 1. [\u003e] Search with Natural Language Prompt        │\n   │ 2. [i] Indexing Status                            │\n   │ 3. [S] Semantic Code Search                       │\n   │ 4. [T] Text Search                                │\n   │ 5. [A] Advanced Search                            │\n   │ 6. [I] Index Codebase                             │\n   │ 7. [?] Help                                       │\n   │ 0. [X] Exit                                       │\n   └───────────────────────────────────────────────────┘\n   ```\n\n### Detailed Menu Options\n\n#### **1. 🔍 Natural Language Search**\nUse conversational queries to find code with AI assistance:\n\n**Example Queries:**\n```\n🔍 Search Query: Find authentication logic\n🔍 Search Query: Show me REST API endpoints for user management  \n🔍 Search Query: Classes that implement caching\n🔍 Search Query: Database connection configuration\n🔍 Search Query: JWT token validation\n```\n\n**How it works:**\n- AI processes your natural language intent\n- Converts to optimized search terms\n- Returns ranked results with relevance scores\n- Shows code snippets with context\n\n#### **2. 
📊 Indexing Status**\nMonitor real-time indexing progress and system performance:\n\n```\n╔═════════════════ INDEXING STATUS ═════════════════╗\n║ 📊 Progress: 1,247 / 2,150 files (58.0%)         ║\n║ ⏱️  Duration: 45s | Estimated: 78s remaining      ║\n║ 🚀 Speed: 27.7 files/second                      ║\n║ 🧵 Threads: 8 active, 12 peak                    ║\n║                                                   ║\n║ 📄 File Types Indexed:                           ║\n║   • .java: 423 files                             ║\n║   • .xml: 156 files                              ║\n║   • .properties: 89 files                        ║\n║   • .md: 67 files                                ║\n║   • .kt: 45 files                                ║\n║                                                   ║\n║ 🚫 Skipped Extensions: .class (234), .jar (12)   ║\n║ ⚠️  Failed: 3 files | Skipped: 456 files         ║\n╚═══════════════════════════════════════════════════╝\n```\n\n**Status Information:**\n- **Progress**: Percentage of files processed\n- **Performance**: Files per second processing speed\n- **Threading**: Virtual thread usage for optimal performance\n- **File Breakdown**: Count by file type/extension\n- **Issues**: Failed and skipped file tracking\n\n#### **3. 🧠 Semantic Code Search**\nFind conceptually similar code using vector embeddings:\n\n**Example Usage:**\n```\n🧠 Enter search query: database repository pattern\n🎯 Similarity threshold (0.0-1.0) [0.7]: 0.8\n🔍 Max results [10]: 5\n\n📊 Found 5 results (similarity \u003e 0.8):\n\n1. UserRepository.java (0.92) - Line 23\n   @Repository\n   public class UserRepository extends JpaRepository\u003cUser, Long\u003e {\n       Optional\u003cUser\u003e findByUsername(String username);\n   }\n\n2. ProductService.java (0.89) - Line 45\n   private final ProductRepository productRepository;\n   \n3. 
OrderRepository.java (0.85) - Line 12\n   public interface OrderRepository extends CrudRepository\u003cOrder, UUID\u003e {\n```\n\n**Features:**\n- Adjustable similarity threshold (0.0 to 1.0)\n- Vector-based semantic matching\n- Ranked results by relevance score\n- Context-aware code snippets\n\n#### **4. 📝 Text Search**\nFast keyword-based search across all indexed files:\n\n**Example Usage:**\n```\n📝 Enter search term: @RestController\n🔍 Case sensitive? [y/N]: n\n📊 Max results [20]: 10\n\n📊 Found 8 matches in 6 files:\n\n1. UserController.java - Line 15\n   @RestController\n   @RequestMapping(\"/api/users\")\n   public class UserController {\n\n2. AuthController.java - Line 12\n   @RestController\n   @RequestMapping(\"/api/auth\") \n   public class AuthController {\n```\n\n**Search Options:**\n- Case-sensitive or insensitive matching\n- Regular expression support\n- File path filtering\n- Configurable result limits\n\n#### **5. ⚙️ Advanced Search**\nCombine multiple search criteria for precise results:\n\n**Filter Options:**\n```\n⚙️ Advanced Search Configuration:\n📁 File extensions: .java,.kt,.scala\n🏷️  File name pattern: *Service*\n📂 Directory filter: src/main/java\n🔍 Content contains: @Transactional\n📏 File size: 1KB - 100KB\n📅 Modified after: 2024-01-01\n```\n\n**Example Results:**\n```\n📊 Advanced Search Results (12 matches):\n\nFilters Applied:\n✅ Extensions: .java, .kt\n✅ Pattern: *Service*  \n✅ Content: @Transactional\n✅ Directory: src/main/java\n\n1. UserService.java (src/main/java/service/)\n   @Transactional\n   public void updateUser(User user) { ... }\n\n2. OrderService.kt (src/main/java/service/)\n   @Transactional\n   fun processOrder(order: Order) { ... }\n```\n\n#### **6. 📚 Index Codebase**\nStart or restart the indexing process:\n\n**Options:**\n```\n📚 Codebase Indexing Options:\n\n1. 🔄 Restart indexing (current directory)\n2. 📁 Change indexing directory\n3. 🗑️  Clear cache and reindex all files\n4. ⏸️  Pause/Resume indexing\n5. 
📊 View indexing statistics\n\nCurrent directory: /path/to/project/src\nIndexed files: 1,247 | Cache entries: 1,189\n```\n\n**Directory Selection:**\n```\n📁 Select indexing directory:\n   Current: /project/src\n   \n1. 📂 /project/src (current)\n2. 📂 /project/src/main/java\n3. 📂 /project/codebase\n4. 📝 Enter custom path\n5. 🔙 Back to main menu\n\nEnter choice [1-5]:\n```\n\n#### **7. ❓ Help**\nComprehensive help and documentation:\n\n```\n╔═══════════════════ HELP \u0026 TIPS ═══════════════════╗\n║                                                   ║\n║ 🔍 SEARCH TIPS:                                   ║\n║   • Use specific terms: \"JWT authentication\"      ║\n║   • Try different phrasings if no results         ║\n║   • Combine keywords: \"user repository database\"  ║\n║                                                   ║\n║ 🎯 SIMILARITY THRESHOLDS:                         ║\n║   • 0.9-1.0: Very similar (exact matches)         ║\n║   • 0.7-0.9: Similar (related concepts)           ║\n║   • 0.5-0.7: Somewhat related                     ║\n║   • 0.0-0.5: Loose associations                   ║\n║                                                   ║\n║ 📁 SUPPORTED FILE TYPES:                          ║\n║   • Code: .java, .kt, .scala, .py, .js, .ts       ║\n║   • Config: .xml, .yml, .properties, .json        ║\n║   • Web: .html, .css, .jsp, .php                  ║\n║   • Docs: .md, .txt, .adoc                        ║\n║   • Scripts: .sh, .cmd, .ps1, .sql                ║\n║                                                   ║\n╚═══════════════════════════════════════════════════╝\n```\n\n### Search Examples \u0026 Best Practices\n\n#### **Natural Language Search Examples**\n\n| Query Type | Example | What it finds |\n|------------|---------|---------------|\n| **Functionality** | \"user authentication\" | Login methods, auth filters, JWT handling |\n| **Architecture** | \"repository pattern\" | Data access objects, JPA repositories |\n| **Error Handling** | \"exception 
handling\" | Try-catch blocks, error controllers |\n| **Configuration** | \"database configuration\" | DataSource beans, connection properties |\n| **API Endpoints** | \"REST endpoints for users\" | UserController methods, API routes |\n| **Security** | \"authorization logic\" | Security configs, role-based access |\n\n#### **Semantic Search Best Practices**\n\n- **High Similarity (0.8-1.0)**: Find exact patterns and implementations\n- **Medium Similarity (0.6-0.8)**: Find related concepts and similar logic\n- **Low Similarity (0.4-0.6)**: Explore loosely related code\n- **Use specific technical terms**: \"repository\", \"controller\", \"service\"\n- **Combine concepts**: \"user authentication JWT token\"\n\n#### **Text Search Tips**\n\n- **Class names**: `UserService`, `@RestController`\n- **Method names**: `findByUsername`, `authenticate`\n- **Annotations**: `@Transactional`, `@Autowired`\n- **Patterns**: Use wildcards like `find*` or `*Controller`\n- **Regular expressions**: Enable regex for complex patterns\n\n### Workflow Examples\n\n#### **Example 1: Finding Authentication Code**\n```\n1. Start with Natural Language: \"user authentication\"\n2. Review results, note relevant classes\n3. Use Semantic Search: \"JWT token validation\" (similarity: 0.7)\n4. Drill down with Text Search: \"@PreAuthorize\"\n5. Use Advanced Search: Files containing \"auth\" in src/main/java\n```\n\n#### **Example 2: Understanding Data Access Layer**\n```\n1. Natural Language: \"database repository pattern\"\n2. Semantic Search: \"JPA repository\" (similarity: 0.8)\n3. Text Search: \"extends JpaRepository\"\n4. Advanced Search: Filter by *.java files containing \"@Repository\"\n```\n\n#### **Example 3: API Endpoint Discovery**\n```\n1. Natural Language: \"REST API endpoints\"\n2. Text Search: \"@RestController\"\n3. Semantic Search: \"HTTP GET POST endpoints\" (similarity: 0.7)\n4. 
Advanced Search: Files matching \"*Controller.java\"\n```\n\n### Performance \u0026 Monitoring\n\n- **Real-time Status**: Check option 2 for live indexing progress\n- **Search During Indexing**: Search works immediately, even while indexing\n- **Cache Management**: System automatically manages file change detection\n- **Background Processing**: Indexing doesn't block the interactive menu\n- **Memory Efficient**: Virtual threads optimize resource usage\n\n## Development\n\n### Project Structure\n```\nsrc/main/java/sg/edu/nus/iss/codebase/indexer/\n├── IndexerApplication.java          # Main Spring Boot application\n├── cli/\n│   ├── SearchCLI.java              # Interactive command-line interface\n│   └── command/                    # Command Pattern implementation\n│       ├── Command.java            # Command interface\n│       └── IndexingStatusCommand.java # Status display command\n├── config/\n│   ├── EnvironmentConfig.java      # Environment variable configuration\n│   ├── IndexingConfiguration.java  # Centralized indexing configuration\n│   ├── QdrantCollectionInitializer.java # Vector database setup\n│   └── VirtualThreadConfig.java    # Async processing configuration\n├── controller/\n│   └── SearchController.java       # REST API endpoints (optional)\n├── dto/\n│   └── SearchRequest.java          # Data transfer objects\n├── model/\n│   └── IndexingStatus.java         # Status and metrics model\n├── service/\n│   ├── FileSearchService.java      # File-based search implementation\n│   ├── HybridSearchService.java    # Main search orchestration\n│   ├── impl/                       # Service implementations\n│   │   ├── DocumentFactoryManager.java # Factory manager\n│   │   ├── FileCacheRepositoryImpl.java # Cache repository implementation\n│   │   ├── FileIndexingServiceImpl.java # Core indexing service\n│   │   ├── TextDocumentFactory.java # Text document factory\n│   │   └── search/                 # Search strategy implementations\n│   │       └── 
SemanticSearchStrategy.java # Semantic search strategy\n│   └── interfaces/                 # Service interfaces\n│       ├── DocumentFactory.java    # Factory pattern interface\n│       ├── FileCacheRepository.java # Repository pattern interface\n│       ├── FileIndexingService.java # Service interface\n│       ├── IndexingStatusObserver.java # Observer pattern interface\n│       └── SearchStrategy.java     # Strategy pattern interface\n```\n\n### Architecture \u0026 Design Patterns\n\n## 🔄 Sequence Diagrams\n\n### **Main Application Flow - Indexing and Search Operations**\n\n```mermaid\nsequenceDiagram\n    participant User\n    participant CLI as SearchCLI\n    participant IndexSvc as FileIndexingService\n    participant Config as IndexingConfiguration\n    participant Cache as FileCacheRepository\n    participant QdrantSvc as QdrantDocumentService\n    participant SearchSvc as HybridSearchService\n    participant SearchStrat as SearchStrategy\n    participant Observer as StatusObserver\n\n    Note over User,Observer: Application Startup \u0026 Initialization\n    User-\u003e\u003eCLI: Start Application\n    CLI-\u003e\u003eConfig: Load Configuration\n    Config--\u003e\u003eCLI: Return config settings\n    CLI-\u003e\u003eIndexSvc: Initialize indexing service\n    IndexSvc-\u003e\u003eQdrantSvc: Initialize Qdrant collection\n    QdrantSvc--\u003e\u003eIndexSvc: Collection ready\n    CLI-\u003e\u003eIndexSvc: Add CLI as status observer\n    \n    Note over User,Observer: Directory Indexing Flow\n    User-\u003e\u003eCLI: Select \"Index Directory\"\n    CLI-\u003e\u003eUser: Prompt for directory path\n    User-\u003e\u003eCLI: Provide directory path\n    CLI-\u003e\u003eIndexSvc: indexDirectory(path)\n    \n    IndexSvc-\u003e\u003eCache: loadCache()\n    Cache--\u003e\u003eIndexSvc: Return cached file info\n    IndexSvc-\u003e\u003eIndexSvc: scanDirectory(path)\n    IndexSvc-\u003e\u003eConfig: getSupportedExtensions()\n    Config--\u003e\u003eIndexSvc: Return 
extensions list\n    \n    loop For each file in directory\n        IndexSvc-\u003e\u003eCache: isFileModified(file)\n        alt File is new or modified\n            Cache--\u003e\u003eIndexSvc: true\n            IndexSvc-\u003e\u003eIndexSvc: queueForIndexing(file)\n        else File unchanged\n            Cache--\u003e\u003eIndexSvc: false\n            Note over IndexSvc: Skip processing\n        end\n    end\n    \n    Note over IndexSvc,Observer: Background Processing with Status Updates\n    IndexSvc-\u003e\u003eObserver: onStatusUpdate(status)\n    Observer-\u003e\u003eCLI: displayStatusUpdate(status)\n    CLI-\u003e\u003eUser: Show progress information\n    \n    loop Batch processing files\n        IndexSvc-\u003e\u003eQdrantSvc: processFileChunks(file)\n        QdrantSvc-\u003e\u003eQdrantSvc: extractText(file)\n        QdrantSvc-\u003e\u003eQdrantSvc: generateEmbeddings(text)\n        QdrantSvc-\u003e\u003eQdrantSvc: storeVectors(embeddings)\n        QdrantSvc--\u003e\u003eIndexSvc: Processing complete\n        IndexSvc-\u003e\u003eCache: updateFileCache(file)\n        IndexSvc-\u003e\u003eObserver: onStatusUpdate(updatedStatus)\n        Observer-\u003e\u003eCLI: displayStatusUpdate(updatedStatus)\n    end\n    \n    IndexSvc-\u003e\u003eObserver: onIndexingComplete(finalStatus)\n    Observer-\u003e\u003eCLI: displayCompletionMessage()\n    CLI-\u003e\u003eUser: Show indexing completed\n    \n    Note over User,Observer: Search Operation Flow\n    User-\u003e\u003eCLI: Select search option\n    CLI-\u003e\u003eUser: Prompt for search query\n    User-\u003e\u003eCLI: Provide search query and type\n    CLI-\u003e\u003eSearchSvc: search(searchRequest)\n    \n    SearchSvc-\u003e\u003eSearchStrat: findStrategy(searchType)\n    SearchStrat--\u003e\u003eSearchSvc: Return appropriate strategy\n    \n    alt Semantic Search\n        SearchSvc-\u003e\u003eSearchStrat: search(semanticQuery)\n        SearchStrat-\u003e\u003eQdrantSvc: vectorSearch(queryEmbedding)\n 
       QdrantSvc--\u003e\u003eSearchStrat: Return vector results\n    else Text Search\n        SearchSvc-\u003e\u003eSearchStrat: search(textQuery)\n        SearchStrat-\u003e\u003eSearchStrat: performTextSearch(query)\n    else Natural Language Search\n        SearchSvc-\u003e\u003eSearchStrat: search(nlQuery)\n        SearchStrat-\u003e\u003eSearchStrat: processNaturalLanguageQuery()\n        SearchStrat-\u003e\u003eSearchStrat: delegateToSemanticSearch()\n    end\n    \n    SearchStrat--\u003e\u003eSearchSvc: Return search results\n    SearchSvc-\u003e\u003eSearchSvc: rankAndMergeResults()\n    SearchSvc--\u003e\u003eCLI: Return formatted results\n    CLI-\u003e\u003eUser: Display search results\n    \n    Note over User,Observer: Status Monitoring Flow\n    User-\u003e\u003eCLI: Select \"View Status\"\n    CLI-\u003e\u003eIndexSvc: getIndexingStatus()\n    IndexSvc-\u003e\u003eIndexSvc: calculateCurrentMetrics()\n    IndexSvc-\u003e\u003eQdrantSvc: getCollectionInfo()\n    QdrantSvc--\u003e\u003eIndexSvc: Return collection stats\n    IndexSvc--\u003e\u003eCLI: Return status information\n    CLI-\u003e\u003eCLI: formatStatusDisplay()\n    CLI-\u003e\u003eUser: Display detailed status\n```\n\n### **Error Handling and Recovery Flow**\n\n```mermaid\nsequenceDiagram\n    participant CLI as SearchCLI\n    participant IndexSvc as FileIndexingService\n    participant QdrantSvc as QdrantDocumentService\n    participant Cache as FileCacheRepository\n    participant Observer as StatusObserver\n\n    Note over CLI,Observer: Error Scenarios and Recovery\n    \n    CLI-\u003e\u003eIndexSvc: indexDirectory(invalidPath)\n    IndexSvc-\u003e\u003eIndexSvc: validateDirectory(path)\n    alt Directory doesn't exist\n        IndexSvc--\u003e\u003eCLI: DirectoryNotFoundException\n        CLI-\u003e\u003eCLI: handleDirectoryError()\n        CLI-\u003e\u003eCLI: showErrorMessage()\n        CLI-\u003e\u003eCLI: promptForValidPath()\n    end\n    \n    IndexSvc-\u003e\u003eQdrantSvc: 
processFile(corruptedFile)\n    QdrantSvc-\u003e\u003eQdrantSvc: extractText(file)\n    alt File processing fails\n        QdrantSvc--\u003e\u003eIndexSvc: FileProcessingException\n        IndexSvc-\u003e\u003eIndexSvc: incrementFailedCount()\n        IndexSvc-\u003e\u003eCache: markFileAsFailed(file)\n        IndexSvc-\u003e\u003eObserver: onStatusUpdate(statusWithError)\n        Observer-\u003e\u003eCLI: displayErrorInStatus()\n    end\n    \n    CLI-\u003e\u003eQdrantSvc: search(query)\n    QdrantSvc-\u003e\u003eQdrantSvc: performVectorSearch()\n    alt Qdrant collection not found\n        QdrantSvc--\u003e\u003eCLI: QdrantException(\"Collection not found\")\n        CLI-\u003e\u003eCLI: handleQdrantError()\n        CLI-\u003e\u003eCLI: showNoIndexMessage()\n        CLI-\u003e\u003eCLI: promptForIndexing()\n    end\n    \n    IndexSvc-\u003e\u003eCache: loadCache()\n    Cache-\u003e\u003eCache: readCacheFile()\n    alt Cache file corrupted\n        Cache--\u003e\u003eIndexSvc: CacheCorruptedException\n        IndexSvc-\u003e\u003eCache: rebuildCache()\n        IndexSvc-\u003e\u003eObserver: onStatusUpdate(rebuildingStatus)\n        Observer-\u003e\u003eCLI: showCacheRebuildMessage()\n    end\n```\n\n## 🏗️ Deployment Diagrams\n\n### **System Architecture and Component Deployment**\n\n```mermaid\ngraph TB\n    subgraph \"User Environment 💻\"\n        USER[👤 Developer]\n        TERMINAL[🖥️ Terminal/CLI]\n        IDE[💻 IDE/VS Code]\n        WORKSPACE[📁 Code Workspace]\n    end\n\n    subgraph \"Local Machine 🖥️\"\n        subgraph \"Spring Boot Application 🚀\"\n            CLI_APP[🎛️ SearchCLI\u003cbr/\u003eInteractive Menu]\n            SPRING_BOOT[⚙️ Spring Boot Container\u003cbr/\u003ePort: 8080]\n            \n            subgraph \"Service Layer 🔧\"\n                INDEX_SVC[📚 FileIndexingService\u003cbr/\u003eVirtual Thread Pool]\n                SEARCH_SVC[🔍 HybridSearchService\u003cbr/\u003eMulti-Strategy Search]\n                CACHE_SVC[💾 
FileCacheRepository\u003cbr/\u003eLocal File Cache]\n            end\n            \n            subgraph \"Configuration 📋\"\n                CONFIG[⚙️ IndexingConfiguration\u003cbr/\u003eapplication.properties]\n                ENV_CONFIG[🌍 Environment Variables\u003cbr/\u003e.env file]\n            end\n        end\n        \n        subgraph \"Ollama AI Platform 🤖\"\n            OLLAMA_SERVER[🤖 Ollama Server\u003cbr/\u003ePort: 11434]\n            EMBEDDING_MODEL[📊 nomic-embed-text\u003cbr/\u003e768D Embeddings]\n            CHAT_MODEL[💬 codellama:7b\u003cbr/\u003eNatural Language]\n        end\n        \n        subgraph \"Local Storage 💾\"\n            FILE_CACHE[📄 File Cache\u003cbr/\u003e.indexer-cache.json]\n            LOG_FILES[📝 Application Logs\u003cbr/\u003elogs/]\n            TEMP_FILES[🗂️ Temporary Files\u003cbr/\u003etemp/]\n        end\n    end\n\n    subgraph \"Cloud Infrastructure ☁️\"\n        subgraph \"Qdrant Cloud 🌐\"\n            QDRANT_CLUSTER[🗄️ Qdrant Vector DB\u003cbr/\u003eCloud Cluster]\n            VECTOR_STORE[📊 Vector Collections\u003cbr/\u003e768D Embeddings]\n            METADATA_STORE[📋 Document Metadata\u003cbr/\u003eFile Paths \u0026 Content]\n        end\n    end\n\n    subgraph \"External Resources 🌍\"\n        GITHUB[📚 GitHub Repositories\u003cbr/\u003eSource Code]\n        DOCS[📖 Documentation\u003cbr/\u003eMarkdown/Text Files]\n        CONFIG_FILES[⚙️ Configuration Files\u003cbr/\u003eYAML/Properties/JSON]\n    end\n\n    %% User Interactions\n    USER --\u003e TERMINAL\n    USER --\u003e IDE\n    TERMINAL --\u003e CLI_APP\n    IDE --\u003e WORKSPACE\n\n    %% Application Flow\n    CLI_APP --\u003e INDEX_SVC\n    CLI_APP --\u003e SEARCH_SVC\n    INDEX_SVC --\u003e CACHE_SVC\n    SEARCH_SVC --\u003e INDEX_SVC\n    \n    %% Configuration\n    SPRING_BOOT --\u003e CONFIG\n    CONFIG --\u003e ENV_CONFIG\n    \n    %% AI Model Integration\n    INDEX_SVC -.-\u003e|HTTP/REST| OLLAMA_SERVER\n    SEARCH_SVC -.-\u003e|HTTP/REST| 
OLLAMA_SERVER\n    OLLAMA_SERVER --\u003e EMBEDDING_MODEL\n    OLLAMA_SERVER --\u003e CHAT_MODEL\n    \n    %% Vector Database\n    INDEX_SVC -.-\u003e|HTTPS/gRPC| QDRANT_CLUSTER\n    SEARCH_SVC -.-\u003e|HTTPS/gRPC| QDRANT_CLUSTER\n    QDRANT_CLUSTER --\u003e VECTOR_STORE\n    QDRANT_CLUSTER --\u003e METADATA_STORE\n    \n    %% Local Storage\n    CACHE_SVC --\u003e FILE_CACHE\n    SPRING_BOOT --\u003e LOG_FILES\n    INDEX_SVC --\u003e TEMP_FILES\n    \n    %% Data Sources\n    WORKSPACE --\u003e GITHUB\n    WORKSPACE --\u003e DOCS\n    WORKSPACE --\u003e CONFIG_FILES\n    INDEX_SVC --\u003e WORKSPACE\n\n    %% Styling\n    style USER fill:#e1f5fe\n    style CLI_APP fill:#f3e5f5\n    style SPRING_BOOT fill:#e8f5e8\n    style OLLAMA_SERVER fill:#fff3e0\n    style QDRANT_CLUSTER fill:#fce4ec\n    style WORKSPACE fill:#f1f8e9\n```\n\n### **Network Communication and Data Flow**\n\n```mermaid\ngraph LR\n    subgraph \"Local Development Environment\"\n        subgraph \"Spring Boot Application:8080\"\n            CLI[🎛️ CLI Interface]\n            APP[🚀 Spring Boot App]\n            CACHE[💾 Local Cache]\n        end\n        \n        subgraph \"Ollama AI:11434\"\n            OLLAMA[🤖 Ollama API]\n            MODELS[📊 AI Models]\n        end\n    end\n    \n    subgraph \"Cloud Services\"\n        QDRANT[☁️ Qdrant Cloud\u003cbr/\u003e443/HTTPS]\n        CDN[🌐 Model CDN\u003cbr/\u003eOllama Registry]\n    end\n    \n    subgraph \"File System\"\n        WORKSPACE[📁 Code Workspace]\n        CACHE_FILE[📄 .indexer-cache.json]\n        LOGS[📝 Application Logs]\n    end\n\n    %% API Communications\n    CLI -.-\u003e|REST API| APP\n    APP -.-\u003e|HTTP POST\u003cbr/\u003eEmbeddings| OLLAMA\n    APP -.-\u003e|HTTPS\u003cbr/\u003eVector Ops| QDRANT\n    \n    %% Data Persistence\n    APP --\u003e CACHE_FILE\n    APP --\u003e LOGS\n    APP --\u003e WORKSPACE\n    CACHE --\u003e CACHE_FILE\n    \n    %% Model Management\n    OLLAMA -.-\u003e|Model Download\u003cbr/\u003eHTTPS| 
CDN\n    \n    %% Data Flow Labels\n    APP -.-\u003e|\"📊 Store Vectors\u003cbr/\u003e📋 Query Metadata\"| QDRANT\n    OLLAMA -.-\u003e|\"🔄 768D Embeddings\u003cbr/\u003e💬 Chat Responses\"| APP\n    WORKSPACE -.-\u003e|\"📄 File Content\u003cbr/\u003e📂 Directory Scan\"| APP\n```\n\n### **Deployment Architecture by Environment**\n\n```mermaid\ngraph TB\n    subgraph \"Development Environment 🛠️\"\n        subgraph \"Developer Workstation\"\n            DEV_IDE[💻 IDE/Terminal]\n            DEV_SPRING[🚀 Spring Boot Dev]\n            DEV_OLLAMA[🤖 Ollama Local]\n            DEV_CACHE[💾 Local Cache]\n        end\n        \n        DEV_IDE --\u003e DEV_SPRING\n        DEV_SPRING --\u003e DEV_OLLAMA\n        DEV_SPRING --\u003e DEV_CACHE\n        DEV_SPRING -.-\u003e|HTTPS| QDRANT_DEV[☁️ Qdrant Dev Cluster]\n    end\n    \n    subgraph \"Production Environment 🚀\"\n        subgraph \"Production Server\"\n            PROD_CLI[🎛️ Production CLI]\n            PROD_SPRING[⚙️ Spring Boot Prod]\n            PROD_OLLAMA[🤖 Ollama Server]\n            PROD_CACHE[💾 Persistent Cache]\n            PROD_LOGS[📝 Centralized Logs]\n        end\n        \n        PROD_CLI --\u003e PROD_SPRING\n        PROD_SPRING --\u003e PROD_OLLAMA\n        PROD_SPRING --\u003e PROD_CACHE\n        PROD_SPRING --\u003e PROD_LOGS\n        PROD_SPRING -.-\u003e|HTTPS| QDRANT_PROD[☁️ Qdrant Prod Cluster]\n    end\n    \n    subgraph \"CI/CD Environment 🔄\"\n        subgraph \"Build Pipeline\"\n            CI_BUILD[🔨 Maven Build]\n            CI_TEST[🧪 Unit Tests]\n            CI_PACKAGE[📦 JAR Package]\n            CI_DEPLOY[🚀 Deployment]\n        end\n        \n        CI_BUILD --\u003e CI_TEST\n        CI_TEST --\u003e CI_PACKAGE\n        CI_PACKAGE --\u003e CI_DEPLOY\n        CI_DEPLOY -.-\u003e PROD_SPRING\n    end\n    \n    subgraph \"Monitoring \u0026 Observability 📊\"\n        METRICS[📈 Application Metrics]\n        HEALTH[💚 Health Checks]\n        ALERTS[🚨 Alert System]\n        \n        PROD_SPRING 
--\u003e METRICS\n        PROD_SPRING --\u003e HEALTH\n        HEALTH --\u003e ALERTS\n    end\n    \n    %% Environment Connections\n    DEV_SPRING -.-\u003e|\"Promote to Prod\"| CI_BUILD\n    METRICS -.-\u003e|\"Feedback\"| DEV_SPRING\n```\n\n### **Security and Access Control**\n\n```mermaid\ngraph TB\n    subgraph \"Security Layers 🔒\"\n        subgraph \"Authentication \u0026 Authorization\"\n            ENV_VARS[🔐 Environment Variables\u003cbr/\u003eAPI Keys \u0026 Secrets]\n            API_KEYS[🗝️ Qdrant API Key\u003cbr/\u003eEncrypted Storage]\n            SSL_CERTS[📜 SSL Certificates\u003cbr/\u003eHTTPS/TLS 1.3]\n        end\n        \n        subgraph \"Network Security\"\n            FIREWALL[🛡️ Local Firewall\u003cbr/\u003ePort Restrictions]\n            VPN[🌐 VPN Connection\u003cbr/\u003eSecure Tunneling]\n            RATE_LIMIT[⏱️ Rate Limiting\u003cbr/\u003eAPI Call Throttling]\n        end\n        \n        subgraph \"Data Protection\"\n            ENCRYPTION[🔒 Data Encryption\u003cbr/\u003eAt Rest \u0026 In Transit]\n            BACKUP[💾 Encrypted Backups\u003cbr/\u003eCache \u0026 Logs]\n            AUDIT[📋 Audit Logging\u003cbr/\u003eAccess Tracking]\n        end\n    end\n    \n    subgraph \"Application Security\"\n        INPUT_VALID[✅ Input Validation\u003cbr/\u003eSearch Queries]\n        ERROR_HANDLE[🚫 Error Handling\u003cbr/\u003eNo Data Leakage]\n        SECURE_CONFIG[⚙️ Secure Configuration\u003cbr/\u003eDefault Deny]\n    end\n    \n    %% Security Flow\n    ENV_VARS --\u003e API_KEYS\n    API_KEYS --\u003e SSL_CERTS\n    SSL_CERTS --\u003e ENCRYPTION\n    \n    FIREWALL --\u003e VPN\n    VPN --\u003e RATE_LIMIT\n    \n    ENCRYPTION --\u003e BACKUP\n    BACKUP --\u003e AUDIT\n    \n    INPUT_VALID --\u003e ERROR_HANDLE\n    ERROR_HANDLE --\u003e SECURE_CONFIG\n    \n    %% Cross-cutting Security\n    ENV_VARS -.-\u003e INPUT_VALID\n    RATE_LIMIT -.-\u003e ERROR_HANDLE\n    AUDIT -.-\u003e SECURE_CONFIG\n```\n\n### **Scalability and 
Performance Architecture**\n\n```mermaid\ngraph TB\n    subgraph \"Performance Optimization 🚀\"\n        subgraph \"Concurrent Processing\"\n            VIRTUAL_THREADS[🧵 Virtual Threads\u003cbr/\u003eJDK 21 Fibers]\n            THREAD_POOL[🏊‍♂️ Thread Pool\u003cbr/\u003eConfigurable Size]\n            ASYNC_PROC[⚡ Async Processing\u003cbr/\u003eNon-blocking I/O]\n        end\n        \n        subgraph \"Caching Strategy\"\n            L1_CACHE[💾 L1: Memory Cache\u003cbr/\u003eHot Data]\n            L2_CACHE[📄 L2: File Cache\u003cbr/\u003ePersistent Storage]\n            L3_CACHE[☁️ L3: Vector Cache\u003cbr/\u003eQdrant Optimization]\n        end\n        \n        subgraph \"Resource Management\"\n            MEMORY_OPT[🧠 Memory Optimization\u003cbr/\u003eJVM Tuning]\n            DISK_OPT[💿 Disk Optimization\u003cbr/\u003eSequential I/O]\n            NETWORK_OPT[🌐 Network Optimization\u003cbr/\u003eConnection Pooling]\n        end\n    end\n    \n    subgraph \"Scaling Capabilities 📈\"\n        subgraph \"Horizontal Scaling\"\n            LOAD_BALANCE[⚖️ Load Balancing\u003cbr/\u003eMultiple Instances]\n            DISTRIBUTED[🌍 Distributed Processing\u003cbr/\u003eCluster Mode]\n            QUEUE[📋 Job Queuing\u003cbr/\u003eBackground Tasks]\n        end\n        \n        subgraph \"Vertical Scaling\"\n            CPU_SCALE[⚡ CPU Scaling\u003cbr/\u003eMulti-core Usage]\n            RAM_SCALE[🧠 Memory Scaling\u003cbr/\u003eHeap Optimization]\n            STORAGE_SCALE[💾 Storage Scaling\u003cbr/\u003eSSD Performance]\n        end\n    end\n    \n    %% Performance Connections\n    VIRTUAL_THREADS --\u003e ASYNC_PROC\n    ASYNC_PROC --\u003e THREAD_POOL\n    \n    L1_CACHE --\u003e L2_CACHE\n    L2_CACHE --\u003e L3_CACHE\n    \n    MEMORY_OPT --\u003e DISK_OPT\n    DISK_OPT --\u003e NETWORK_OPT\n    \n    %% Scaling Connections\n    LOAD_BALANCE --\u003e DISTRIBUTED\n    DISTRIBUTED --\u003e QUEUE\n    \n    CPU_SCALE --\u003e RAM_SCALE\n    RAM_SCALE --\u003e 
STORAGE_SCALE\n    \n    %% Cross-cutting Optimizations\n    VIRTUAL_THREADS -.-\u003e CPU_SCALE\n    L1_CACHE -.-\u003e RAM_SCALE\n    NETWORK_OPT -.-\u003e DISTRIBUTED\n```\n\nThese deployment diagrams provide a comprehensive view of:\n\n1. **System Architecture**: Complete component deployment across user environment, local machine, and cloud infrastructure\n2. **Network Communication**: Data flow and API communications between services\n3. **Multi-Environment Support**: Development, production, and CI/CD pipeline architectures\n4. **Security Architecture**: Comprehensive security layers and access controls\n5. **Performance \u0026 Scalability**: Optimization strategies and scaling capabilities\n\nThe diagrams show how the Misoto Codebase Indexer integrates with:\n- **Local Development Tools**: IDEs, terminals, and file systems\n- **AI Platforms**: Ollama for embeddings and natural language processing\n- **Cloud Services**: Qdrant Cloud for vector storage and search\n- **Infrastructure**: Security, monitoring, and deployment pipelines\n\n## 👥 Use Case Diagrams\n\n### **Primary Use Cases and Actor Interactions**\n\n```mermaid\ngraph TB\n    subgraph \"Misoto Codebase Indexer System\"\n        subgraph \"Search Use Cases 🔍\"\n            UC1[Search Code with Natural Language]\n            UC2[Perform Semantic Code Search]\n            UC3[Execute Text-based Search]\n            UC4[Advanced Multi-filter Search]\n            UC5[Browse Search Results]\n            UC6[Export Search Results]\n        end\n          subgraph \"Indexing Use Cases 📚\"\n            UC7[Index Codebase Directory]\n            UC8[Monitor Indexing Progress]\n            UC9[Configure Indexing Settings]\n            UC10[Manage File Cache]\n            UC11[Handle Indexing Errors]\n            UC12[Validate File Types]\n        end\n          subgraph \"Configuration Use Cases ⚙️\"\n            UC13[Setup AI Models - Ollama]\n            UC14[Configure Vector Database - Qdrant]\n            
UC15[Manage Environment Variables]\n            UC16[Customize File Priorities]\n            UC17[Set Performance Parameters]\n        end\n        \n        subgraph \"Monitoring Use Cases 📊\"\n            UC18[View System Status]\n            UC19[Track Performance Metrics]\n            UC20[Monitor Resource Usage]\n            UC21[Handle System Errors]\n            UC22[Generate Status Reports]\n        end\n        \n        subgraph \"Management Use Cases 🔧\"\n            UC23[Clear System Cache]\n            UC24[Restart Indexing Process]\n            UC25[Change Target Directory]\n            UC26[Backup/Restore Index Data]\n            UC27[Update System Configuration]\n        end\n    end\n    \n    subgraph \"External Systems 🌐\"\n        EXT1[Ollama AI Platform]\n        EXT2[Qdrant Cloud Service]\n        EXT3[File System]\n        EXT4[Git Repositories]\n        EXT5[IDE Integration]\n    end\n    \n    subgraph \"Actors 👥\"\n        DEV[👨‍💻 Software Developer]\n        ADMIN[👨‍🔧 System Administrator]\n        ANALYST[👨‍💼 Code Analyst]\n        RESEARCHER[👩‍🔬 Researcher]\n        TEAM_LEAD[👨‍💼 Team Lead]\n    end\n    \n    %% Developer Use Cases\n    DEV --\u003e UC1\n    DEV --\u003e UC2\n    DEV --\u003e UC3\n    DEV --\u003e UC4\n    DEV --\u003e UC5\n    DEV --\u003e UC7\n    DEV --\u003e UC8\n    DEV --\u003e UC25\n    \n    %% System Administrator Use Cases\n    ADMIN --\u003e UC9\n    ADMIN --\u003e UC10\n    ADMIN --\u003e UC13\n    ADMIN --\u003e UC14\n    ADMIN --\u003e UC15\n    ADMIN --\u003e UC16\n    ADMIN --\u003e UC17\n    ADMIN --\u003e UC23\n    ADMIN --\u003e UC24\n    ADMIN --\u003e UC26\n    ADMIN --\u003e UC27\n    \n    %% Code Analyst Use Cases\n    ANALYST --\u003e UC1\n    ANALYST --\u003e UC2\n    ANALYST --\u003e UC4\n    ANALYST --\u003e UC6\n    ANALYST --\u003e UC18\n    ANALYST --\u003e UC22\n    \n    %% Researcher Use Cases\n    RESEARCHER --\u003e UC2\n    RESEARCHER --\u003e UC4\n    RESEARCHER --\u003e UC6\n    
RESEARCHER --\u003e UC19\n    RESEARCHER --\u003e UC22\n    \n    %% Team Lead Use Cases\n    TEAM_LEAD --\u003e UC18\n    TEAM_LEAD --\u003e UC19\n    TEAM_LEAD --\u003e UC20\n    TEAM_LEAD --\u003e UC22\n    TEAM_LEAD --\u003e UC27\n    \n    %% System Dependencies\n    UC1 -.-\u003e EXT1\n    UC2 -.-\u003e EXT1\n    UC2 -.-\u003e EXT2\n    UC7 -.-\u003e EXT3\n    UC7 -.-\u003e EXT4\n    UC13 -.-\u003e EXT1\n    UC14 -.-\u003e EXT2\n    UC25 -.-\u003e EXT3\n    UC5 -.-\u003e EXT5\n    \n    %% Use Case Relationships\n    UC7 --\u003e UC8\n    UC8 --\u003e UC11\n    UC9 --\u003e UC16\n    UC9 --\u003e UC17\n    UC13 --\u003e UC14\n    UC18 --\u003e UC19\n    UC19 --\u003e UC20\n    UC23 --\u003e UC24\n    \n    %% Styling\n    style DEV fill:#e1f5fe\n    style ADMIN fill:#f3e5f5\n    style ANALYST fill:#e8f5e8\n    style RESEARCHER fill:#fff3e0\n    style TEAM_LEAD fill:#fce4ec\n```\n\n### **Detailed Use Case Scenarios**\n\n```mermaid\ngraph LR\n    subgraph \"Search Workflow 🔍\"\n        subgraph \"Natural Language Search\"\n            NL1[Enter Query: Find authentication logic]\n            NL2[AI Processing: Query Understanding]\n            NL3[Context Generation: Search Terms]\n            NL4[Vector Search: Semantic Matching]\n            NL5[Results Ranking: Relevance Scoring]\n            NL6[Display Results: Code Snippets]\n            \n            NL1 --\u003e NL2 --\u003e NL3 --\u003e NL4 --\u003e NL5 --\u003e NL6\n        end\n        \n        subgraph \"Semantic Search\"\n            SEM1[Enter Technical Query: repository pattern]\n            SEM2[Set Similarity: Threshold 0.7]\n            SEM3[Generate Embeddings: 768D Vectors]\n            SEM4[Vector Similarity: Search in Qdrant]\n            SEM5[Filter Results: By Similarity Score]\n            SEM6[Present Matches: With Context]\n            \n            SEM1 --\u003e SEM2 --\u003e SEM3 --\u003e SEM4 --\u003e SEM5 --\u003e SEM6\n        end\n        \n        subgraph \"Text Search\"\n     
       TXT1[Enter Keywords: @RestController]\n            TXT2[Configure Options: Case Sensitivity]\n            TXT3[Scan Files: Pattern Matching]\n            TXT4[Collect Matches: Line Numbers]\n            TXT5[Format Output: File Locations]\n            \n            TXT1 --\u003e TXT2 --\u003e TXT3 --\u003e TXT4 --\u003e TXT5\n        end\n    end\n    \n    subgraph \"Indexing Workflow 📚\"\n        subgraph \"Initial Indexing\"\n            IDX1[Select Directory: Choose Codebase]\n            IDX2[Scan Structure: File Discovery]\n            IDX3[Validate Files: Extension Check]\n            IDX4[Priority Sorting: Critical Files First]\n            IDX5[Batch Processing: Virtual Threads]\n            IDX6[Vector Generation: Embedding Creation]\n            IDX7[Store Vectors: Qdrant Upload]\n            IDX8[Update Cache: File Tracking]\n            \n            IDX1 --\u003e IDX2 --\u003e IDX3 --\u003e IDX4 --\u003e IDX5 --\u003e IDX6 --\u003e IDX7 --\u003e IDX8\n        end\n        \n        subgraph \"Incremental Indexing\"\n            INC1[Monitor Changes: File Modification]\n            INC2[Check Cache: Comparison]\n            INC3[Queue Updates: Modified Files]\n            INC4[Background Process: Non-blocking]\n            INC5[Merge Vectors: Update Collection]\n            \n            INC1 --\u003e INC2 --\u003e INC3 --\u003e INC4 --\u003e INC5\n        end\n    end\n    \n    subgraph \"Configuration Workflow ⚙️\"\n        subgraph \"System Setup\"\n            CFG1[Install Ollama: AI Platform]\n            CFG2[Download Models: nomic-embed-text]\n            CFG3[Setup Qdrant: Cloud Account]\n            CFG4[Configure API: Keys \u0026 URLs]\n            CFG5[Test Connection: Verify Setup]\n            \n            CFG1 --\u003e CFG2 --\u003e CFG3 --\u003e CFG4 --\u003e CFG5\n        end\n        \n        subgraph \"Performance Tuning\"\n            PERF1[Set Thread Pool: Virtual Threads]\n            PERF2[Configure Cache: Size \u0026 
Location]\n            PERF3[Adjust Batch Size: Processing Groups]\n            PERF4[Set Timeouts: Network Calls]\n            PERF5[Monitor Metrics: Performance Check]\n            \n            PERF1 --\u003e PERF2 --\u003e PERF3 --\u003e PERF4 --\u003e PERF5\n        end\n    end\n```\n\n### **Actor Responsibilities and Permissions**\n\n```mermaid\ngraph TB\n    subgraph \"Role-Based Access Control 🔐\"\n        subgraph \"Software Developer 👨‍💻\"\n            DEV_PERM[Permissions:]\n            DEV_P1[• Search all indexed code]\n            DEV_P2[• View search results]\n            DEV_P3[• Index personal projects]\n            DEV_P4[• Monitor indexing status]\n            DEV_P5[• Change target directories]\n            DEV_P6[• Export search results]\n            \n            DEV_PERM --\u003e DEV_P1\n            DEV_PERM --\u003e DEV_P2\n            DEV_PERM --\u003e DEV_P3\n            DEV_PERM --\u003e DEV_P4\n            DEV_PERM --\u003e DEV_P5\n            DEV_PERM --\u003e DEV_P6\n        end\n        \n        subgraph \"System Administrator 👨‍🔧\"\n            ADMIN_PERM[Permissions:]\n            ADMIN_P1[• Full system configuration]\n            ADMIN_P2[• Manage AI model setup]\n            ADMIN_P3[• Configure Qdrant connection]\n            ADMIN_P4[• Set performance parameters]\n            ADMIN_P5[• Clear system cache]\n            ADMIN_P6[• Backup/restore data]\n            ADMIN_P7[• Monitor system health]\n            ADMIN_P8[• Manage user access]\n            \n            ADMIN_PERM --\u003e ADMIN_P1\n            ADMIN_PERM --\u003e ADMIN_P2\n            ADMIN_PERM --\u003e ADMIN_P3\n            ADMIN_PERM --\u003e ADMIN_P4\n            ADMIN_PERM --\u003e ADMIN_P5\n            ADMIN_PERM --\u003e ADMIN_P6\n            ADMIN_PERM --\u003e ADMIN_P7\n            ADMIN_PERM --\u003e ADMIN_P8\n        end\n        \n        subgraph \"Code Analyst 👨‍💼\"\n            ANALYST_PERM[Permissions:]\n            ANALYST_P1[• Advanced search 
features]\n            ANALYST_P2[• Generate analysis reports]\n            ANALYST_P3[• Export detailed results]\n            ANALYST_P4[• Access metrics dashboard]\n            ANALYST_P5[• Configure search filters]\n            ANALYST_P6[• View system statistics]\n            \n            ANALYST_PERM --\u003e ANALYST_P1\n            ANALYST_PERM --\u003e ANALYST_P2\n            ANALYST_PERM --\u003e ANALYST_P3\n            ANALYST_PERM --\u003e ANALYST_P4\n            ANALYST_PERM --\u003e ANALYST_P5\n            ANALYST_PERM --\u003e ANALYST_P6\n        end\n        \n        subgraph \"Researcher 👩‍🔬\"\n            RESEARCHER_PERM[Permissions:]\n            RESEARCHER_P1[• Semantic search access]\n            RESEARCHER_P2[• Pattern analysis tools]\n            RESEARCHER_P3[• Research data export]\n            RESEARCHER_P4[• Custom query building]\n            RESEARCHER_P5[• Similarity threshold tuning]\n            \n            RESEARCHER_PERM --\u003e RESEARCHER_P1\n            RESEARCHER_PERM --\u003e RESEARCHER_P2\n            RESEARCHER_PERM --\u003e RESEARCHER_P3\n            RESEARCHER_PERM --\u003e RESEARCHER_P4\n            RESEARCHER_PERM --\u003e RESEARCHER_P5\n        end\n        \n        subgraph \"Team Lead 👨‍💼\"\n            LEAD_PERM[Permissions:]\n            LEAD_P1[• Team usage monitoring]\n            LEAD_P2[• Performance oversight]\n            LEAD_P3[• Resource planning]\n            LEAD_P4[• Usage reports generation]\n            LEAD_P5[• Configuration approval]\n            \n            LEAD_PERM --\u003e LEAD_P1\n            LEAD_PERM --\u003e LEAD_P2\n            LEAD_PERM --\u003e LEAD_P3\n            LEAD_PERM --\u003e LEAD_P4\n            LEAD_PERM --\u003e LEAD_P5\n        end\n    end\n    \n    subgraph \"Common Use Cases 🔄\"\n        COMMON[All Users Can:]\n        COMMON_P1[• View help documentation]\n        COMMON_P2[• Access basic search]\n        COMMON_P3[• See indexing status]\n        COMMON_P4[• Use 
interactive CLI]\n        \n        COMMON --\u003e COMMON_P1\n        COMMON --\u003e COMMON_P2\n        COMMON --\u003e COMMON_P3\n        COMMON --\u003e COMMON_P4\n    end\n```\n\n### **System Integration Use Cases**\n\n```mermaid\ngraph TB\n    subgraph \"External System Integrations 🔌\"\n        subgraph \"Ollama AI Integration\"\n            OLL1[Install Ollama Platform]\n            OLL2[Download AI Models]\n            OLL3[Start Ollama Service]\n            OLL4[Generate Embeddings]\n            OLL5[Process Natural Language]\n            OLL6[Monitor AI Performance]\n            \n            OLL1 --\u003e OLL2 --\u003e OLL3\n            OLL3 --\u003e OLL4\n            OLL3 --\u003e OLL5\n            OLL4 --\u003e OLL6\n            OLL5 --\u003e OLL6\n        end\n        \n        subgraph \"Qdrant Cloud Integration\"\n            QDR1[Create Qdrant Account]\n            QDR2[Setup Cloud Cluster]\n            QDR3[Configure API Access]\n            QDR4[Initialize Collections]\n            QDR5[Store Vector Data]\n            QDR6[Perform Vector Search]\n            QDR7[Manage Collection Metadata]\n            \n            QDR1 --\u003e QDR2 --\u003e QDR3 --\u003e QDR4\n            QDR4 --\u003e QDR5\n            QDR4 --\u003e QDR6\n            QDR5 --\u003e QDR7\n            QDR6 --\u003e QDR7\n        end\n        \n        subgraph \"File System Integration\"\n            FS1[Access Local Directories]\n            FS2[Read Source Code Files]\n            FS3[Monitor File Changes]\n            FS4[Cache File Metadata]\n            FS5[Handle File Permissions]\n            FS6[Manage Temporary Files]\n            \n            FS1 --\u003e FS2 --\u003e FS3\n            FS2 --\u003e FS4\n            FS3 --\u003e FS4\n            FS1 --\u003e FS5\n            FS2 --\u003e FS6\n        end\n        \n        subgraph \"IDE Integration\"\n            IDE1[VS Code Extension]\n            IDE2[IntelliJ Plugin]\n            IDE3[Search Result Display]\n     
       IDE4[Code Navigation]\n            IDE5[Context Menu Integration]\n            \n            IDE1 --\u003e IDE3 --\u003e IDE4\n            IDE2 --\u003e IDE3 --\u003e IDE4\n            IDE3 --\u003e IDE5\n        end\n    end\n    \n    subgraph \"Actor Interactions with External Systems 👥🔌\"\n        DEV_EXT[Developer] --\u003e OLL4\n        DEV_EXT --\u003e QDR6\n        DEV_EXT --\u003e FS2\n        DEV_EXT --\u003e IDE3\n        \n        ADMIN_EXT[Administrator] --\u003e OLL1\n        ADMIN_EXT --\u003e QDR2\n        ADMIN_EXT --\u003e FS5\n        \n        ANALYST_EXT[Analyst] --\u003e QDR7\n        ANALYST_EXT --\u003e FS4\n        ANALYST_EXT --\u003e IDE4\n        \n        RESEARCHER_EXT[Researcher] --\u003e OLL5\n        RESEARCHER_EXT --\u003e QDR6\n        \n        LEAD_EXT[Team Lead] --\u003e OLL6\n        LEAD_EXT --\u003e QDR7\n    end\n```\n\n### **Error Handling and Recovery Use Cases**\n\n```mermaid\ngraph LR\n    subgraph \"Error Scenarios and Recovery 🚨\"\n        subgraph \"Indexing Errors\"\n            ERR1[File Access Denied]\n            ERR2[Corrupted File Content]\n            ERR3[Network Connection Lost]\n            ERR4[Qdrant Service Unavailable]\n            ERR5[Ollama Model Not Found]\n            ERR6[Insufficient Disk Space]\n            \n            REC1[Retry with Permissions]\n            REC2[Skip and Log Error]\n            REC3[Queue for Retry]\n            REC4[Switch to Offline Mode]\n            REC5[Download Missing Model]\n            REC6[Clean Temporary Files]\n            \n            ERR1 --\u003e REC1\n            ERR2 --\u003e REC2\n            ERR3 --\u003e REC3\n            ERR4 --\u003e REC4\n            ERR5 --\u003e REC5\n            ERR6 --\u003e REC6\n        end\n        \n        subgraph \"Search Errors\"\n            SERR1[No Results Found]\n            SERR2[Search Timeout]\n            SERR3[Invalid Query Syntax]\n            SERR4[Collection Not Initialized]\n            SERR5[AI Model 
Overloaded]\n            \n            SREC1[Suggest Alternative Queries]\n            SREC2[Extend Timeout Period]\n            SREC3[Provide Query Examples]\n            SREC4[Initialize Collection]\n            SREC5[Queue Request for Retry]\n            \n            SERR1 --\u003e SREC1\n            SERR2 --\u003e SREC2\n            SERR3 --\u003e SREC3\n            SERR4 --\u003e SREC4\n            SERR5 --\u003e SREC5\n        end\n        \n        subgraph \"System Errors\"\n            SYSERR1[Configuration Missing]\n            SYSERR2[Cache Corruption]\n            SYSERR3[Memory Overflow]\n            SYSERR4[Thread Pool Exhaustion]\n            \n            SYSREC1[Load Default Config]\n            SYSREC2[Rebuild Cache]\n            SYSREC3[Restart with More Memory]\n            SYSREC4[Scale Thread Pool]\n            \n            SYSERR1 --\u003e SYSREC1\n            SYSERR2 --\u003e SYSREC2\n            SYSERR3 --\u003e SYSREC3\n            SYSERR4 --\u003e SYSREC4\n        end\n    end\n```\n\nThe codebase has been refactored to implement several design patterns for better maintainability and extensibility:\n\n- **Command Pattern**: For encapsulating indexing status commands\n- **Strategy Pattern**: To define and switch between search algorithms\n- **Observer Pattern**: For notifying status updates to the CLI\n- **Factory Pattern**: To create document instances for indexing\n- **Repository Pattern**: For abstracting file cache operations\n- **Dependency Injection**: Using Spring's @Autowired for service dependencies\n\n### **Command Pattern Example**\n\n```java\n// Command.java - Command interface\npublic interface Command {\n    void execute();\n}\n\n// IndexingStatusCommand.java - Concrete command\npublic class IndexingStatusCommand implements Command {\n    private final IndexingService indexingService;\n\n    public IndexingStatusCommand(IndexingService indexingService) {\n        this.indexingService = indexingService;\n    }\n\n    
@Override\n    public void execute() {\n        indexingService.displayStatus();\n    }\n}\n```\n\n### **Strategy Pattern Example**\n\n```java\nimport java.util.List;\n\n// SearchStrategy.java - Strategy interface\npublic interface SearchStrategy {\n    List\u003cSearchResult\u003e search(String query, double threshold);\n}\n\n// SemanticSearchStrategy.java - Concrete strategy\npublic class SemanticSearchStrategy implements SearchStrategy {\n    @Override\n    public List\u003cSearchResult\u003e search(String query, double threshold) {\n        // Embedding-based semantic search goes here; this placeholder keeps the example compilable\n        throw new UnsupportedOperationException(\"Semantic search not shown in this example\");\n    }\n}\n```\n\n### **Observer Pattern Example**\n\n```java\n// IndexingStatusObserver.java - Observer interface\npublic interface IndexingStatusObserver {\n    void onStatusUpdate(IndexingStatus status);\n}\n\n// CLI.java - Concrete observer\npublic class CLI implements IndexingStatusObserver {\n    @Override\n    public void onStatusUpdate(IndexingStatus status) {\n        displayStatus(status);\n    }\n\n    // Illustrative helper (assumed; the actual rendering logic is not shown here)\n    private void displayStatus(IndexingStatus status) {\n        System.out.println(status);\n    }\n}\n```\n\n### **Factory Pattern Example**\n\n```java\nimport java.io.File;\n\n// DocumentFactory.java - Factory interface\npublic interface DocumentFactory {\n    Document createDocument(File file);\n}\n\n// TextDocumentFactory.java - Concrete factory\npublic class TextDocumentFactory implements DocumentFactory {\n    @Override\n    public Document createDocument(File file) {\n        return new TextDocument(file);\n    }\n}\n```\n\n### **Repository Pattern Example**\n\n```java\n// FileCacheRepository.java - Repository interface\npublic interface FileCacheRepository {\n    void save(FileCacheEntry entry);\n    FileCacheEntry find(String filePath);\n}\n\n// FileCacheRepositoryImpl.java - Repository implementation\npublic class FileCacheRepositoryImpl implements FileCacheRepository {\n    @Override\n    public void save(FileCacheEntry entry) {\n        // Save to cache\n    }\n\n    @Override\n    public FileCacheEntry find(String filePath) {\n        // Cache lookup goes here; this placeholder keeps the example compilable\n        throw new UnsupportedOperationException(\"Cache lookup not shown in this example\");\n    }\n}\n```\n\n### **Dependency Injection 
Example**\n\n```java\nimport java.util.List;\nimport org.springframework.beans.factory.annotation.Autowired;\nimport org.springframework.stereotype.Service;\n\n// SearchService.java - Service with dependencies\n@Service\npublic class SearchService {\n    private final FileSearchService fileSearchService;\n    private final HybridSearchService hybridSearchService;\n\n    @Autowired\n    public SearchService(FileSearchService fileSearchService, HybridSearchService hybridSearchService) {\n        this.fileSearchService = fileSearchService;\n        this.hybridSearchService = hybridSearchService;\n    }\n\n    public List\u003cSearchResult\u003e search(String query) {\n        // Delegation to the injected services goes here; this placeholder keeps the example compilable\n        throw new UnsupportedOperationException(\"Search delegation not shown in this example\");\n    }\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkenken64%2Fmisoto-indexer","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkenken64%2Fmisoto-indexer","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkenken64%2Fmisoto-indexer/lists"}