{"id":46077874,"url":"https://github.com/fless-lab/xfinder","last_synced_at":"2026-03-01T15:02:04.798Z","repository":{"id":323993537,"uuid":"1095515777","full_name":"fless-lab/xfinder","owner":"fless-lab","description":"A high-performance desktop search application (AI Powered)","archived":false,"fork":false,"pushed_at":"2025-11-13T07:04:53.000Z","size":230,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"master","last_synced_at":"2025-11-13T09:07:16.810Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/fless-lab.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-11-13T06:41:48.000Z","updated_at":"2025-11-13T06:53:52.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/fless-lab/xfinder","commit_stats":null,"previous_names":["fless-lab/xfinder"],"tags_count":3,"template":false,"template_full_name":null,"purl":"pkg:github/fless-lab/xfinder","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fless-lab%2Fxfinder","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fless-lab%2Fxfinder/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fless-lab%2Fxfinder/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fless-lab%2Fxfinder/manifests","owner_url":"https://repos.eco
syste.ms/api/v1/hosts/GitHub/owners/fless-lab","download_url":"https://codeload.github.com/fless-lab/xfinder/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fless-lab%2Fxfinder/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29973119,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-01T14:44:57.896Z","status":"ssl_error","status_checked_at":"2026-03-01T14:43:27.662Z","response_time":124,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-03-01T15:02:03.763Z","updated_at":"2026-03-01T15:02:04.777Z","avatar_url":"https://github.com/fless-lab.png","language":"Rust","funding_links":[],"categories":[],"sub_categories":[],"readme":"# xfinder\n\n**Advanced file search and retrieval system for Windows (will be extended to other OS) administrative environments**\n\n## Overview\n\nxfinder is a high-performance desktop search application designed for administrative users who need to locate files and information quickly across large document repositories. 
Built with Rust and native UI technologies, it provides enterprise-grade search capabilities in a lightweight package.\n\n## Key Features\n\n- **Fast Indexing**: Full-text search engine powered by Tantivy with sub-100ms query response time\n- **Real-time Monitoring**: Automatic file system watching and index updates\n- **Semantic Search**: AI-powered search understanding natural language queries\n- **Email Integration**: Unified search across Outlook PST files, Thunderbird MBOX, and IMAP accounts\n- **OCR Support**: Automatic text extraction from scanned PDFs and images (Tesseract 5)\n- **Conversational Interface**: \"Assist Me\" mode providing contextual answers with verifiable sources\n\n---\n\n## Core Capabilities\n\n### File Search\n- Instant filename search with sub-100ms response for 100k+ files\n- Fuzzy matching algorithm for typo-tolerant queries\n- Advanced filtering by extension, date, size, and directory\n- Global keyboard shortcut access (Ctrl+Shift+F)\n\n### Content Indexing\n- Full-text search across document contents (SQLite FTS5)\n- Automatic detection and indexing of scanned PDFs\n- OCR text extraction from images (JPEG, PNG, TIFF)\n- Configurable by directory and file type\n- Multi-language support (French and English priority)\n\n### Semantic Search\n- Natural language query understanding\n- Vector-based similarity search using compact embeddings (LEANN)\n- Conversational \"Assist Me\" mode with source attribution\n- 97% smaller index size compared to traditional vector databases\n\n### Email Search\n- Outlook PST/MAPI integration\n- Thunderbird MBOX parsing\n- IMAP and Exchange server support\n- Attachment indexing and search\n\n### Real-time Updates\n- File system monitoring via watchdog\n- Automatic index updates on file creation, modification, and deletion\n- Intelligent handling of file moves and renames\n- Scheduled indexing with configurable intervals\n\n---\n\n## Technology Stack\n\n| Component | Technology | Rationale 
|\n|-----------|------------|-----------|\n| **Language** | Rust | Memory safety, performance, concurrency |\n| **UI Framework** | egui | Native, lightweight, GPU-accelerated |\n| **Windowing** | winit | Cross-platform window management |\n| **Rendering** | wgpu | Hardware-accelerated graphics |\n| **Search Engine** | Tantivy | Lucene-like full-text search in Rust |\n| **Database** | SQLite with FTS5 | Embedded, ACID-compliant, full-text capable |\n| **Embeddings** | all-MiniLM-L6-v2 | Compact (80MB), multilingual, 384 dimensions |\n| **Vector Database** | LEANN | Ultra-compact indices (97% size reduction) |\n| **OCR** | Tesseract 5 | Industry standard, offline, multi-language |\n| **File Monitoring** | notify-rs | Cross-platform filesystem events |\n| **Email Parsing** | mailparse, libpff | PST and MBOX format support |\n\n**Binary Size**: ~8MB base + 110MB (OCR + ML models) = 118MB total\n\n---\n\n## Architecture\n\n```\n┌─────────────────────────────────────────────────────────┐\n│                 UI Layer (egui)                         │\n│    Search Interface | Configuration | Assist Me Mode    │\n└────────────────────┬────────────────────────────────────┘\n                     │\n┌────────────────────▼────────────────────────────────────┐\n│              Core Application (Rust)                    │\n│                                                          │\n│  File System Watchdog → Indexer → Content Extractor    │\n│  Search Engine: Tantivy + SQLite FTS5 + LEANN          │\n│  Email Parser: PST/MBOX/IMAP                            │\n└────────────────────┬────────────────────────────────────┘\n                     │\n┌────────────────────▼────────────────────────────────────┐\n│              Storage Layer                              │\n│  tantivy_index/ | metadata.db (SQLite) | vectors.leann │\n└─────────────────────────────────────────────────────────┘\n```\n\n---\n\n## Getting Started\n\n### For Developers\n\n```bash\n# Prerequisites\nrustc \u003e= 
1.70\ncargo \u003e= 1.70\n\n# Clone and build\ngit clone https://github.com/fless-lab/xfinder.git\ncd xfinder\ncargo build --release\n\n# Run tests\ncargo test\n\n# Launch application (reuses the release build from above)\ncargo run --release\n```\n\n### For End Users (Future)\n\n```text\n# Installation\nDownload xfinder-setup.msi from releases\nRun the installer and follow the prompts\n\n# First Use\n1. Launch xfinder\n2. Select directories to monitor\n3. Start indexing\n4. Search using Ctrl+Shift+F\n```\n\n---\n\n## Performance Targets\n\n| Metric | Target | Measurement |\n|--------|--------|-------------|\n| Search query (100k files) | \u003c100ms | P95 latency |\n| Indexing throughput | \u003e1000 files/min | Average on SSD |\n| OCR processing (A4 page) | \u003c5s | PaddleOCR/Tesseract standard quality |\n| Semantic search | \u003c3s | Including embedding generation |\n| Index size overhead | \u003c5% of corpus | Metadata + vectors |\n| Memory footprint (idle) | \u003c100MB | Application only |\n| Cold start time | \u003c500ms | To main window display |\n\n---\n\n## Design Decisions\n\n### Language Priority\nMulti-language support with French and English as primary targets. OCR and semantic search models are selected for optimal French performance.\n\n### Vector Database\nLEANN was selected for its 97% index size reduction compared to FAISS. Proof-of-concept validation is scheduled for Weeks 13-14.\n\n### Email Parsing Strategy\n- Primary: Windows MAPI (requires Outlook installation)\n- Fallback: libpff library for direct PST parsing\n- Thunderbird: mailparse crate for MBOX files\n\n### Network Drives\nUNC path monitoring (`\\\\Server\\Share`) is supported via the same watchdog mechanism as local drives.\n\n### GPU Acceleration\nOptional CUDA support for embedding generation provides a 10x speed improvement at the cost of 500MB of additional dependencies. It is disabled by default.\n\n---\n\n## Contributing\n\nThe project is currently in active development. 
Contributions welcome after Phase 1 MVP completion.\n\n## License\n\nThis software is licensed under a **Custom Non-Commercial License**.\n\n### Permissions\n- ✅ Free personal and non-commercial use\n- ✅ Modification and distribution for non-commercial purposes\n- ✅ Source code access and study\n\n### Restrictions\n- ❌ Commercial use prohibited without explicit written permission\n- ❌ Sale or sublicensing of the software or derivative works\n- ❌ Use in commercial products or paid services (SaaS)\n\nFor the complete license terms, see [LICENSE](LICENSE).\n\n### Commercial Licensing\nFor commercial use inquiries, please contact:\n**achilleatarmla@gmail.com**\n\n## Project Status\n\n**Current Phase**: Phase 1 - Core Search Implementation (Week 1)\n**Last Updated**: 2025-11-12\n**Version**: 0.1.0-alpha\n\n---\n\nBuilt with Rust for performance, security, and reliability.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffless-lab%2Fxfinder","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffless-lab%2Fxfinder","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffless-lab%2Fxfinder/lists"}