{"id":38649314,"url":"https://github.com/arach/scout","last_synced_at":"2026-01-17T09:18:06.474Z","repository":{"id":301067718,"uuid":"1002053121","full_name":"arach/scout","owner":"arach","description":"Cross-platform local-first dictation app optimized for agentic use cases","archived":false,"fork":false,"pushed_at":"2025-11-12T23:43:34.000Z","size":635590,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"master","last_synced_at":"2025-11-13T00:21:30.856Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"https://scout.arach.dev","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/arach.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-06-14T16:03:18.000Z","updated_at":"2025-11-12T23:43:38.000Z","dependencies_parsed_at":"2025-06-25T02:18:59.750Z","dependency_job_id":"07b6729f-10bf-44aa-9fd8-4189978877e9","html_url":"https://github.com/arach/scout","commit_stats":null,"previous_names":["arach/scout"],"tags_count":4,"template":false,"template_full_name":null,"purl":"pkg:github/arach/scout","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/arach%2Fscout","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/arach%2Fscout/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/arach%2Fscout/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/arach%2Fscout/manifests
","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/arach","download_url":"https://codeload.github.com/arach/scout/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/arach%2Fscout/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28505163,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-17T06:57:29.758Z","status":"ssl_error","status_checked_at":"2026-01-17T06:56:03.931Z","response_time":85,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-01-17T09:18:06.263Z","updated_at":"2026-01-17T09:18:06.432Z","avatar_url":"https://github.com/arach.png","language":"Rust","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Local \u0026 Open Source AI Dictation App (MacOS)\n\nScout is a privacy-focused, cross-platform voice transcription application built with Tauri v2, React/TypeScript, and Rust. It provides real-time voice-to-text transcription with advanced model management and file upload capabilities.\n\n![Scout Application](docs/screenshots/recording.png)\n\n## 🚀 Quick Start\n\n**Download Scout v0.4.0**: [Latest Release](https://github.com/arach/scout/releases/latest) • **Only 12MB!** 🪶\n\n1. Download the DMG for your Mac (Apple Silicon or Intel)\n2. Drag Scout to Applications\n3. Launch and download a Whisper model from Settings\n4. 
Press `Cmd+Shift+Space` to start recording!\n\n## 📊 Project Highlights\n\n- **🪶 Tiny Bundle Size**: Just 12MB for the entire application!\n- **✅ 100% Test Pass Rate**: All tests passing, ensuring reliability\n- **⚡ \u003c300ms Latency**: Near real-time transcription\n- **💾 \u003c215MB Memory**: Efficient resource usage\n\n## Features\n\n- **Local-First Processing**: All audio processing and transcription happens locally on your device\n- **Cross-Platform**: Works on macOS, Windows, and Linux (iOS support planned)  \n- **Real-Time Transcription**: Low-latency voice-to-text conversion using Whisper models\n- **File Upload Support**: Drag \u0026 drop or upload audio files for transcription\n- **Model Management**: Download and switch between different Whisper models (tiny, base, small, medium, large)\n- **Smart Model Selection**: Automatic model downloads and intelligent fallback handling\n- **Privacy-Focused**: No cloud dependencies or telemetry\n- **Push-to-Talk Interface**: Recording with global hotkeys (Cmd+Shift+Space)\n- **Native macOS Overlay**: Minimal recording indicator with position customization\n- **Transcript Management**: Save, search, export, and manage your transcriptions locally\n- **Export Options**: Download transcripts in JSON, Text, or Markdown formats\n- **Settings System**: Comprehensive settings with hotkey customization and model selection\n- **Audio Format Support**: Handles various audio formats with automatic conversion\n- **Background Processing**: Queued processing system for file uploads\n- **Clean UI**: Modern, VSCode-inspired interface with dark mode support\n\n## Architecture\n\n- **Frontend**: React with TypeScript + Vite\n- **Backend**: Rust with Tauri v2\n- **Audio Processing**: cpal (Cross-platform Audio Library) with Voice Activity Detection\n- **Transcription**: \n  - **Built-in Mode**: whisper-rs with CoreML support for optimized performance\n  - **External Service Mode**: Standalone transcriber service with multiple AI models 
(Whisper, Parakeet MLX, Hugging Face)\n- **Database**: SQLite with sqlx for local transcript storage\n- **Settings**: JSON-based configuration system with hot-reload support\n- **File Processing**: Background queue system with audio format conversion\n- **Service Management**: macOS LaunchAgent integration for external transcription services\n\n## Project Structure\n\nSee [docs/architecture/project-structure.md](docs/architecture/project-structure.md) for detailed directory layout.\n\n```\nscout/\n├── src/                    # React frontend\n│   ├── components/         # React components (ModelManager, Overlay, etc.)\n│   ├── hooks/              # Custom React hooks  \n│   ├── contexts/           # React context providers\n│   ├── lib/                # Utilities and helpers\n│   └── types/              # TypeScript type definitions\n├── src-tauri/              # Rust backend\n│   ├── src/\n│   │   ├── audio/          # Audio recording and conversion\n│   │   ├── transcription/  # Whisper transcription engine\n│   │   ├── db/             # SQLite database layer\n│   │   ├── llm/            # LLM processing pipeline\n│   │   ├── service_manager.rs  # External service management\n│   │   └── macos/          # macOS-specific overlay implementation\n│   └── Cargo.toml          # Rust dependencies\n├── transcriber/            # Standalone transcription service\n│   ├── src/                # Rust service core\n│   ├── python/             # Python ML workers\n│   └── README.md           # Service documentation\n├── docs/                   # Technical documentation\n│   ├── architecture/       # System design and structure\n│   ├── features/           # Feature specifications\n│   └── development/        # Development guides and testing\n├── config/                 # Build and development configuration\n├── scripts/                # Setup and utility scripts\n├── models/                 # Downloaded Whisper model files\n└── package.json            # Node.js 
dependencies\n```\n\n## Prerequisites\n\n- Node.js (v16 or later)\n- pnpm (v8 or later) - Install with `npm install -g pnpm`\n- Rust (latest stable)\n- CMake (for building whisper.cpp)\n- macOS, Windows, or Linux\n\n## Installation\n\n### Option 1: Download Pre-built Release (Recommended)\n\n1. Download the latest release from the [Releases page](https://github.com/arach/scout/releases)\n   - **macOS (Apple Silicon)**: `Scout_0.4.0_aarch64.dmg`\n   - **macOS (Intel)**: `Scout_0.4.0_x86_64.dmg`\n\n2. Open the DMG file and drag Scout to your Applications folder\n\n3. On first launch, you'll need to download Whisper models:\n   - Open Scout\n   - Go to Settings → Transcription Models\n   - Download at least one model (Base recommended for starting)\n\n**Note for macOS**: You may need to right-click and select \"Open\" the first time to bypass Gatekeeper if the app isn't code-signed.\n\n### Option 2: Build from Source\n\n1. Clone the repository:\n```bash\ngit clone https://github.com/arach/scout.git\ncd scout\n```\n\n2. Install dependencies:\n```bash\npnpm install\n```\n\n3. Run in development mode:\n```bash\npnpm tauri dev\n```\n\nThe application will automatically download a Whisper model on first launch.\n\n## Building\n\nTo build for production:\n\n```bash\npnpm tauri build\n```\n\nThis will create platform-specific binaries in `src-tauri/target/release/bundle/`.\n\n## Usage\n\n### Live Recording\n1. Launch the application\n2. Click the \"Start Recording\" button or use the global hotkey (Cmd+Shift+Space)\n3. Speak clearly into your microphone\n4. The native overlay shows recording status\n5. Click \"Stop Recording\" or press the hotkey again to end recording\n6. The transcript will appear automatically after processing\n\n### File Upload\n1. Drag and drop audio files onto the recording area, or\n2. Click \"Upload Audio File\" to select files\n3. Supported formats: WAV, MP3, M4A, FLAC, and more\n4. Files are processed in the background queue\n5. 
View progress and results in the transcript list\n\n### Model Management\n\n#### Built-in Models (Integrated Mode)\n1. Open Settings to access the Model Manager\n2. Download different Whisper models based on your needs:\n   - **Tiny (39M parameters)**: Fastest, basic accuracy\n   - **Base (74M parameters)**: Good balance of speed and accuracy  \n   - **Small (244M parameters)**: Better accuracy, slower processing\n   - **Medium (769M parameters)**: High accuracy\n   - **Large (1550M parameters)**: Best accuracy, slowest\n3. Switch between models by clicking \"Use This Model\"\n4. The active model is shown with a green \"Active\" badge\n\n#### External Service Models (Advanced Mode)\nFor enhanced performance and additional model options:\n1. Install the [Scout Transcriber Service](transcriber/README.md)\n2. Open Settings → Transcription → Switch to \"Advanced Mode\"\n3. Choose from advanced models:\n   - **Parakeet MLX**: NVIDIA's Parakeet model running on MLX for Apple Silicon\n   - **Whisper Large V3 Turbo**: OpenAI's faster Large V3 variant, available through Hugging Face\n   - **Custom Models**: Bring your own fine-tuned models\n4. Configure worker processes for parallel transcription\n5. 
Monitor service health in real-time\n\n### Settings \u0026 Customization\n- **Global Hotkeys**: Customize the recording shortcut\n- **Overlay Position**: Move the recording indicator to different screen positions\n- **Transcription Mode**: \n  - **Integrated**: Use built-in Whisper models (simple, no setup)\n  - **Advanced**: Connect to external transcriber service (better performance, more models)\n- **Model Selection**: Choose which model to use for transcription\n- **Voice Activity Detection**: Enable/disable automatic silence detection\n- **External Service Configuration**: Set up distributed transcription with custom ports and worker counts\n\n## Current Status\n\n### ✅ Completed Features\n- **Core Application**: Tauri v2 project with React/TypeScript frontend\n- **Audio Recording**: High-quality recording with cpal and Voice Activity Detection\n- **Transcription**: Full whisper-rs integration with CoreML optimization\n- **Model Management**: Download, switch, and manage multiple Whisper models\n- **File Upload**: Drag \u0026 drop support with automatic audio format conversion\n- **Settings System**: JSON-based configuration with hotkey customization\n- **Database**: SQLite storage with full transcript management\n- **Native Overlay**: macOS-specific recording indicator with positioning\n- **Background Processing**: Queued file processing system\n- **Search \u0026 Export**: Full-text search and export in multiple formats\n- **Global Hotkeys**: Customizable shortcuts for hands-free operation\n- **UI/UX**: VSCode-inspired theme with responsive design\n- **Testing Infrastructure**: Comprehensive test suite with 100% passing rate\n\n### 🚧 In Progress\n- Advanced VAD tuning and noise reduction\n- Real-time streaming transcription\n- Cloud sync options (optional)\n- Plugin system for custom workflows\n\n### 🎯 Planned Features\n- Multiple language support beyond English\n- Custom model training utilities\n- Team collaboration features\n- API endpoint for external 
integrations\n\n## Development\n\n### Essential Commands\n\n```bash\n# Development\npnpm dev              # Start Vite dev server  \npnpm tauri dev        # Run full app in development mode\n\n# Build\npnpm build            # TypeScript + Vite build\npnpm tauri build      # Build production binaries\n\n# Testing\ncd src-tauri \u0026\u0026 cargo test    # Run Rust tests\n```\n\n### Running Tests\n\nScout has comprehensive test coverage with 97.6% success rate (163/167 tests passing):\n\n```bash\n# Frontend tests (Vitest + React Testing Library)\npnpm test\n\n# Rust tests  \ncd src-tauri\ncargo test\n\n# Run tests with coverage\npnpm test --coverage\n```\n\n### Code Style\n\n- Frontend: ESLint and Prettier\n- Backend: rustfmt and clippy\n\n## Performance Targets\n\n### Integrated Mode (Built-in Whisper)\n- User-perceived latency: \u003c300ms\n- Memory usage: \u003c215MB for base model\n- Processing efficiency: 0.1-0.5 RTF for small models\n\n### Advanced Mode (External Service)\n- User-perceived latency: \u003c200ms with Parakeet MLX\n- Parallel processing: 2-8 concurrent transcriptions\n- Memory usage: ~500MB base + 200MB per worker\n- Processing efficiency: 0.03-0.1 RTF with optimized models\n- Automatic failover to built-in mode if service unavailable\n\n## Security Considerations\n\n- All processing is done locally\n- No network requests for transcription\n- Audio files are stored temporarily and deleted after processing\n- Database is stored in the app's local data directory\n\n## License\n\n[License information to be added]\n\n## Contributing\n\n[Contributing guidelines to be added]\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Farach%2Fscout","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Farach%2Fscout","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Farach%2Fscout/lists"}