{"id":35169767,"url":"https://github.com/acoyfellow/t2t","last_synced_at":"2026-01-13T23:38:32.274Z","repository":{"id":328465318,"uuid":"1114041855","full_name":"acoyfellow/t2t","owner":"acoyfellow","description":"Voice-to-text with MCP support. System-wide dictation (hold fn) and AI agent mode (hold fn+ctrl) that connects to any MCP server. Cross-platform desktop app with local Whisper transcription.","archived":false,"fork":false,"pushed_at":"2025-12-30T00:52:17.000Z","size":11819,"stargazers_count":8,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-01-13T19:43:32.694Z","etag":null,"topics":["accessibility","ai-agents","clipboard","desktop-app","dictation","linux","local-first","macos","mcp","offline","openrouter","productivity","push-to-talk","rust","speech-to-text","svelte","sveltekit","tauri","whisper","windows"],"latest_commit_sha":null,"homepage":"https://t2t.now","language":"Svelte","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/acoyfellow.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-12-10T20:19:47.000Z","updated_at":"2026-01-11T21:49:56.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/acoyfellow/t2t","commit_stats":null,"previous_names":["acoyfellow/t2t"],"tags_count":8,"template":false,"template_full_name":null,"purl":"pkg:github/acoyfellow/t2t","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/acoyfellow%2Ft2t","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/acoyfellow%2Ft2t/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/acoyfellow%2Ft2t/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/acoyfellow%2Ft2t/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/acoyfellow","download_url":"https://codeload.github.com/acoyfellow/t2t/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/acoyfellow%2Ft2t/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28405304,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-13T21:51:37.118Z","status":"ssl_error","status_checked_at":"2026-01-13T21:45:14.585Z","response_time":56,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["accessibility","ai-agents","clipboard","desktop-app","dictation","linux","local-first","macos","mcp","offline","openrouter","productivity","push-to-talk","rust","speech-to-text","svelte","sveltekit","tauri","whisper","windows"],"created_at":"2025-12-28T20:04:43.164Z","updated_at":"2026-01-13T23:38:32.268Z","avatar_url":"https://github.com/acoyfellow.png","language":"Svelte","funding_links":[],"categories":[],"sub_categories":[],"readme":"# t2t\n\n![t2t logo](web/static/logo.svg)\n\n**Voice-to-text with intelligence. Hold fn to talk, hold fn+ctrl to command.**\n\n## Download\n\n**[Download for macOS →](https://t2t.now)**\n\n[View all releases on GitHub →](https://github.com/acoyfellow/t2t/releases)\n\n\u003e **Note:** The app is not code-signed yet. On first launch, macOS may show a security warning. To open it:\n\u003e - Right-click the app → **Open**, then click **Open** in the dialog\n\u003e - Or run: `xattr -cr /Applications/t2t.app` in Terminal\n\u003e\n\u003e **Heads up:** This is an unsigned build while we polish things up. Each time you update to a new version, you'll need to remove t2t from System Settings → Privacy \u0026 Security → Accessibility (and Microphone if needed), then re-add it. We'll get it properly signed soon!\n\n## How It Works\n\n- **Hold Fn key** → records microphone audio\n- **Release Fn key** → transcribes using local Whisper model\n- **Typing mode** (red bar): Hold Fn alone → pastes transcription into focused text field, preserves clipboard\n- **Agent mode** (cyan bar): Hold Fn+Ctrl → speaks commands to AI agent\n  - **MCP mode** (if configured): Connects to MCP servers, uses their tools via OpenRouter AI\n  - **AppleScript mode** (fallback): Generates and executes AppleScript for macOS automation\n- Visual feedback: red/cyan bar while recording (based on mode), amber while processing\n\n## Requirements\n\n- **macOS** (currently macOS only; tested on Apple Silicon)\n- **Accessibility permission** - Required for Fn key detection and focusing the correct field before paste\n- **Microphone permission** - Required for audio recording\n- **OpenRouter API key** (for agent mode) - Get one at [openrouter.ai](https://openrouter.ai)\n\nThe app will prompt you if permissions are missing.\n\n## Getting Started\n\n1. **Download and install** the app from [t2t.now](https://t2t.now)\n2. **Grant permissions** when prompted (Accessibility and Microphone)\n3. **Get an OpenRouter API key** at [openrouter.ai](https://openrouter.ai) (required for agent mode)\n4. **Open settings**: Click the menu bar icon → **View Settings**\n5. **Configure agent mode** (optional):\n   - Add your OpenRouter API key in settings\n   - Optionally configure MCP servers for extended automation\n\n## Settings \u0026 Analytics\n\nThe settings window (Menu bar icon → **View Settings**) includes three tabs:\n\n### Analytics Tab\n\nView your transcription usage statistics:\n- **Total Words**: Lifetime count of all transcribed words\n- **Lifetime Average**: Average words per minute across all sessions\n- **Session Average**: Average words per minute for current session\n- **Sessions**: Total number of transcription sessions\n- **Hours Active**: Total time spent transcribing\n- **Recent Activity**: 48-hour hourly activity chart\n\n### Settings Tab\n\nConfigure your t2t installation:\n- **Theme**: Toggle between light and dark mode\n- **OpenRouter API Key**: Set your API key for agent mode\n- **AI Model Selection**: Choose which model to use for agent mode\n  - Supports all OpenRouter models\n  - Auto-refresh available to fetch latest models\n- **MCP Servers**: Add, configure, and manage MCP servers\n  - Test connections and view available tools\n  - Enable/disable servers individually\n  - Supports stdio, HTTP, and SSE transports\n\n### History Tab\n\nSee [History \u0026 Logging](#history--logging) section below.\n\n## MCP (Model Context Protocol) Support\n\nWhen MCP servers are configured in settings, agent mode uses MCP instead of AppleScript. This enables:\n\n- **Extensible automation**: Connect to any MCP-compatible service (databases, APIs, file systems, etc.)\n- **Tool-based execution**: AI agent uses tools provided by your MCP servers\n- **Multiple servers**: Connect to multiple MCP servers simultaneously\n- **Transport options**: Supports stdio, HTTP, and SSE transports\n\n**To configure**: Menu bar icon → **View Settings** → Settings tab → MCP Servers section. Requires an OpenRouter API key.\n\n## Vision Support \u0026 Automatic Screenshots\n\nt2t automatically captures and includes a screenshot with every agent call, enabling vision-capable models to \"see\" your screen context. This works seamlessly with any model - vision-capable models process the image, while text-only models simply ignore it.\n\n### How It Works\n\n- **Automatic capture**: When you use agent mode (Fn+Ctrl), a screenshot is captured before sending your prompt\n- **Universal support**: Screenshots are included with all agent calls, regardless of model selection\n- **Smart routing**: OpenRouter automatically routes to vision-capable models when available, or ignores the image for text-only models\n- **Seamless integration**: Screenshots are included in the API request without any additional UI or user action\n- **Privacy**: Screenshots are only sent to the API (not stored locally), and thumbnails are visible in History\n\n### Privacy \u0026 Permissions\n\n- **Screen Recording permission**: macOS may prompt for screen recording permission the first time you use agent mode\n- **No local storage**: Full screenshots are not saved to disk - they're only sent to the API\n- **Thumbnails**: Small thumbnails (150x150px) are stored locally in History for reference\n- **Error handling**: If screenshot capture fails (e.g., permission denied), the agent falls back to text-only mode\n\n### Technical Details\n\n- Screenshots are captured using macOS `screencapture` command\n- Images are encoded as base64 PNG and included in the OpenAI-compatible message format\n- The screenshot is included in both initial requests and follow-up requests after tool execution\n- Vision-capable models (GPT-4 Vision, Claude 3.5 Sonnet, etc.) can process the image to understand your screen context\n\n## History \u0026 Logging\n\nt2t automatically logs all transcriptions and agent calls for review and debugging.\n\n### Features\n\n- **Transcription history**: All voice transcriptions are saved with timestamps\n- **Agent call logging**: Complete request/response logs for all OpenRouter API calls\n- **Screenshot thumbnails**: Tiny thumbnails (150x150px) of screenshots captured with all agent calls\n- **Search**: Fast local search across all history entries\n- **Expandable details**: Click any entry to view full request/response JSON and tool calls\n\n### Accessing History\n\nMenu bar icon → **View Settings** → **History** tab\n\n### Configuration\n\n- **History limit**: Set `T2T_HISTORY_LIMIT` environment variable (default: 1000 entries)\n- **Storage**: History is stored locally in `history.json` via Tauri's store plugin\n- **Privacy**: All data stays on your machine - nothing is sent to external services\n\n### What's Logged\n\n**Transcriptions:**\n- Timestamp\n- Transcribed text\n\n**Agent Calls:**\n- Timestamp\n- Transcript (your voice input)\n- Model used\n- Full request JSON (messages, parameters)\n- Full response JSON (AI output, tool calls)\n- Tool calls executed (if any)\n- Screenshot thumbnail (captured automatically with each agent call)\n- Success/error status\n\n## First Run\n\nOn first launch, the app automatically downloads the Whisper model (~150MB) to `~/.cache/whisper/ggml-base.en.bin`. This happens in the background.\n\n## For Developers\n\n### Setup\n\n```bash\n# Install dependencies (in desktop/)\ncd desktop \u0026\u0026 bun install\n\n# Development\nbun dev              # From root, or:\ncd desktop \u0026\u0026 bun tauri dev\n\n# Build\nbun build            # From root, or:\ncd desktop \u0026\u0026 bun tauri build\n```\n\n### Requirements\n\n- **Rust** (install via rustup)\n- **Bun** (recommended) or Node.js 18+\n\n### Tech Stack\n\n- **Frontend**: Svelte 5 + SvelteKit\n- **Backend**: Rust + Tauri\n- **STT**: whisper-rs (local Whisper.cpp model)\n- **AI**: OpenRouter API (direct calls, no infrastructure needed)\n- **MCP**: Model Context Protocol client (local stdio/HTTP/SSE)\n- **Hotkey**: macOS event monitoring (Fn key) + fallbacks\n- **Audio capture**: native (Rust via cpal)\n\n**Architecture**: Fully local. Only OpenRouter API calls go out. No servers, workers, or infrastructure required.\n\n### Debugging\n\n- **Logs**: `~/Library/Logs/t2t.log`\n- **Model location**: `~/.cache/whisper/ggml-base.en.bin`\n- **History storage**: `history.json` (via Tauri store, location depends on Tauri config)\n\n## License\n\nMIT\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Facoyfellow%2Ft2t","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Facoyfellow%2Ft2t","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Facoyfellow%2Ft2t/lists"}