{"id":51008258,"url":"https://github.com/ghostwright/shadow","last_synced_at":"2026-06-20T23:04:21.777Z","repository":{"id":344733000,"uuid":"1163060511","full_name":"ghostwright/shadow","owner":"ghostwright","description":"Your computer was paying attention the whole time. 14-modality capture. Proactive intelligence. Computer-use training data. Native macOS. All on-device. Open source.","archived":false,"fork":false,"pushed_at":"2026-03-16T10:51:04.000Z","size":7524,"stargazers_count":4,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-03-16T14:42:38.350Z","etag":null,"topics":["accessibility","ai","ai-agents","apple-silicon","computer-use","local-first","machine-learning","macos","macos-automation","mcp","memory","mlx","multimodal","personal-ai","proactive-ai","recall-alternative","rewind-alternative","rust","screen-recording","swift"],"latest_commit_sha":null,"homepage":null,"language":"Swift","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ghostwright.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-02-21T03:17:14.000Z","updated_at":"2026-03-16T13:20:55.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/ghostwright/shadow","commit_stats":null,"previous_names":["ghostwright/shadow"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/ghostwright/shadow","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ghostwright%2Fshadow","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ghostwright%2Fshadow/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ghostwright%2Fshadow/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ghostwright%2Fshadow/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ghostwright","download_url":"https://codeload.github.com/ghostwright/shadow/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ghostwright%2Fshadow/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34588011,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-20T02:00:06.407Z","response_time":98,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["accessibility","ai","ai-agents","apple-silicon","computer-use","local-first","machine-learning","macos","macos-automation","mcp","memory","mlx","multimodal","personal-ai","proactive-ai","recall-alternative","rewind-alternative","rust","screen-recording","swift"],"created_at":"2026-06-20T23:04:21.034Z","updated_at":"2026-06-20T23:04:21.767Z","avatar_url":"https://github.com/ghostwright.png","language":"Swift","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cp align=\"center\"\u003e\n  \u003cimg src=\"logo-animated.svg\" width=\"200\" alt=\"Shadow\"\u003e\n\u003c/p\u003e\n\n\u003ch1 align=\"center\"\u003eShadow\u003c/h1\u003e\n\u003cp align=\"center\"\u003e\u003cem\u003eYour computer was paying attention the whole time.\u003c/em\u003e\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003ca href=\"LICENSE\"\u003e\u003cimg src=\"https://img.shields.io/badge/license-MIT-blue.svg\" alt=\"MIT License\"\u003e\u003c/a\u003e\n  \u003cimg src=\"https://img.shields.io/badge/platform-macOS%2014%2B-black.svg\" alt=\"macOS 14+\"\u003e\n  \u003cimg src=\"https://img.shields.io/badge/swift-6-orange.svg\" alt=\"Swift 6\"\u003e\n  \u003cimg src=\"https://img.shields.io/badge/rust-2021-brown.svg\" alt=\"Rust\"\u003e\n  \u003cimg src=\"https://img.shields.io/badge/data-100%25%20local-brightgreen.svg\" alt=\"100% Local\"\u003e\n\u003c/p\u003e\n\n---\n\nShadow is a personal intelligence engine for macOS. It captures every signal your computer produces while you work, turns raw behavior into structured understanding, and acts on what it learns. Screen, audio, keystrokes, the full accessibility tree, clipboard, files, git, terminal, search queries, notifications, calendar, system context. All of it, synchronized by timestamp, stored locally, processed on-device. Crash-proof recording that loses at most ten seconds on a force quit. Automatic sleep/wake recovery with display hot-plug detection. Under 3% CPU average. Under 600 MB per day.\n\nThis is not a screen recorder. Shadow generates episodes from your work, runs a continuous heartbeat that pushes proactive observations, operates vision models and LLMs entirely on Apple Silicon, fine-tunes its own grounding models on your behavior, replays learned procedures through a safety-gated computer-use engine, and exposes a 26-tool agent runtime with streaming UI. It captures how you work, learns why, and starts helping before you ask.\n\nWe are open-sourcing Shadow because the capture layer is the hardest problem to solve and we have solved it. The next layer, memory graphs, MCP servers, personal models, agents trained on real human behavior, belongs to the community. Build on top of what is here.\n\n## Shadow in Action\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"demo.gif\" width=\"720\" alt=\"Shadow in action\"\u003e\n\u003c/p\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cstrong\u003eOnboarding\u003c/strong\u003e\u003c/summary\u003e\n\u003cp\u003eFour-step setup: welcome, permissions, model download, launch. Shadow walks you through granting Screen Recording, Accessibility, Input Monitoring, Microphone, and Speech Recognition. Models download on-device during setup.\u003c/p\u003e\n\u003c!-- \u003cimg src=\"demo-onboarding.gif\" width=\"720\" alt=\"Shadow onboarding\"\u003e --\u003e\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cstrong\u003eProactive Inbox and Heartbeat\u003c/strong\u003e\u003c/summary\u003e\n\u003cp\u003eShadow's heartbeat runs two-tier analysis (fast every 10 minutes, deep every 30) and pushes observations to an overlay and inbox. \"Your commit is still pending.\" \"150 context switches in 2 hours.\" \"Meeting follow-up needed.\" Real suggestions, not canned templates.\u003c/p\u003e\n\u003c!-- \u003cimg src=\"demo-proactive.gif\" width=\"720\" alt=\"Proactive suggestions\"\u003e --\u003e\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cstrong\u003eSearch\u003c/strong\u003e\u003c/summary\u003e\n\u003cp\u003eSpotlight-quality overlay (Option+Space). Hybrid search across CLIP visual embeddings, Tantivy full-text, and timeline. \"When was I looking at that chart?\" returns results by meaning, not just text matching.\u003c/p\u003e\n\u003c!-- \u003cimg src=\"demo-search.gif\" width=\"720\" alt=\"Search overlay\"\u003e --\u003e\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cstrong\u003eTimeline\u003c/strong\u003e\u003c/summary\u003e\n\u003cp\u003eMulti-track scrubber like a DAW. Screenshot track, app track, audio waveform, input density. Drag the playhead to any moment in your day.\u003c/p\u003e\n\u003c!-- \u003cimg src=\"demo-timeline.gif\" width=\"720\" alt=\"Timeline view\"\u003e --\u003e\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cstrong\u003eLive Stats\u003c/strong\u003e\u003c/summary\u003e\n\u003cp\u003e343,000+ events captured across 39 apps. 473 episodes synthesized. 3,386 CLIP embeddings generated. All processed locally, all while the Mac runs normally. Close the lid, open it back up, Shadow picks up exactly where it left off.\u003c/p\u003e\n\u003c!-- \u003cimg src=\"demo-stats.gif\" width=\"720\" alt=\"Live capture stats\"\u003e --\u003e\n\u003c/details\u003e\n\n## How Shadow Compares\n\nEvery existing tool looks at one or two modalities. Shadow captures fourteen.\n\n| | Shadow | Screenpipe | Microsoft Recall | Mem0 | Anthropic Computer Use |\n|---|---|---|---|---|---|\n| Modalities | 14 | 3-4 | 1 (screenshots) | 0 (text from chat) | 1 (screenshots, on-demand) |\n| Accessibility tree | Full snapshots | Partial | No | No | No |\n| Episode generation | Yes | No | No | No | No |\n| Proactive intelligence | Heartbeat with push suggestions | No | No | No | No |\n| On-device LLM | Qwen 7B/32B via MLX | No | No | No | Cloud API |\n| Vision grounding | ShowUI-2B + LoRA fine-tuning | No | No | No | Screenshot-only |\n| Computer-use agent | 26-tool agent + Mimicry system | No | No | No | Yes (cloud) |\n| Safety gates | Pre-action checks + undo manager | No | No | No | No |\n| Meeting intelligence | Whisper + summaries + speaker attribution | No | No | No | No |\n| Learned procedures | Workflow replay from observation | No | No | No | No |\n| Open source | MIT | MIT | No | Apache 2.0 | No |\n| Price | Free | $400 lifetime | Free (requires $1000+ PC) | $19+/mo | Per token |\n\nEach modality multiplies every other. A screenshot tells you what was on screen. Add keystrokes and you know what the user typed. Add the accessibility tree and you know what every element is and what was clicked. Add clipboard and you know what they deemed important. Add git and you know what they produced. Add terminal and you know what succeeded and failed. Add search queries and you know their intent. This is not additive, it is combinatorial.\n\n## What Shadow Does\n\nShadow records your Mac like a studio records a band. Each signal gets its own track. Time is the universal key.\n\n**Capture.** Continuous screen recording across all displays using fragmented MP4 (H.265 hardware-encoded). Multi-display hot-plug: connect or disconnect a monitor mid-session and Shadow adapts. Microphone and system audio with on-device Whisper transcription and word-level timestamps. Audio is mic-triggered: Shadow does not record silence, it starts when your mic goes active and waits 30 seconds after it goes quiet. Every keystroke, click, scroll, and gesture, with every mouse click enriched by the accessibility element at those coordinates (role, title, identifier) at sub-millisecond latency. Passwords never reach the event pipeline: secure text fields are detected at the CGEventTap level before anything is written to storage. Full accessibility tree snapshots of every focused app, diff-aware, capturing the semantic structure of every UI element. Clipboard with source and destination app. File changes, git commits, terminal commands with exit codes, search queries, notifications, calendar events, and system context. 200-600 MB per day. A 512 GB Mac stores 6-12 months.\n\n**Understand.** Episode generation detects activity boundaries and produces structured work units with LLM summaries. A proactive heartbeat runs two-tier analysis and pushes suggestions without being asked. Semantic search combines CLIP vector embeddings (search by meaning), Tantivy full-text search, and timeline queries. Meeting intelligence transcribes, summarizes, and attributes speakers. Pattern detection over weeks reflects how you actually work: when your focus happens, how you communicate, what you consistently underestimate. A two-tier local LLM system (7B for fast tasks, 32B for deep reasoning) runs entirely on Apple Silicon with KV-cache session reuse that drops first-token latency from 14 seconds to under 1 second across multi-turn conversations.\n\n**Act.** A 26-tool agent runtime with streaming UI handles search, context retrieval, visual analysis, AX-based actions, and memory operations. The Mimicry system watches how you perform tasks, synthesizes replayable procedures, and executes them through a safety-gated pipeline with pre-action checks, post-action verification, and undo support. A grounding oracle cascades through four strategies: AX exact match, AX fuzzy match, on-device VLM (ShowUI-2B), and cloud vision. 70-80% of interactions are resolved by the free, instant AX path. Built-in LoRA training generates grounding data from your actual clicks and fine-tunes the vision model to your specific apps and workflows. When the agent takes actions, those events are tagged and excluded from recording. Shadow learns from you, not from itself.\n\n**Remember.** A semantic memory store holds knowledge entries by category: preferences, facts, patterns, relationships, skills. Directive memory stores your instructions. Behavioral search finds past workflows similar to what you are doing now. Procedure matching surfaces learned workflows when context suggests they are relevant. Three-tier retention manages storage automatically: hot (7 days, full video and audio), warm (8-30 days, keyframes and transcripts), cold (31+ days, indices only). Transcripts are never deleted until their source audio has been fully transcribed. Storage stays under a configurable cap.\n\n## Why This Matters for Computer-Use AI\n\nLLMs can write code that takes staff engineers days. They still cannot use a computer like an eight-year-old. They cannot click buttons reliably, navigate between apps, handle unexpected popups, or recover from errors. A recent study found the best agent completes only 24% of real office tasks. Another showed that just 312 real human trajectories can outperform Claude 3.7 Sonnet at computer use.\n\nThe problem is training data. Synthetic benchmarks use scripted tasks in sandboxes. The messy, multi-app reality of how people actually work does not exist in any dataset. Shadow produces it. Every user action is preceded by a screen state (screenshot + accessibility tree) and followed by a new screen state. That is the exact `(state, action, next_state)` format needed for behavioral cloning. One user generates 25,000-40,000 actions per day. Undo detection provides negative examples. Episode boundaries provide goal annotations. This data cannot be synthetically generated.\n\nCo-pilots need API integrations. Shadow does not. Slack is already on your screen. Shadow sees what is there, who sent what, and what you did about it.\n\n## The Vision\n\nShadow starts as search. \"What was I doing at 2pm?\" returns the screenshot, the transcript, the context.\n\nOver days, it becomes a behavioral mirror. It reflects how you actually work, not how you think you work. Pattern detection, time allocation, commitment tracking.\n\nOver weeks, it becomes an apprentice that earns trust gradually:\n\n1. **Tell me things I forgot.** Search, playback, summaries.\n2. **Remind me about things coming up.** Meeting prep, commitment tracking.\n3. **Prepare things I'll need.** Assemble context, surface relevant history.\n4. **Do things for me.** Learned procedure replay through the safety-gated Mimicry system. For full computer use, [Ghost OS](https://github.com/ghostwright/ghost-os) provides the action layer.\n\nThe endgame: all of this data flowing to your own infrastructure, your own models. A living understanding of how you work that any AI can query with your permission. Memory graphs connecting episodes, people, projects, and commitments. An MCP server that lets any AI agent access your context. Personal models trained on the richest behavioral dataset that exists: yours.\n\n## Architecture\n\n```\nShadow (macOS menu bar app, Swift + Rust)\n|\n|-- Capture (Swift, Apple-native APIs)\n|   |-- ScreenCaptureKit    per-display H.265, fragmented MP4, sleep/wake recovery\n|   |-- CGEventTap          keystrokes, mouse, scroll, AX enrichment, undo detection\n|   |-- AXUIElement         accessibility tree, browser URLs, window titles\n|   |-- AVFoundation        mic-triggered audio, system audio via SCK\n|   |-- FSEvents            file system + git directory monitoring\n|   +-- NSWorkspace         app switches, sleep/wake, display hot-plug\n|\n|-- Storage (Rust via UniFFI)\n|   |-- MessagePack event log (zstd compressed, hourly rotation)\n|   |-- Tantivy full-text search\n|   |-- CLIP vector embeddings (cosine similarity)\n|   |-- SQLite timeline index (WAL mode)\n|   +-- 3-tier retention (hot / warm / cold, configurable cap)\n|\n|-- Intelligence (Swift + on-device models)\n|   |-- MobileCLIP-S2       image embeddings (CoreML, Neural Engine)\n|   |-- Whisper             transcription (MLX, Apple Silicon)\n|   |-- Qwen 7B/32B         reasoning + summaries (MLX, KV-cache reuse)\n|   |-- Qwen2.5-VL-7B       vision understanding (MLX)\n|   |-- ShowUI-2B           UI grounding + LoRA fine-tuning on your usage\n|   |-- nomic-embed         text embeddings\n|   |-- Episode engine      boundary detection + summarization\n|   |-- Proactive heartbeat fast 10min / deep 30min, push suggestions\n|   |-- Agent runtime       26 tools, streaming UI, task decomposition\n|   +-- Mimicry             procedure learning, safety gates, undo support\n|\n+-- UI (SwiftUI, native macOS)\n    |-- Menu bar            status, mini timeline, pause/resume\n    |-- Search overlay      Spotlight-quality, CLIP + text hybrid\n    |-- Timeline            multi-track scrubber (video + app + audio)\n    |-- Proactive overlay   suggestions, inbox, trust feedback\n    +-- Settings            API keys, models, retention, preferences\n```\n\n## Build From Source\n\n```bash\ngit clone https://github.com/ghostwright/shadow.git\ncd shadow\n\n# Build Rust storage engine and generate Swift bindings\n./scripts/build-rust.sh\n\n# Install Python dependencies and download CLIP models (~190 MB)\npip3 install huggingface_hub open_clip_torch\npython3 scripts/provision-clip-models.py\n\n# Generate Xcode project and build\ncd Shadow \u0026\u0026 xcodegen generate \u0026\u0026 cd ..\nxcodebuild -project Shadow/Shadow.xcodeproj -scheme Shadow -configuration Debug build\n\n# Launch\nopen ~/Library/Developer/Xcode/DerivedData/Shadow-*/Build/Products/Debug/Shadow.app\n```\n\nRequires Apple Silicon (M1 or later), macOS 14+, Xcode 16.4+, Rust via rustup, Python 3.8+, XcodeGen (`brew install xcodegen`). The Qwen 32B model requires 48 GB+ RAM. Grant permissions when prompted. After granting Screen Recording, quit and relaunch.\n\n## Privacy\n\nYour data stays on your machine. Shadow does not phone home. There is no account, no telemetry, no cloud dependency.\n\nPasswords and sensitive fields are detected at the CGEventTap level and excluded before reaching storage. You can pause recording, exclude apps, or delete any time range of data.\n\nCloud LLM features (Claude, GPT) are opt-in with your own API key. When disabled, all intelligence runs locally via MLX on Apple Silicon.\n\nThis is open source. You do not need to trust a privacy policy. Read the code. See [PRIVACY.md](PRIVACY.md) for the full data handling details.\n\n## Contributing\n\nWe need testing across hardware (M1 through M4), smarter episode detection, memory graph construction, MCP server development, new capture tracks (browser extensions, IDE plugins), better proactive analysis models, and documentation. If you are building agents that operate computers, Shadow is the observation layer.\n\nSee [CONTRIBUTING.md](CONTRIBUTING.md) for setup and guidelines.\n\n## Acknowledgments\n\nShadow exists because Apple Silicon made on-device intelligence practical. The M-series chips, Neural Engine, VideoToolbox hardware encoding, ScreenCaptureKit, and CoreML together make it possible to capture, transcribe, embed, and reason about your entire computer usage without touching the cloud. We believe Apple Silicon is the future of personal AI and Shadow is built entirely around that conviction.\n\n- **Apple** for [MLX](https://github.com/ml-explore/mlx-swift) (on-device ML inference that makes local LLMs and vision models viable), [MobileCLIP](https://github.com/apple/ml-mobileclip) (efficient CLIP for semantic search on the Neural Engine), ScreenCaptureKit, VideoToolbox, and CoreML\n- **[Argmax](https://github.com/argmaxinc/WhisperKit)** for WhisperKit, bringing Whisper transcription to Apple Silicon natively\n- **[Quickwit](https://github.com/quickwit-oss/tantivy)** for Tantivy, the full-text search engine in Rust that powers Shadow's instant search across hundreds of thousands of events\n- **[Mozilla](https://github.com/mozilla/uniffi-rs)** for UniFFI, making our Swift-Rust bridge seamless\n- **[Hugging Face](https://github.com/huggingface/swift-transformers)** for swift-transformers, providing the Hub client and tokenizer infrastructure\n- **[Ghost OS](https://github.com/ghostwright/ghost-os)** (900+ stars), our open-source computer-use engine that provides the action layer where Shadow provides the observation layer\n\n## License\n\nMIT\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fghostwright%2Fshadow","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fghostwright%2Fshadow","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fghostwright%2Fshadow/lists"}