https://github.com/hkuds/catchme

"CatchMe: Make Your AI Agents Truly Personal"
ai-agent clawdbot-plugin llm recall-ai retrieval-systems screen-recorder
Last synced: 2 months ago
Host: GitHub
URL: https://github.com/hkuds/catchme
Owner: HKUDS
License: apache-2.0
Created: 2026-03-30T04:21:59.000Z (3 months ago)
Default Branch: main
Last Pushed: 2026-04-01T03:34:09.000Z (2 months ago)
Last Synced: 2026-04-04T05:34:17.907Z (2 months ago)
Topics: ai-agent, clawdbot-plugin, llm, recall-ai, retrieval-systems, screen-recorder
Language: Python
Homepage: https://hkuds.github.io/CatchMe/
Size: 17.8 MB
Stars: 279
Watchers: 2
Forks: 28
Open Issues: 1
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE

README

          







  



CatchMe: Make Your AI Agents Truly Personal




  Capture Your Entire Digital Footprint: Lightweight & Vectorless & Powerful.





  

  

  

  

  

  


  

  

  








「 Just do your thing. CatchMe captures everything else — stored locally to ensure privacy and security.  」




  



**🦞 Makes Your Agents Truly Personal**. CatchMe ships as an agent-compatible skill for CLI agents (OpenClaw, NanoBot, Claude, Cursor, etc.). Run CatchMe independently. Your agents query memories via CLI commands only.


## 🎯 Enrich Your Personal Digital Context

  

    

      


      
### 💻 Personal Coding Assistant

"What was I coding in Claude Code today?"

- Code session replay
- Recall your edited files
- Trace what you typed

### 🔍 Personal Deep Research

"What was I reading about AI yesterday?"

- Web/PDF viewed
- Search queries typed
- Reading info tracked

### 📁 Personal Files Manager

"Which files did I change today?"

- File changes tracked
- Docs accessed
- Edits reviewed

### 🧩 Digital Life Overview

"How did I spend my afternoon?"

- App usage tracked
- Workflows replayed
- Activities recalled

      

    

  

## ✨ Key Features

### 📹 Always-On Event Capture

- **Event-Driven Recording**: No timers or delays. Mouse actions are captured instantly with crosshair annotation.

- **Comprehensive Context**: Five recorders track windows, keyboard, clipboard, notifications, and files around mouse actions.

### 🌲 Intelligent Memory Hierarchy

- **Auto-Organization**: Raw event streams are organized automatically into five tiers: Day → Session → App → Location → Action.

- **Smart Summaries**: LLM summaries at each level, transforming logs into searchable knowledge trees.

### 🔍 Tree-Based Retrieval

- **No Vector Complexity**: Skip embeddings and VDBs — our system uses tree-based reasoning for navigation.

- **Top-Down Search**: LLM reads summaries, selects relevant branches, and drills down to evidence.

### 🤖 Zero-Config Agent Integration

- **One-File Setup**: Drop a single skill file into any AI agent for instant integration.

- **Immediate Access**: CLI-based screen history queries with zero configuration required.

### 🪶 Ultralight & Privacy-First

- **Minimal Footprint**: ~0.2GB runtime RAM with efficient SQLite + FTS5 storage.

- **Local & Offline**: All data stays on your machine with full offline mode via Ollama/vLLM/LM Studio.
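
To make the "SQLite + FTS5" storage idea concrete, here is a minimal sketch of keyword search over recorded events using Python's built-in `sqlite3` module. The table name, columns, and sample rows are assumptions for illustration and are not CatchMe's actual schema; the sketch also assumes your SQLite build ships with the FTS5 extension.

```python
import sqlite3

# Hypothetical schema: one FTS5 table with a timestamp, an event kind, and searchable text.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE VIRTUAL TABLE events USING fts5(ts UNINDEXED, kind UNINDEXED, text)"
)
conn.executemany(
    "INSERT INTO events (ts, kind, text) VALUES (?, ?, ?)",
    [
        ("2026-04-01T09:00:00", "keyboard", "draft meeting notes for the retrieval design review"),
        ("2026-04-01T09:05:00", "clipboard", "pip install -e ."),
        ("2026-04-01T09:10:00", "window", "Browser: reading about tree-based retrieval"),
    ],
)


def search(query: str) -> list[tuple[str, str, str]]:
    """Return matching events, best match first (FTS5 `rank` orders by BM25)."""
    rows = conn.execute(
        "SELECT ts, kind, text FROM events WHERE events MATCH ? ORDER BY rank",
        (query,),
    )
    return rows.fetchall()


for ts, kind, text in search("meeting"):
    print(ts, kind, text)
```

A full-text index like this needs no embedding model or vector database, which is consistent with the lightweight footprint described above.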

### 🖥️ Rich Web Interface

- **Visual Exploration**: Interactive timelines, memory tree navigation, and real-time system monitoring.

- **Natural Conversation**: Chat with your complete digital footprint using natural language.



  



## 💡 CatchMe Architecture

CatchMe transforms raw digital activity into structured, searchable memory through three concurrent stages:

### 🔄 Record → Organize → Reason: Turn digital chaos into queryable memory

**Capture**. Six background recorders silently track your activity. They monitor window focus, keystrokes, mouse movement, screenshots, clipboard, and notifications.

**Index**. Raw events auto-organize into a Hierarchical Activity Tree: Day → Session → App → Location → Action. Each node gets an LLM-generated summary, which gives fast, meaningful recall without vector embeddings.

**Retrieve**. You ask a question. The LLM traverses your memory tree top-down, selects relevant nodes, and inspects raw data such as screenshots or keystrokes. It then synthesizes a precise answer.



  



### 🌲 Hierarchical Activity Tree

The Activity Tree is CatchMe's memory core. It provides structured, multi-level views of your digital life. Browse high-level summaries or dive into granular details.
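
As a rough illustration of the five-tier structure, the sketch below builds a small Day → Session → App → Location → Action tree in memory. The `Node` class, `add_action` helper, and sample values are hypothetical names chosen for this example; CatchMe's real data model may differ.

```python
from dataclasses import dataclass, field

# The five tiers named in this README, from coarsest to finest.
LEVELS = ["Day", "Session", "App", "Location", "Action"]


@dataclass
class Node:
    level: str
    name: str
    summary: str = ""  # would hold the LLM-generated summary for this node
    children: dict[str, "Node"] = field(default_factory=dict)

    def child(self, name: str) -> "Node":
        """Get or create the child called `name` one tier below this node."""
        index = LEVELS.index(self.level)
        if index + 1 >= len(LEVELS):
            raise ValueError("Action nodes are leaves and cannot have children")
        if name not in self.children:
            self.children[name] = Node(LEVELS[index + 1], name)
        return self.children[name]


def add_action(days: dict[str, Node], day: str, session: str, app: str,
               location: str, action: str) -> Node:
    """Insert one action under its day, session, app, and location."""
    node = days.setdefault(day, Node("Day", day))
    for name in (session, app, location, action):
        node = node.child(name)
    return node


def render(node: Node, depth: int = 0) -> list[str]:
    """Flatten the tree into indented lines for display."""
    lines = [f"{'  ' * depth}{node.level}: {node.name}"]
    for child in node.children.values():
        lines.extend(render(child, depth + 1))
    return lines


days: dict[str, Node] = {}
add_action(days, "2026-04-01", "morning", "VS Code", "retrieve.py", "edited a function")
add_action(days, "2026-04-01", "morning", "VS Code", "README.md", "fixed a typo")
print("\n".join(render(days["2026-04-01"])))
```

Actions that share a day, session, and app end up under the same branch, so a summary written at the App level can cover all of them.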



  



### 🔍 Intelligent Tree Retrieval

CatchMe skips traditional vector search. Instead, the LLM directly navigates your Activity Tree. This enables complex, cross-day reasoning and precise evidence gathering from raw activity history.
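
The top-down navigation can be sketched as a loop that reads child summaries, keeps the relevant ones, and descends until it reaches leaves. In this simplified example the LLM's selection step is replaced by a plain word-overlap check, which is a stand-in and not how CatchMe chooses nodes; the dictionary layout and function names are likewise assumptions. The default limit of 7 mirrors the `max_select_nodes` default in the configuration table.

```python
def select_children(question: str, children: list[dict], limit: int = 7) -> list[dict]:
    """Stand-in for the LLM: keep children whose summary shares a word with the question."""
    words = set(question.lower().split())
    scored = [(len(words & set(c["summary"].lower().split())), c) for c in children]
    relevant = [(score, c) for score, c in scored if score > 0]
    relevant.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in relevant[:limit]]


def retrieve_evidence(question: str, root: dict) -> list[str]:
    """Walk the tree top-down and collect the summaries of the selected leaves."""
    frontier, evidence = [root], []
    while frontier:
        node = frontier.pop()
        if not node.get("children"):
            evidence.append(node["summary"])  # leaf reached: treat it as evidence
            continue
        frontier.extend(select_children(question, node["children"]))
    return evidence


tree = {
    "summary": "2026-04-01",
    "children": [
        {
            "summary": "morning coding session",
            "children": [
                {"summary": "edited retrieve.py in editor", "children": []},
                {"summary": "coding tests for tree search", "children": []},
            ],
        },
        {
            "summary": "afternoon reading papers",
            "children": [{"summary": "reading PDF about agents", "children": []}],
        },
    ],
}

print(retrieve_evidence("what was I coding", tree))
```

Because whole branches are skipped as soon as their summary looks irrelevant, only a small part of the tree is read per question.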



  



**📖 Learn More**: Detailed design insights and technical deep-dive available in our [blog](https://hkuds.github.io/CatchMe/).

## 🧠 LLM Configuration

### **❗️ Data Privacy Notice**

• **100% Local Storage**: All raw data (screenshots, keystrokes, activity trees) stays in ~/data/ and never leaves your machine. 

• **Offline-First Options**: Local LLMs (Ollama, vLLM, LM Studio) enable fully offline operation without any cloud dependency.

• **⚠️Cloud Provider Caution**: If you configure a cloud provider, your activity data is sent to its API for summarization. **Untrusted endpoints may expose private data**, so review your provider's data policies carefully.

### **📋 Requirements**

• **Multimodal support**: Your model should be able to handle text + images.

• **Context window**: Make sure your model's context window exceeds the `max_tokens` limits in `config.json`.

• **Cost control**: For *forced cost control*, set limits via `llm.max_calls` or increase `filter.mouse_cluster_gap` to reduce summarization frequency.

CatchMe requires an LLM for background summarization and intelligent retrieval. Use **catchme init** (see Get Started) for **guided setup**, or follow the **manual configuration** steps below.

For cloud API services:

```json
{
    "llm": {
        "provider": "openrouter",
        "api_key": "sk-or-...",
        "api_url": null,
        "model": "google/gemini-3-flash-preview"
    }
}
```

For local/offline operation:

```json
{
    "llm": {
        "provider": "ollama",
        "api_key": null,
        "api_url": null,
        "model": "gemma3:4b"
    }
}
```

Supported LLM Providers

| Provider | Config name | Default API URL | Get Key |
| --- | --- | --- | --- |
| **OpenRouter** (gateway) | `openrouter` | `https://openrouter.ai/api/v1` | [openrouter.ai/keys](https://openrouter.ai/keys) |
| **AiHubMix** (gateway) | `aihubmix` | `https://aihubmix.com/v1` | [aihubmix.com](https://aihubmix.com) |
| **SiliconFlow** | `siliconflow` | `https://api.siliconflow.cn/v1` | [cloud.siliconflow.cn](https://cloud.siliconflow.cn) |
| **OpenAI** | `openai` | `https://api.openai.com/v1` | [platform.openai.com](https://platform.openai.com/api-keys) |
| **Anthropic** | `anthropic` | `https://api.anthropic.com/v1` | [console.anthropic.com](https://console.anthropic.com) |
| **DeepSeek** | `deepseek` | `https://api.deepseek.com/v1` | [platform.deepseek.com](https://platform.deepseek.com/api_keys) |
| **Gemini** | `gemini` | `https://generativelanguage.googleapis.com/v1beta` | [aistudio.google.com](https://aistudio.google.com/apikey) |
| **Groq** | `groq` | `https://api.groq.com/openai/v1` | [console.groq.com](https://console.groq.com/keys) |
| **Mistral** | `mistral` | `https://api.mistral.ai/v1` | [console.mistral.ai](https://console.mistral.ai) |
| **Moonshot / Kimi** | `moonshot` | `https://api.moonshot.ai/v1` | [platform.moonshot.cn](https://platform.moonshot.cn) |
| **MiniMax** | `minimax` | `https://api.minimax.io/v1` | [platform.minimaxi.com](https://platform.minimaxi.com) |
| **Zhipu AI (GLM)** | `zhipu` | `https://open.bigmodel.cn/api/paas/v4` | [open.bigmodel.cn](https://open.bigmodel.cn) |
| **DashScope (Qwen)** | `dashscope` | `https://dashscope.aliyuncs.com/compatible-mode/v1` | [dashscope.console.aliyun.com](https://dashscope.console.aliyun.com) |
| **VolcEngine** | `volcengine` | `https://ark.cn-beijing.volces.com/api/v3` | [console.volcengine.com](https://console.volcengine.com) |
| **VolcEngine Coding** | `volcengine_coding_plan` | `https://ark.cn-beijing.volces.com/api/coding/v3` | [console.volcengine.com](https://console.volcengine.com) |
| **BytePlus** | `byteplus` | `https://ark.ap-southeast.bytepluses.com/api/v3` | [console.byteplus.com](https://console.byteplus.com) |
| **BytePlus Coding** | `byteplus_coding_plan` | `https://ark.ap-southeast.bytepluses.com/api/coding/v3` | [console.byteplus.com](https://console.byteplus.com) |
| **Ollama** (local) | `ollama` | `http://localhost:11434/v1` | - |
| **vLLM** (local) | `vllm` | `http://localhost:8000/v1` | - |
| **LM Studio** (local) | `lmstudio` | `http://localhost:1234/v1` | - |

> Any OpenAI-compatible endpoint works — just set `api_url` and `api_key` directly.

All Configuration Parameters

| Section       | Parameter                  | Default     | Description                                         |
| ------------- | -------------------------- | ----------- | --------------------------------------------------- |
| **web**       | `host`                     | `127.0.0.1` | Dashboard bind address                              |
|               | `port`                     | `8765`      | Dashboard port                                      |
| **llm**       | `provider`                 | -           | LLM provider name (see table above)                 |
|               | `api_key`                  | -           | API key for the provider                            |
|               | `api_url`                  | *(auto)*    | Custom endpoint; auto-set per provider if omitted   |
|               | `model`                    | -           | Model name (provider-specific)                      |
|               | `max_calls`                | `0`         | Max LLM calls per cycle (`0` = unlimited; set to limit costs) |
|               | `max_images_per_cluster`   | `5`         | Max screenshots sent per event cluster              |
| **filter**    | `window_min_dwell`         | `3.0`       | Min window dwell time (sec) before recording        |
|               | `keyboard_cluster_gap`     | `3.0`       | Keyboard event clustering gap (sec)                 |
|               | `mouse_cluster_gap`        | `3.0`       | Time gap (sec) to merge mouse events; **larger values reduce LLM summaries** |
| **summarize** | `language`                 | `en`        | Summary output language (`en`, `zh`, etc.)          |
|               | `max_tokens_l0`–`l3`       | `1200`      | Max tokens per tree level (L0=Action … L3=Session)  |
|               | `temperature`              | `0.4`       | LLM temperature for summarization                   |
|               | `max_workers`              | `2`         | Concurrent summarization workers                    |
|               | `debounce_sec`             | `3.0`       | Debounce before triggering summary                  |
|               | `save_interval_sec`        | `5.0`       | Tree auto-save interval                             |
| **retrieve**  | `max_prompt_chars`         | `42000`     | Max chars in retrieval prompt                       |
|               | `max_iterations`           | `15`        | Max tree traversal iterations                       |
|               | `max_file_chars`           | `8000`      | Max chars from extracted files                      |
|               | `max_select_nodes`         | `7`         | Max nodes selected per iteration                    |
|               | `max_tokens_step`          | `4096`      | Max tokens per retrieval step                       |
|               | `max_tokens_answer`        | `8192`      | Max tokens for final answer                         |
|               | `temperature_select`       | `0.3`       | Temperature for node selection                      |
|               | `temperature_answer`       | `0.5`       | Temperature for answer generation                   |
|               | `temperature_time_resolve` | `0.1`       | Temperature for time resolution                     |
|               | `max_tokens_time_resolve`  | `1000`      | Max tokens for time resolution                      |
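
To see why a larger `mouse_cluster_gap` reduces the number of summaries, consider a simple gap-based clustering of event timestamps: consecutive events closer together than the gap fall into the same cluster, and each cluster would be summarized once. The function below is an illustrative sketch, not CatchMe's implementation; in particular, treating a gap exactly equal to the threshold as "same cluster" is an assumption.

```python
def cluster_by_gap(timestamps: list[float], gap: float = 3.0) -> list[list[float]]:
    """Group timestamps (seconds) so that neighbours within `gap` share a cluster."""
    clusters: list[list[float]] = []
    for t in sorted(timestamps):
        if clusters and t - clusters[-1][-1] <= gap:
            clusters[-1].append(t)  # close enough to the previous event: merge
        else:
            clusters.append([t])    # too far apart: start a new cluster
    return clusters


clicks = [0.0, 1.0, 2.5, 10.0, 11.0, 30.0]
print(len(cluster_by_gap(clicks, gap=3.0)))   # 3 clusters
print(len(cluster_by_gap(clicks, gap=10.0)))  # 2 clusters
```

With the same six events, raising the gap from 3 to 10 seconds drops the cluster count from three to two, which means fewer summarization calls.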

## 🚀 Get Started

### 📦 Install

```bash
git clone https://github.com/HKUDS/catchme.git && cd catchme
conda create -n catchme python=3.11 -y && conda activate catchme
pip install -e .
```

> **macOS** — grant *Accessibility*, *Input Monitoring*, *Screen Recording* in System Settings → Privacy & Security

> **Windows** — run as Administrator for global input monitoring

### ⚡ Init

```bash
catchme init                  # interactive setup: provider, API key, LLM model
```

### 🔥 Run

```bash
catchme awake                 # start recording
catchme web                   # visualize and chat
# or query through the CLI
catchme ask -- "What am I doing today?"
```

Full CLI Reference

| Command                     | Description                                            |
| --------------------------- | ------------------------------------------------------ |
| `catchme awake`             | Start the recording daemon                             |
| `catchme web [-p PORT]`     | Launch web dashboard (default `http://127.0.0.1:8765`) |
| `catchme ask -- "question"` | Query your activity in natural language                |
| `catchme cost`              | Show LLM token usage (last 10 min / today / all time)  |
| `catchme disk`              | Show storage breakdown & event count                   |
| `catchme ram`               | Show memory usage of running processes                 |
| `catchme init`              | Interactive setup: LLM provider, API key & model       |
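
If you want to call the CLI from your own scripts, a thin wrapper around `catchme ask` might look like the sketch below. The helper names and the timeout are hypothetical, and the sketch assumes the command prints its answer to standard output, which this README does not confirm.

```python
import shutil
import subprocess


def build_ask_command(question: str) -> list[str]:
    """Build the argv for `catchme ask -- "question"` from the CLI reference."""
    return ["catchme", "ask", "--", question]


def ask(question: str, timeout: float = 120.0) -> str:
    """Run the query and return whatever the CLI printed (assumed to be the answer)."""
    if shutil.which("catchme") is None:
        raise RuntimeError("catchme is not on PATH; install it and run `catchme awake` first")
    result = subprocess.run(
        build_ask_command(question),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout.strip()


# Example (requires a running CatchMe installation):
# print(ask("What am I doing today?"))
```

Passing the question as a separate list element avoids shell quoting problems when it contains quotes or special characters.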

## 🦞 CatchMe Makes Your Agents Truly Personal

CatchMe ships as an agent-compatible skill for CLI agents (OpenClaw, NanoBot, Claude, Cursor, etc.).

**🪶 Agent Integration:**

**Option A: Light Skill** (you run CatchMe independently; your agent queries memories via CLI commands only):

```bash
# 1. Start CatchMe yourself
catchme awake
# 2. Give the light skill to your agent
cp CATCHME-light.md ~/.cursor/skills/catchme/SKILL.md
```

**Option B: Full Skill** (agent manages the full CatchMe lifecycle autonomously):

```bash
cp CATCHME-full.md ~/.cursor/skills/catchme/SKILL.md
```

### 🔧 Integrate into your current workflow

```python
from catchme import CatchMe
from catchme.pipelines.retrieve import retrieve

# 1. One-line search: fast keyword lookup over all recorded activity
with CatchMe() as mem:
    for e in mem.search("meeting notes"):
        print(e.timestamp, e.data)

# 2. LLM-powered retrieval: natural language Q&A over your screen history
for step in retrieve("What was I working on this morning?"):
    if step["type"] == "answer":
        print(step["content"])
```

## 📊 Cost & Efficiency

*Benchmarked with **2 hours of intensive, continuous computer use** on MacBook Air M4.*

| Metric                                         | Value                                                                          |
| ---------------------------------------------- | ------------------------------------------------------------------------------ |
| **Runtime RAM**                                | ~0.2 GB                                                                        |
| **Disk Usage**                                 | ~200 MB                                                                        |
| **Token Throughput**                           | input ~6 M, output ~0.7 M                                                      |
| **LLM cost** (`qwen-3.5-plus`)                 | ~$0.42 via [Aliyun DashScope](https://home.console.aliyun.com/home/dashboard/) |
| **LLM cost** (`gemini-3-flash-preview`)        | ~$5.00 via [OpenRouter](https://openrouter.ai/models)                          |
| **Full Retrieval Speed** (depends on question) | 5-20 s per query using `gemini-3-flash-preview`                                |
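
The benchmark figures can be turned into per-hour and per-token rates with a few lines of arithmetic. The sketch below only uses the numbers from the table; the variable and function names are illustrative, and real usage will vary with how intensively the machine is used.

```python
# Figures from the 2-hour benchmark above.
HOURS = 2.0
INPUT_TOKENS_M = 6.0    # millions of input tokens
OUTPUT_TOKENS_M = 0.7   # millions of output tokens
costs_usd = {"qwen-3.5-plus": 0.42, "gemini-3-flash-preview": 5.00}


def per_hour(cost: float) -> float:
    """Average LLM cost per hour of recorded activity."""
    return cost / HOURS


def per_million_tokens(cost: float) -> float:
    """Blended cost per million tokens (input and output combined)."""
    return cost / (INPUT_TOKENS_M + OUTPUT_TOKENS_M)


for model, cost in costs_usd.items():
    print(f"{model}: ${per_hour(cost):.2f}/hour, "
          f"${per_million_tokens(cost):.3f} per million tokens")
```

At the benchmarked intensity this works out to roughly $0.21 per hour with `qwen-3.5-plus` and $2.50 per hour with `gemini-3-flash-preview`.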

## 🚀 Roadmap

CatchMe evolves with community input. Upcoming features include:

**Multi-Device Recording**. Capture and unify GUI activities across all your machines via LAN synchronization.

**Dynamic Clustering**. Adaptive clustering algorithms that better reflect your actual work patterns and flows, reducing unnecessary costs.

**Enhanced Data Utilization**. Unlock deeper insights from screenshots and metadata beyond current processing pipelines.

> 🌟 **Star this repo** to follow our future updates — your interest keeps us motivated!

We welcome contributions of any kind - whether it's a comment, a bug report, a feature idea, or a pull request. See [CONTRIBUTING.md](CONTRIBUTING.md) to get started.

## 🤝 Community

### Acknowledgments

CatchMe is inspired by these excellent open-source projects:

| Project                                                         | Inspiration                                           |
| --------------------------------------------------------------- | ----------------------------------------------------- |
| [ActivityWatch](https://github.com/ActivityWatch/activitywatch) | Pioneering open-source activity tracking              |
| [Screenpipe](https://github.com/mediar-ai/screenpipe)           | Screen recording infrastructure for AI agents         |
| [Windrecorder](https://github.com/Antonoko/Windrecorder)        | Personal screen recording & search on Windows         |
| [OpenRecall](https://github.com/openrecall/openrecall)          | Open-source alternative to Windows Recall             |
| [Selfspy](https://github.com/selfspy/selfspy)                   | Classic daemon-style activity logging                 |
| [PageIndex](https://github.com/HKUDS/PageIndex)                 | Tree-structured document retrieval without embeddings |
| [MineContext](https://github.com/volcengine/MineContext)        | Proactive context-aware AI partner & screen capture   |

### 🏛️ Ecosystem

CatchMe is part of the **[HKUDS](https://github.com/HKUDS)** agent ecosystem — building the infrastructure layer for personal AI agents:

  

    

- **NanoBot**: Ultra-Lightweight Personal AI Assistant
- **CLI-Anything**: Making All Software Agent-Native
- **ClawWork**: AI Assistant → AI Coworker Evolution
- **ClawTeam**: Agent Swarm Intelligence for Full Team Automation

    

  






  Thanks for visiting ✨ CatchMe