https://github.com/k-l-lambda/anthroid
An agentic android app
- Host: GitHub
- URL: https://github.com/k-l-lambda/anthroid
- Owner: k-l-lambda
- License: other
- Created: 2025-12-20T14:57:42.000Z (6 months ago)
- Default Branch: main
- Last Pushed: 2026-03-09T10:08:01.000Z (4 months ago)
- Last Synced: 2026-03-09T12:50:27.936Z (4 months ago)
- Language: JavaScript
- Size: 99.6 MB
- Stars: 2
- Watchers: 0
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE.md
- Security: SECURITY.md
README
# Anthroid
Anthroid is a native Android app for **mobile agentic workflows**: a Claude Code-style agent that can work with your phone's native capabilities (camera, voice, clipboard, notifications, app/URL launching, location/calendar, screenshots, and optional UI automation).
## Demos
### Cross-App Automation
Launch other apps and automate multi-step tasks. Here the agent opens a shopping app, searches for products, and navigates the interface.
### Remote Server Monitoring
Use voice commands to check server status. The agent connects via SSH, runs diagnostics, and reports back.
### Document Analysis & Local File Management
Take a photo or pick one from the gallery, then ask the agent to analyze it. Here it reads insurance documents and creates a comparison report. The agent can also organize, rename, and move files on your device, tasks that are typically cumbersome on mobile because of limited input methods.
## When it helps
- **On-call / incident triage away from your desk**: run quick diagnostics in a terminal, keep a chat thread with context, and resume the conversation later.
- **Field work & device setup**: scan QR codes to avoid retyping tokens/URLs, open links on-device, and copy/paste between apps.
- **Hands-busy workflows**: dictate messages with **offline** speech recognition so you can keep moving even with poor network.
- **"What am I looking at?" troubleshooting**: attach photos/screenshots for visual context, then follow a short step-by-step checklist.
- **Phone-use automation**: copy key info to clipboard, open deep links, launch apps, set reminders via notifications, and combine device context (location/calendar) with instructions.
## Overview
Anthroid (Android + Anthropic) is a mobile implementation of [Claude Code](https://docs.anthropic.com/claude-code), designed around mobile-native input methods and device capabilities.
### Architecture Comparison
Unlike typical mobile AI apps that merely serve as a frontend to remote agent systems, Anthroid runs the **full agent logic locally** on your device:
**Typical Mobile AI Apps**

```mermaid
flowchart TB
    subgraph phone["📱 Mobile Device"]
        ui["UI Only"]
    end
    subgraph server["☁️ Remote Server"]
        agent["Agent Logic"]
        tools["Tool Executor"]
    end
    subgraph cloud["☁️ LLM Provider"]
        api["LLM API"]
    end
    ui <-->|"Every action"| agent
    agent --> tools
    agent <--> api
```

- ❌ Tools run on remote server
- ❌ Limited device access
- ❌ Network latency for every action

**Anthroid**

```mermaid
flowchart TB
    subgraph phone["📱 Mobile Device"]
        subgraph runtime["Agent Runtime"]
            tools["Tool Executor<br/>Bash / Files / Camera / Voice"]
        end
    end
    subgraph cloud["☁️ Cloud"]
        api["LLM API<br/>(Claude / OpenAI compatible)"]
    end
    runtime <-->|"Inference requests"| api
```

- ✅ Tools execute locally
- ✅ Full device access
- ✅ Only LLM inference calls remote API
- ✅ Supports any OpenAI-compatible endpoint

This means Anthroid can directly access your camera, files, clipboard, and other device capabilities without round-tripping through a remote server.
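
The split described above, where tools run on the device and only inference goes over the network, can be sketched as a small loop. This is an illustration rather than Anthroid's actual code: `callModel`, the tool names, and the message shapes are all assumptions, and the model is stubbed so the example runs offline.

```javascript
// Illustrative sketch only: names and message shapes are hypothetical,
// not taken from Anthroid's source.

// Tools run in-process on the device; nothing here touches the network.
const localTools = new Map([
  ["clipboard_read", () => "token-123"],
  ["echo", ({ text }) => text],
]);

// Stand-in for the remote LLM call (the only step that would use the network).
// It requests one tool call, then answers using the tool result.
async function callModel(messages) {
  const last = messages[messages.length - 1];
  if (last.role === "tool") {
    return { type: "text", text: `Clipboard contains: ${last.content}` };
  }
  return { type: "tool_use", name: "clipboard_read", input: {} };
}

// Agent loop: ask the model, run any requested tool locally, feed the result
// back, and stop when the model returns plain text.
async function runAgent(userText, model = callModel, maxSteps = 5) {
  const messages = [{ role: "user", content: userText }];
  for (let step = 0; step < maxSteps; step++) {
    const reply = await model(messages);
    if (reply.type === "text") return reply.text;
    const tool = localTools.get(reply.name);
    const result = tool ? String(tool(reply.input)) : `unknown tool: ${reply.name}`;
    messages.push({ role: "assistant", content: `[tool_use ${reply.name}]` });
    messages.push({ role: "tool", content: result });
  }
  throw new Error("agent did not finish within maxSteps");
}
```

In a real runtime, `callModel` would be an HTTP request to the configured LLM endpoint, while everything inside `localTools` would stay on the phone.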
**Design goals**
- **Device tools**: access to location, calendar, clipboard, notifications, URL/app launching, and more.
- **Mobile input**: voice dictation (offline `sherpa-onnx`) and camera capture/QR scanning.
- **Agent-style automation**: optional overlay shows what the agent is doing when it operates outside the app.
- **Terminal at hand**: a full bash environment for advanced workflows.
## Features
### Chat Interface
- Native Android chat UI with streaming responses
- **Markdown rendering** - Tables, bold, italic, code blocks, links
- **Conversation history** - Resume past conversations
- Light blue user bubbles, gray assistant bubbles
### Voice Input
- **Offline speech recognition** using sherpa-onnx
- Supports Chinese, English, Japanese, Korean, Cantonese
- Press-and-hold microphone button to speak
- Real-time transcription display
### Camera & Vision
- Take photos to add visual context to messages
- Gallery picker for existing images
- Multiple images per message
- **QR code scanning** with instant text insertion and clipboard copy
### AI Agent Tools
Claude can execute tools on your device:
| Tool | Description |
|------|-------------|
| Bash | Run terminal commands |
| Read/Write/Edit | File operations |
| Glob/Grep | Search files and content |
| Notification | Show Android notifications |
| Clipboard | Read/write clipboard |
| Open URL | Launch browser |
| Launch App | Open installed apps |
| Location | Get GPS coordinates |
| Calendar | Query calendar events |
| Screenshot | Capture device screen |
| Screen Tap/Swipe | UI automation |
### Screen Automation Overlay
When Claude launches other apps or performs actions outside Anthroid:
- **Floating banner** appears at the top of screen showing agent status
- **Streaming text** displays what Claude is currently doing
- **Stop button** to cancel the operation at any time
- **Auto-hides** after task completion (tap to return to Anthroid)
- Requires overlay permission (Draw over other apps)
### OpenClaw Agent Runtime (v1.0)
Local [OpenClaw](https://github.com/k-l-lambda/openclaw) agent with full tool-use, model selection, and context management:
- **pi-embedded-runner** bundled in APK - no external agent server needed
- **60+ Android tools** via MCP bridge (accessibility, camera, clipboard, notifications, etc.)
- **File-based memory** - `MEMORY.md` + `memory/*.md` injected into system prompt, persists across sessions
- **Any LLM provider** - Anthropic, OpenAI-compatible APIs, or proxy endpoints (e.g. PPIO)
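
As a rough illustration of the file-based memory item above, the sketch below concatenates `MEMORY.md` and `memory/*.md` into a system prompt. Only the file names come from this README; the function name, section headers, and alphabetical ordering are assumptions, and the real runtime may assemble its prompt differently.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Append MEMORY.md and memory/*.md (sorted by name) to a base prompt.
// The "## ..." section headers and the ordering are assumptions for this sketch.
function buildSystemPrompt(workspaceDir, basePrompt) {
  const sections = [basePrompt];

  const mainFile = path.join(workspaceDir, "MEMORY.md");
  if (fs.existsSync(mainFile)) {
    sections.push(`## MEMORY.md\n${fs.readFileSync(mainFile, "utf8").trim()}`);
  }

  const memoryDir = path.join(workspaceDir, "memory");
  if (fs.existsSync(memoryDir)) {
    const names = fs.readdirSync(memoryDir).filter((n) => n.endsWith(".md")).sort();
    for (const name of names) {
      const body = fs.readFileSync(path.join(memoryDir, name), "utf8").trim();
      sections.push(`## memory/${name}\n${body}`);
    }
  }

  return sections.join("\n\n");
}
```

Because the memory lives in plain files, anything the agent writes to them in one session is picked up the next time the prompt is built.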
### Gateway Integration (Optional)
Connect to an OpenClaw gateway server for distributed agent features:
- **Memory sync** - automatic pull/push of `memory/` files on session start/end (git-based incremental)
- **Profile sync** - `/sync-profile` command pulls agent identity, skills, and config files
- **Remote agent view** - view and interact with OpenClaw sessions and SSH+tmux sessions on remote servers
- **Pending message delivery** - 60s polling for timed messages from gateway agents
- **System notifications** - IM-like notifications for gateway messages when app is backgrounded
- **Gateway Settings UI** - configure host/port/token in Settings, or use `/set-gateway` command / QR code
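
The pending-message polling listed above could look roughly like the following. The 60-second interval is from this README; `fetchPending`, the `{ id, text }` message shape, and de-duplication by id are assumptions made for the sketch, not the gateway's documented protocol.

```javascript
// Hypothetical sketch: `fetchPending` stands in for whatever request the app
// makes to the gateway; the message shape ({ id, text }) is an assumption.
function createPoller(fetchPending, onMessage, intervalMs = 60_000) {
  const seen = new Set(); // ids already delivered, so repeats are not shown twice
  let timer = null;

  // Fetch once and deliver only messages that have not been seen before.
  async function pollOnce() {
    const messages = await fetchPending();
    let delivered = 0;
    for (const msg of messages) {
      if (seen.has(msg.id)) continue;
      seen.add(msg.id);
      onMessage(msg);
      delivered++;
    }
    return delivered;
  }

  return {
    pollOnce,
    start() {
      if (timer === null) {
        timer = setInterval(() => pollOnce().catch(console.error), intervalMs);
      }
    },
    stop() {
      if (timer !== null) clearInterval(timer);
      timer = null;
    },
  };
}
```

Keeping `pollOnce` separate from the timer makes it possible to trigger a check immediately, for example when the app returns to the foreground.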
### Quick Send Candidates
- Frequently used short messages appear as chips above the input bar
- **Tap** to send immediately, **long-press** to insert into input for editing
- Automatically tracked by frequency (threshold: 5 uses)
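
A minimal sketch of the frequency tracking described above. Only the threshold of 5 uses is stated in this README; the class name, the length limit for "short" messages, and the cap on the number of chips are assumptions.

```javascript
// Sketch of frequency-based candidates. Only the threshold of 5 uses comes
// from the README; maxLength and maxChips are assumed values.
class QuickSendTracker {
  constructor({ threshold = 5, maxLength = 20, maxChips = 4 } = {}) {
    this.threshold = threshold;
    this.maxLength = maxLength;
    this.maxChips = maxChips;
    this.counts = new Map();
  }

  // Record a sent message; empty or long messages are ignored.
  record(message) {
    const text = message.trim();
    if (text === "" || text.length > this.maxLength) return;
    this.counts.set(text, (this.counts.get(text) || 0) + 1);
  }

  // Messages used at least `threshold` times, most frequent first.
  candidates() {
    return [...this.counts.entries()]
      .filter(([, count]) => count >= this.threshold)
      .sort((a, b) => b[1] - a[1])
      .slice(0, this.maxChips)
      .map(([text]) => text);
  }
}
```

A message sent four times stays hidden; the fifth use promotes it to a chip.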
### Terminal Environment
Built-in Linux terminal for advanced users:
- Full bash shell environment
- Package manager (apt/pkg)
- Node.js, Python, and more available
## Installation
### Download APK
Get the latest release from [GitHub Releases](https://github.com/k-l-lambda/anthroid/releases).
### Build from Source
```bash
# Clone repository
git clone https://github.com/k-l-lambda/anthroid.git
cd anthroid
# Build debug APK
./gradlew assembleDebug
# Install on device
adb install app/build/outputs/apk/debug/anthroid-app_apt-android-7-debug_arm64-v8a.apk
```
### Requirements
- Android 7.0+ (API 24)
- ARM64 device recommended
- ~200MB storage (+ optional 239MB for voice model)
## Setup
### API Configuration
1. Get your Claude API key from [Anthropic Console](https://console.anthropic.com/)
2. In Anthroid, go to **Settings** > **API Configuration**
3. Enter your API key and base URL
Or use QR code for quick setup:
1. Generate QR code with API credentials
2. Open Camera > QR scan mode
3. Scan the QR code
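
For context on why both an API key and a base URL are needed: an OpenAI-compatible endpoint is typically called as `POST {baseUrl}/chat/completions` with a bearer token. The sketch below only builds such a request without sending it. The base URL, key, and model name are placeholders, and this is not necessarily how Anthroid constructs its requests.

```javascript
// Sketch: build (but do not send) a request for an OpenAI-compatible
// chat completions endpoint. All concrete values below are placeholders.
function buildChatRequest({ baseUrl, apiKey, model, messages }) {
  // Tolerate a trailing slash on the configured base URL.
  const url = `${baseUrl.replace(/\/+$/, "")}/chat/completions`;
  return {
    url,
    options: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({ model, messages, stream: true }),
    },
  };
}

const request = buildChatRequest({
  baseUrl: "https://llm.example.com/v1/",
  apiKey: "sk-placeholder",
  model: "example-model",
  messages: [{ role: "user", content: "hello" }],
});
// In Node 18+ this could be sent with: fetch(request.url, request.options)
```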
### Voice Input Setup
1. Go to **Settings** > **Components**
2. Download ASR Model (239MB, one-time)
3. Wait for model initialization
4. Microphone button appears in chat
### Gateway Setup (Optional)
Connect to an OpenClaw gateway for memory sync and remote agent features:
1. Go to **Settings** > **Gateway Connection**
2. Enter host, port, and token
3. Enable the toggle
Or use the chat command: `/set-gateway [token]`
Or scan a QR code generated from `tools/qr-generator.html`.
## Architecture
```
┌──────────────────────────────────────────────────┐
│  Anthroid App (Android)                          │
│                                                  │
│  ┌────────────────────────────────────────────┐  │
│  │  Kotlin Layer                              │  │
│  │  ├─ Chat UI (Markdown, voice, camera)      │  │
│  │  ├─ ClaudeViewModel (CLI/API/OpenClaw)     │  │
│  │  ├─ MCP Server (NanoHTTPD :8765)           │  │
│  │  ├─ Gateway Manager (WebSocket + sync)     │  │
│  │  └─ AndroidTools (60+ device tools)        │  │
│  └──────────────┬─────────────────────────────┘  │
│                 │ HTTP localhost:8765            │
│  ┌──────────────┴─────────────────────────────┐  │
│  │  Node.js Layer (OpenClaw Agent)            │  │
│  │  ├─ pi-embedded-runner (agent runtime)     │  │
│  │  ├─ android-tools-bridge.mjs (MCP bridge)  │  │
│  │  └─ workspace/ (memory, skills, data)      │  │
│  └────────────────────────────────────────────┘  │
│                                                  │
│  ┌────────────────────────────────────────────┐  │
│  │  Terminal (Termux fork)                    │  │
│  │  bash, Node.js, Python, apt/pkg            │  │
│  └────────────────────────────────────────────┘  │
└──────────────────────────────────────────────────┘
         │                          │
         ▼                          ▼
    ☁️ LLM API              ☁️ OpenClaw Gateway
   (Claude/OpenAI)          (optional, WebSocket)
```
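
To make the link between the Node.js layer and the Kotlin MCP server more concrete, here is a sketch of the JSON-RPC 2.0 message an MCP client sends to call a tool, plus a helper that reads the text out of the result. The `tools/call` method and the result shape follow the MCP specification; the `/mcp` path and the `clipboard_read` tool name are hypothetical, and the actual `android-tools-bridge.mjs` may work differently.

```javascript
// Sketch of an MCP tool call over JSON-RPC 2.0. Only localhost:8765 comes from
// the diagram above; the "/mcp" path and the tool names are hypothetical.
const MCP_URL = "http://localhost:8765/mcp";

let nextId = 1;

// Build the request body for calling a named tool with arguments.
function toolCallRequest(name, args = {}) {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Join the text parts of a tools/call response; throw if the call failed.
function textFromResult(response) {
  if (response.error) throw new Error(response.error.message);
  const { content = [], isError = false } = response.result;
  const text = content
    .filter((part) => part.type === "text")
    .map((part) => part.text)
    .join("\n");
  if (isError) throw new Error(text || "tool call failed");
  return text;
}

// Possible usage in Node 18+ (not executed here):
//   const res = await fetch(MCP_URL, {
//     method: "POST",
//     headers: { "Content-Type": "application/json" },
//     body: JSON.stringify(toolCallRequest("clipboard_read")),
//   });
//   console.log(textFromResult(await res.json()));
```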
### Key Technologies
- **Kotlin** - Primary language
- **OpenClaw pi-embedded-runner** - Local agent runtime with tool-use
- **CameraX** - Camera capture
- **ML Kit** - QR code scanning
- **sherpa-onnx** - Offline speech recognition
- **Markwon** - Markdown rendering
- **OkHttp** - WebSocket for gateway connection
- **BouncyCastle** - Ed25519 device authentication
## License
GPLv3 - Same license as Termux. See [LICENSE.md](LICENSE.md).
## Credits
- [Termux](https://github.com/termux/termux-app) - Terminal emulator foundation
- [Anthropic](https://anthropic.com) - Claude AI
- [sherpa-onnx](https://github.com/k2-fsa/sherpa-onnx) - Speech recognition
- [Markwon](https://github.com/noties/Markwon) - Markdown rendering