{"id":48451780,"url":"https://github.com/dcdhameliya/agentdroid","last_synced_at":"2026-04-13T08:00:33.183Z","repository":{"id":349619083,"uuid":"1200355314","full_name":"dcdhameliya/AgentDroid","owner":"dcdhameliya","description":"High-output Android AI agent framework for autonomous UI/UX automation, vision-based reasoning, self-healing tests, and Appium code generation.","archived":false,"fork":false,"pushed_at":"2026-04-13T05:58:59.000Z","size":109,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-04-13T08:00:25.762Z","etag":null,"topics":["ai-agent","android","appium","automation","autonomous-agents","developer-tools","gemini","mcp","self-healing","testing","vision-llm"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/dcdhameliya.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-04-03T10:09:59.000Z","updated_at":"2026-04-13T05:59:03.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/dcdhameliya/AgentDroid","commit_stats":null,"previous_names":["dcdhameliya/agentdroid"],"tags_count":2,"template":false,"template_full_name":null,"purl":"pkg:github/dcdhameliya/AgentDroid","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dcdhameliya%2FAgentDroid","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dcdhamel
iya%2FAgentDroid/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dcdhameliya%2FAgentDroid/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dcdhameliya%2FAgentDroid/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/dcdhameliya","download_url":"https://codeload.github.com/dcdhameliya/AgentDroid/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dcdhameliya%2FAgentDroid/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31744404,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-13T06:26:45.479Z","status":"ssl_error","status_checked_at":"2026-04-13T06:26:44.645Z","response_time":93,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai-agent","android","appium","automation","autonomous-agents","developer-tools","gemini","mcp","self-healing","testing","vision-llm"],"created_at":"2026-04-06T21:01:18.704Z","updated_at":"2026-04-13T08:00:33.171Z","avatar_url":"https://github.com/dcdhameliya.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# AgentDroid\n\nAgentDroid is an open-source AI agent framework focused on Android app control, Android emulator automation, and mobile-engineering workflows.\n\nI first created this framework for my personal use while working on a 
large-scale Android app to automate user flows, UI/UX consistency, and end-to-end behavioral testing. Recently, I enhanced it by integrating concepts and routing from [claw-code](https://github.com/ultraworkers/claw-code).\n\nIt transforms your computer into an AI-powered control center for Android, supporting everything from natural language automation to advanced debugging and script generation.\n\n---\n\n## 🚀 What AgentDroid can do for your Android project\n\nAgentDroid is designed to handle the most tedious parts of Android development and QA. Here is how it can supercharge your project:\n\n- **Multi-Model Intelligence**: Native support for **Gemini 2.0**, **Ollama (local)**, and various AI CLIs (**Codex, Qwen, OpenCode**). You aren't locked into one model; use the best \"brain\" for your specific task.\n- **Vision-Based Reasoning**: Most automation tools fail on \"flat\" UI hierarchies (like Flutter, Compose, or Games). AgentDroid captures and analyzes real-time **screenshots** using Computer Vision, allowing it to \"see\" icons, images, and custom elements that have no XML text labels.\n- **The Script-Writer (Appium Code Gen)**: Transform AI-driven exploration into permanent code. Once the agent successfully completes a task, it can export the entire flow as a production-ready **Appium Python script**. This allows you to use AI for discovery and standard code for your CI/CD pipeline.\n- **Deep-Link Teleportation**: Stop wasting tokens and time clicking through 10 screens. AgentDroid automatically scans your app's manifest via `dumpsys` to discover hidden **Intent Filters** and **Deep Links**, allowing it to \"teleport\" instantly to specific Activities.\n- **Self-Healing Tests**: End the \"broken test\" nightmare. 
If an existing test script fails due to a UI change (like a renamed ID), the AgentDroid **Healing Agent** runs the script, captures the crash, analyzes the visual state, and proposes the exact code fix to restore your test suite.\n- **Layout Inspector \u0026 Mirroring**: Deep-dive into UI bugs with a single command. It toggles \"Show Layout Bounds\" on the physical device and launches a high-performance `scrcpy` mirror, giving engineers a \"skeleton view\" of the app layout on their desktop.\n- **Action Recorder**: Record complex manual sequences on a physical device and save them as reusable **AI Skills**. You can \"show\" the agent how to perform a corporate login flow once, and it will remember how to do it forever.\n- **MCP Native**: Fully compatible with the **Model Context Protocol**, turning your Android phone into a set of native tools inside **Claude Code**, **Gemini**, and **Codex**.\n\n---\n\n## 📈 Engineering Output \u0026 Resource Sufficiency\n\nAgentDroid is engineered for professional environments where time is the most expensive resource and token efficiency is critical. It moves beyond simple \"prompt-and-wait\" testing to a self-sufficient, autonomous framework:\n\n- **Exponential Output Increase**: Stop writing boilerplate. By converting natural language goals into production-ready **Appium/Pytest code**, AgentDroid allows a single engineer to build and maintain a massive test suite in a fraction of the time. It handles the \"how\" (selectors, waits, transitions), so you focus on the \"what\" (user value).\n- **Up to 80% Lower Token Consumption**: Prompt-based testing is notoriously expensive, as models must \"scan\" and \"think\" through every screen transition. AgentDroid's **Deep-Link Teleportation** bypasses manual navigation entirely. 
By jumping directly to the target Activity, it eliminates redundant reasoning steps, saving up to **80% in token costs** per task.\n- **Self-Sufficient Correction**: Traditional agents fail silently or get stuck in loops. AgentDroid's **Vision-First Healing** and **Logcat Sentinel** act as an \"immune system\" for your automation. It doesn't just report a failure; it analyzes the visual delta and log signatures to propose an immediate fix, helping keep your pipelines green with minimal human intervention.\n- **Cognitive Offloading via Reusable Skills**: Use the AI for the hard part: discovery. Once a complex path is identified, save it as a **Skill**. Subsequent runs execute with zero-token overhead for those segments, ensuring your automation is both high-speed and cost-predictable.\n\n---\n\n## 📥 Installation\n\nYou can install AgentDroid tools into your favorite AI environment.\n\n### 1. Gemini CLI (Native Extension)\n```bash\ngemini extension install https://github.com/dcdhameliya/AgentDroid\n```\n\n### 2. OpenAI Codex CLI (MCP)\nCodex does not use the old `/plugins install` command. Register AgentDroid as an MCP server from a local checkout:\n```bash\ngit clone https://github.com/dcdhameliya/AgentDroid\ncd AgentDroid\npython3 -m venv .venv\n.venv/bin/pip install -e .\ncodex mcp add agentdroid --env PYTHONPATH=\"$(pwd)\" -- \"$(pwd)/.venv/bin/python\" -m agentdroid_mcp.server\ncodex mcp get agentdroid\n```\n\n### 3. Qwen CLI (Native Extension)\n```bash\nqwen extension add https://github.com/dcdhameliya/AgentDroid -y\n```\n\n### 4. Oh-My-OpenAgent (OpenCode CLI)\n```bash\nopencode plugin https://github.com/dcdhameliya/AgentDroid\n```\n\n### 5. Claude Code (MCP)\nAsk Claude:\n`\"Install the MCP server from https://github.com/dcdhameliya/AgentDroid\"`\n\n---\n\n## 📖 Detailed Examples \u0026 Prompts\n\n### 1. 
Autonomous Task Execution\n**Command**: `agentdroid run \"task\" --provider gemini`\n*   **Prompt**: `\"Open the Settings app, find the 'Battery' section, and tell me the current percentage.\"`\n*   **Description**: The agent will launch the Settings app, use its **Vision** to find the Battery menu (even if the XML ID is obscure), click through, and report the status back to the CLI.\n\n### 2. Exporting AI Flows to Code (Production Automation)\n**Command**: `agentdroid run \"task\" --export \u003cfilename\u003e.py`\n*   **Prompt**: `\"Navigate to the profile screen, change the username to 'AgentDroidUser', and save.\"`\n*   **Description**: After the AI successfully navigates and saves the name, it will generate a file (e.g., `update_profile.py`) containing the **Appium Python** code needed to repeat this action in an automated test suite.\n\n### 3. UI/UX Consistency \u0026 Accessibility Audit\n**Command**: `agentdroid run \"audit screen\"`\n*   **Prompt**: `\"Analyze the current screen for UI inconsistencies. Check if all buttons have content descriptions and if the color contrast of the 'Login' button looks correct.\"`\n*   **Description**: The agent captures a screenshot and XML, then uses its **Vision Reasoning** to perform a heuristic audit. It identifies \"missing contentDescription\" in the XML and \"low contrast\" from the image, providing a detailed engineering report.\n\n### 4. Zero-Cost Local Engineering (Ollama)\n**Command**: `agentdroid run \"task\" --local --model llama3`\n*   **Prompt**: `\"Clear the cache for the 'Chrome' app and restart it.\"`\n*   **Description**: This workflow runs entirely on your local machine using **Ollama**. No data leaves your network, and no API tokens are consumed. It is well suited to internal corporate apps with sensitive data.\n\n### 5. 
Advanced Hybrid Routing (Claw-Code Integration)\n**Command**: `agentdroid claw-run \"Find the deep link for the advanced display settings\"`\n*   **Description**: This uses the **claw-code** integration to first route the high-level intent through a sophisticated planning layer. It breaks down the task into \"Discover Manifest -\u003e Verify Activity -\u003e Teleport\", ensuring the most direct and token-efficient path is taken.\n\n### 6. Self-Healing a Broken Script\n**Command**: `agentdroid heal path/to/my_test.py`\n*   **Description**: If `my_test.py` crashes because a button moved or an ID changed, the Healing Agent will run the script, capture the `NoSuchElementException`, take a screenshot of the current screen, and say: *\"I see the 'Submit' button now has ID 'btn_login_v2' instead of 'login_button'. Here is the fix...\"*\n\n### 7. Engineering Layout Inspection (Visual Debugging)\n**Command**: `agentdroid inspect`\n*   **Description**: This command is for human engineers. It instantly turns on \"Show Layout Bounds\" on your phone and opens a `scrcpy` window. You can see the margins, padding, and alignment of every UI component in real-time. Press **Enter** in the terminal to turn it off and clean up.\n\n---\n\n## 🛠️ Manual Installation (For Local Dev)\n\n### Prerequisites\n- **Python 3.10+**\n- **Android SDK Platform-Tools** (`adb`)\n- **scrcpy** (optional, for screen mirroring: `brew install scrcpy`)\n\n### Setup\n1. **Clone the repository**:\n   ```bash\n   git clone https://github.com/dcdhameliya/AgentDroid.git\n   cd AgentDroid\n   ```\n\n2. 
**Initialize Environment**:\n   ```bash\n   python3 -m venv .venv\n   source .venv/bin/activate\n   pip install -e .\n   ```\n\n---\n\n## 📂 Project Structure\n- `cli/`: Main entry point.\n- `integrations/`: Manifests and plugins for AI CLIs (Codex, OmO, MCP).\n- `agent/`: Autonomous runtime, self-healing, and vision logic.\n- `android/`: ADB wrappers and manifest analysis.\n- `mcp/`: Claude/Gemini-compatible MCP server.\n- `tools/`: Script writers, recorders, and teleportation tools.\n- `vendor/`: Integrated libraries (including Claw-Code).\n\n## 📄 License\n\nCopyright (C) 2026 dcdhameliya - Licensed under [GPL-3.0](LICENSE.txt)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdcdhameliya%2Fagentdroid","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdcdhameliya%2Fagentdroid","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdcdhameliya%2Fagentdroid/lists"}