Projects in Awesome Lists tagged with computer-use
A curated list of projects in awesome lists tagged with computer-use.
https://github.com/bytedance/UI-TARS-desktop
The Open-Source Multimodal AI Agent Stack: Connecting Cutting-Edge AI Models and Agent Infra
agent agent-tars browser-use computer-use gui-agent gui-operator mcp mcp-server multimodal tars ui-tars vision vlm
Last synced: 06 Oct 2025
https://github.com/bytedance/ui-tars-desktop
A GUI Agent application based on UI-TARS (Vision-Language Model) that allows you to control your computer using natural language.
agent browser-use computer-use electron gui-agents mcp mcp-server vision vite vlm
Last synced: 09 Sep 2025
https://github.com/trycua/cua
Open-source infrastructure for Computer-Use Agents. Sandboxes, SDKs, and benchmarks to train and evaluate AI agents that can control full desktops (macOS, Linux, Windows).
agent ai-agent apple computer-use computer-use-agent containerization cua desktop-automation hacktoberfest lume macos manus operator swift virtualization virtualization-framework windows windows-sandbox
Last synced: 11 Feb 2026
https://github.com/web-infra-dev/midscene
Drives UI automation on all platforms with a vision-based model
ai ai-test browser-use computer-use gpt-operator javascript phone-use testing
Last synced: 09 Feb 2026
https://github.com/upsonic/upsonic
The most reliable AI agent framework that supports MCP.
agent agent-framework claude computer-use llms mcp model-context-protocol openai rag reliability
Last synced: 18 Jan 2026
https://github.com/nanobrowser/nanobrowser
Open-Source Chrome extension for AI-powered web automation. Run multi-agent workflows using your own LLM API key. Alternative to OpenAI Operator.
agent ai ai-agents ai-tools automation browser-extension browser-use chrome-extension computer-use gpt-operator javascript manus multi-agent openai opensource operator web-agent web-automation
Last synced: 13 May 2025
https://github.com/simular-ai/Agent-S
Agent S: an open agentic framework that uses computers like a human
agent-computer-interface ai-agents computer-automation computer-use grounding gui-agents in-context-reinforcement-learning memory mllm planning retrieval-augmented-generation
Last synced: 07 May 2025
https://github.com/a9t9/rpa
Ui.Vision Open-Source RPA Software with Computer Vision, OCR, Anthropic Computer Use/LLM. Selenium IDE import/export.
anthropic anthropic-claude browser-automation browser-extension computer-use data-driven-tests imacros selenium-ide web-automation web-scraping
Last synced: 16 May 2025
https://github.com/openadaptai/openadapt
Open Source Generative Process Automation (i.e. Generative RPA). AI-First Process Automation with Large ([Language (LLMs) / Action (LAMs) / Multimodal (LMMs)] / Visual Language (VLMs)) Models
agents ai-agents ai-agents-framework anthropic computer-use generative-process-automation google-gemini gpt4o huggingface large-action-model large-language-models large-multimodal-models omniparser openai process-automation process-mining python segment-anything transformers ultralytics
Last synced: 18 Jan 2026
https://github.com/showlab/showui
[CVPR 2025] Open-source, End-to-end, Vision-Language-Action model for GUI Agent & Computer Use.
agent computer-use gui-agent vision-language-action vision-language-model
Last synced: 13 Sep 2025
https://github.com/OpenAdaptAI/OpenAdapt
Open Source Generative Process Automation (i.e. Generative RPA). AI-First Process Automation with Large ([Language (LLMs) / Action (LAMs) / Multimodal (LMMs)] / Visual Language (VLMs)) Models
agents ai-agents ai-agents-framework anthropic computer-use generative-process-automation google-gemini gpt4o huggingface large-action-model large-language-models large-multimodal-models omniparser openai process-automation process-mining python segment-anything transformers ultralytics
Last synced: 05 Apr 2025
https://github.com/thudm/cogagent
An open-sourced end-to-end VLM-based GUI Agent
agent computer-use glm gui-agent vlm
Last synced: 15 May 2025
https://github.com/showlab/ShowUI
Open-source, End-to-end, Vision-Language-Action model for GUI Agent & Computer Use.
agent computer-use gui-agent vision-language-action vision-language-model
Last synced: 02 Oct 2025
https://github.com/microsoft/windowsagentarena
Windows Agent Arena (WAA) 🪟 is a scalable OS platform for testing and benchmarking of multi-modal AI agents.
agentic ai ai-agent ai-benchmark ai-research computer computer-use desktop-agent windows
Last synced: 15 May 2025
https://github.com/e2b-dev/open-computer-use
Secure AI computer use powered by E2B Desktop Sandbox
agent ai anthropic claude computer-use llm
Last synced: 18 Jun 2025
https://microsoft.github.io/WindowsAgentArena/
Windows Agent Arena (WAA) 🪟 is a scalable OS platform for testing and benchmarking of multi-modal AI agents.
agentic ai ai-agent ai-benchmark ai-research computer computer-use desktop-agent windows
Last synced: 23 Feb 2025
https://github.com/microsoft/WindowsAgentArena
Windows Agent Arena (WAA) 🪟 is a scalable OS platform for testing and benchmarking of multi-modal AI agents.
agentic ai ai-agent ai-benchmark ai-research computer computer-use desktop-agent windows
Last synced: 11 Sep 2025
https://github.com/bandarlabs/clickclickclick
A framework to enable autonomous Android and computer use using any LLM (local or remote)
agents ai-agents-framework android-automation antrophic computer-use framework gemini generative-ai molmo ollama openai operator python
Last synced: 16 May 2025
https://github.com/testdriverai/testdriverai
Computer-Use SDK for E2E QA Testing
ag agentic-ai agents computer-use e2e e2e-testing javascript test-automation testing testing-tools vitest
Last synced: 11 Feb 2026
https://github.com/inclusionAI/AWorld
Build, evaluate and run General Multi-Agent Assistance with ease
agent-swarm agentic-ai computer-use gym-environment mcp mcp-server phone-use world-model
Last synced: 02 May 2025
https://github.com/baryhuang/mcp-remote-macos-use
The only general AI agent that does NOT require an extra API key, giving you full control of your local and remote macOS machines from the Claude Desktop app
claude-desktop computer-use general-agent macos macos-use mcp-server
Last synced: 12 Jan 2026
https://github.com/inclusionai/aworld
Build, evaluate and run General Multi-Agent Assistance with ease
agent-swarm agentic-ai computer-use gym-environment mcp mcp-server phone-use world-model
Last synced: 03 Feb 2026
https://github.com/jeffrey-zang/opus
for when your fingers are greasy 🪄
computer-use electron opus react tailwind
Last synced: 30 Jun 2025
https://github.com/open-compass/mmbench-gui
Official repo of "MMBench-GUI: Hierarchical Multi-Platform Evaluation Framework for GUI Agents". It can be used to evaluate a GUI agent in a hierarchical manner across multiple platforms, including Windows, Linux, macOS, iOS, Android, and Web.
benchmark-framework computer-use gui-agent vision-language-model
Last synced: 15 Sep 2025
https://github.com/bytebot-ai/bytebot
A containerized framework for computer use agents with a virtual desktop environment.
ai-agents anthropic computer-use docker llm openai qemu
Last synced: 01 Apr 2025
https://github.com/lvqq/intelli-browser
✨ Use natural language to control your browser, powered by LLM and Playwright
anthropic claude claude-3-5-sonnet computer-use e2e e2e-tests playwright
Last synced: 12 Nov 2025
https://github.com/cyberdesk-hq/cyberdesk
Open source virtual desktops for AI agents
ai-agents computer-use fastapi hono kubernetes nextjs terraform virtual-machine
Last synced: 10 Sep 2025
https://github.com/reidbarber/webmarker
Mark web pages for use with vision-language models
claude computer-use computer-using-agent cua gemini gpt4o gpt4v llms operator playwright prompt prompt-engineering qwen-vl set-of-mark som vision-language-model
Last synced: 24 Mar 2025
https://github.com/browser-use/contact-use
✉️ Use the power of browser-use to contact any person or organization... by any means necessary
ai browser-agent browser-use computer-use contact-info-scraper operator sales sales-automation scraping scraping-bot
Last synced: 04 Oct 2025
https://github.com/pnmartinez/simple-computer-use
Open source implementation for computer use, using light OCR models and LLMs. Get Android app in link below.
automation computer-use ocr ollama
Last synced: 21 Jun 2025
https://github.com/SALT-NLP/PopupAttack
Code repo for the paper: Attacking Vision-Language Computer Agents via Pop-ups
attack claude-3-5-sonnet computer-use llm-agent pop-up vision-language-model
Last synced: 23 Feb 2025
https://github.com/philfung/awesome-computer-use
Curated resources about automated GUI computer-use via LLMs. Highly opinionated, focus is on quality vs quantity.
anthropic anthropic-claude computer-use computer-vision gpt-4-vision gui-agents llm rpa rpa-robotic-process-automation tool-use vision
Last synced: 27 Jan 2026
https://github.com/iris-networks/iris
This is the CRUD backend for our QA test application
ai automation computer-use qa-automation-test
Last synced: 10 Oct 2025
https://github.com/philfung/computer-use
Try Computer Use on your Mac with a few clicks
anthropic claude computer-use large-language-models llms macos multimodal-large-language-models
Last synced: 25 Sep 2025
https://github.com/sawyerhood/computer-use-extension
This is OpenAI's computer use hooked up to a Chrome extension.
ai chrome-extension computer-use llm openai
Last synced: 16 Jun 2025
https://github.com/zubax/bro
An LLM computer-using agent (CUA) designed to autonomously perform mundane tasks related to business operations and administration, such as doing accounting, filing paperwork, and submitting applications. The accountant is not your bro, but Bro is.
agent agentic-ai automation computer-use computer-use-agent llm no-code-automation nocode office-automation
Last synced: 10 Oct 2025
https://github.com/ab498/computer-control-mcp
MCP server that provides computer control capabilities, like mouse, keyboard, OCR, etc. using PyAutoGUI, RapidOCR, ONNXRuntime. Similar to 'computer-use' by Anthropic. With Zero External Dependencies.
Last synced: 29 Jun 2025
https://github.com/webhiveos/webhive
Meet WebHive, the AI-powered browser that takes care of tasks for you. No more endless clicks, tell it what you need, and it gets it done.
agent agent-framework assistant chagpt chatgpt-app chatgpt-operator claude computer-use gca gpt gpt-4o langchain llms mcp model-context-protocol openai
Last synced: 13 Apr 2025
https://github.com/archivebox/abx-spec-behaviors
🧩 Proposal to allow user scripts like "expand comments", "hide popups", "fill out this form", etc. to be reusable across pure browser environments, puppeteer, playwright, extensions, AI tools, and many other contexts with minimal adjustment.
abx archivebox automation behaviors browser browsertrix-behaviors claude computer-use crawling digipres ecosystem greasemonkey playwright plugins puppeteer rfp scraping specification tampermonkey tool-use
Last synced: 01 Sep 2025
https://github.com/justmalhar/claude-ubuntu-os
Claude Computer Use API with Ubuntu that enables Claude to interact with and automate desktop environments. It allows seamless command execution through VNC or noVNC, enhancing productivity with secure, containerized workflows in GitHub Codespaces.
agents ai anthropic claude computer-use github-codespaces-cde large-language-models ubuntu vnc-viewer
Last synced: 13 Apr 2025
https://github.com/e2b-dev/computer-use-app
A web playground for secure, open-source computer use. Powered by E2B.
ai computer-use llama3 open-source qwen
Last synced: 15 Apr 2025
https://github.com/AB498/computer-control-mcp
MCP server that provides computer control capabilities, like mouse, keyboard, OCR, etc. using PyAutoGUI, RapidOCR, ONNXRuntime. Similar to 'computer-use' by Anthropic. With Zero External Dependencies.
Last synced: 17 Jun 2025
https://github.com/cloudycotton/browser-operator
Build your own AI operators like OpenAI
ai anthropic browser browser-agent computer-use javascript nextjs nodejs open-operator openai playwright typescript
Last synced: 22 Mar 2025
https://github.com/lx-0/computer-use-nodejs-demo
🤖 LLM-powered computer control through local and Docker environments. Features VNC integration, automated interactions, and a chat interface for natural language system control.
ai computer-use docker function-calling llm
Last synced: 16 Aug 2025
https://github.com/auto-browse/auto-browse-ts
Auto-Browse: AI Enabled Browser Automation
ai ai-test-generator ai-testing-tool auto-browser automation browser-agent browser-automation browser-use computer-use langchain llm mcp openai playwright test-automation testing
Last synced: 07 May 2025
https://github.com/iris-networks/gpt-agent
Fully self-hosted ChatGPT agent alternative
browser-use chatgpt-agent computer-use cua
Last synced: 26 Jul 2025
https://github.com/haltakov/browsafex
Web interface for the Gemini 2.5 Computer Use model
agent ai computer-use computer-use-agent
Last synced: 16 Jan 2026
https://github.com/nicholasoxford/computer-use-mac-demo
Anthropic's computer use controlling a MacBook
Last synced: 08 May 2025
https://github.com/anonymitaet/gacua_preview
The World's First Out-of-the-Box Computer Use Agent Powered by Gemini-CLI @openmule
agent ai computer-use gacua mcp
Last synced: 10 Oct 2025
https://github.com/pnmartinez/computer-use-android-app
🎤📱 Control your desktop PC with voice from an Android app! This is an Android client for Simple Computer Use. Install Simple Computer Use from the link below.
automation computer-use ocr ollama voice
Last synced: 30 Aug 2025
https://github.com/presidio-oss/factif-ai
AI-powered computer control for automated testing. FactifAI uses vision models (Claude, GPT-4o, Gemini) to interact with applications naturally - clicking, typing, and verifying results just like a human would.
anthropic automated-testing automation bedrock claude computer-use docker-vnc factif-ai gpt-4o hai human-ai omniparser puppeteer testing
Last synced: 26 Sep 2025
https://github.com/nottelabs/open-operator-evals
Open-source benchmark evaluating the performance of web operators/agents
ai-agents ai-tools browser-automation browser-use computer-use cua llm notte web-agent
Last synced: 24 Dec 2025
https://github.com/vcaesar/robotgo-pro
RobotGo-Pro: multi-language, native, cross-platform RPA, GUI automation, auto testing, and computer use
ai auto-test automation computer-use javascript js lua opencv python robot rpa
Last synced: 13 Jan 2026
https://github.com/rajaniraiyn/ccu
Anthropic's Computer Use tools within VSCode
ai anthropic anthropic-claude claude computer-use llm vscode vscodeextension
Last synced: 27 Mar 2025
https://github.com/agent-sandbox/agent-sandbox
Agent-sandbox is an enterprise-grade, AI-first, cloud-native runtime environment for AI agents. It lets agents securely run untrusted LLM-generated code, browser use, computer use, shell commands, etc., with stateful, long-running, multi-session, and multi-tenant support.
agent agent-sandbox ai-infra ai-sandbox browser-use code-executor computer-use container mcp sandbox
Last synced: 13 Jan 2026
https://github.com/samestrin/chromium-screenshots
Vision AI "Cortex" for Agents. A Playwright-based MCP Server & API that captures screenshots with ground-truth DOM extraction and full auth state injection. Containerized.
ai-agents automation computer-use docker-image dom-extraction headless-chrome llm-tools mcp-server ocr playwright-python python-fastapi scraping screenshot-api vision-ai zero-drift
Last synced: 13 Jan 2026
https://github.com/phact/agentsitter
A babysitter for your AI agents
agents ai browser-use computer-use
Last synced: 29 Jul 2025
https://github.com/ibz-04/raya
Computer use agent
ai-agent ai-agents computer-use computer-use-agent desktop-automation llm-agent open-ai open-source python windows-automation
Last synced: 14 Oct 2025
https://github.com/mubashir1osmani/m4
Build custom ASICs and FPGAs using LLMs.
ai chip-design computer-use gpu hardware-designs llm
Last synced: 22 Aug 2025
https://github.com/osmandkitay/odk-shell
ODK: An open-source AI shell to control your computer with natural language.
ai computer-use local-models python rust tauri
Last synced: 21 Aug 2025
https://github.com/mihonarium/food_ordering_agent
Use an LLM agent to automate ordering food and other items from Deliveroo, Uber Eats, DoorDash, etc.
agent-based agents amazon api assistant assistant-chat-bots assistants-api computer-use computer-using-agent computeruse deliveroo doordash home-assistant llm llm-agents ubereats
Last synced: 29 Jul 2025