An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with computer-use

A curated list of projects in awesome lists tagged with computer-use.

https://github.com/bytedance/UI-TARS-desktop

The Open-Source Multimodal AI Agent Stack: Connecting Cutting-Edge AI Models and Agent Infra

agent agent-tars browser-use computer-use gui-agent gui-operator mcp mcp-server multimodal tars ui-tars vision vlm

Last synced: 06 Oct 2025

https://github.com/bytedance/ui-tars-desktop

A GUI Agent application based on UI-TARS (Vision-Language Model) that allows you to control your computer using natural language.

agent browser-use computer-use electron gui-agents mcp mcp-server vision vite vlm

Last synced: 09 Sep 2025

https://github.com/trycua/cua

Open-source infrastructure for Computer-Use Agents. Sandboxes, SDKs, and benchmarks to train and evaluate AI agents that can control full desktops (macOS, Linux, Windows).

agent ai-agent apple computer-use computer-use-agent containerization cua desktop-automation hacktoberfest lume macos manus operator swift virtualization virtualization-framework windows windows-sandbox

Last synced: 11 Feb 2026

https://github.com/web-infra-dev/midscene

Drive UI automation on all platforms with a vision-based model

ai ai-test browser-use computer-use gpt-operator javascript phone-use testing

Last synced: 09 Feb 2026

https://github.com/upsonic/upsonic

The most reliable AI agent framework that supports MCP.

agent agent-framework claude computer-use llms mcp model-context-protocol openai rag reliability

Last synced: 18 Jan 2026

https://github.com/nanobrowser/nanobrowser

Open-Source Chrome extension for AI-powered web automation. Run multi-agent workflows using your own LLM API key. Alternative to OpenAI Operator.

agent ai ai-agents ai-tools automation browser-extension browser-use chrome-extension computer-use gpt-operator javascript manus multi-agent openai opensource operator web-agent web-automation

Last synced: 13 May 2025

https://github.com/a9t9/rpa

Ui.Vision Open-Source RPA Software with Computer Vision, OCR, Anthropic Computer Use/LLM. Selenium IDE import/export.

anthropic anthropic-claude browser-automation browser-extension computer-use data-driven-tests imacros selenium-ide web-automation web-scraping

Last synced: 16 May 2025

https://github.com/openadaptai/openadapt

Open Source Generative Process Automation (i.e. Generative RPA). AI-First Process Automation with Large ([Language (LLMs) / Action (LAMs) / Multimodal (LMMs)] / Visual Language (VLMs)) Models

agents ai-agents ai-agents-framework anthropic computer-use generative-process-automation google-gemini gpt4o huggingface large-action-model large-language-models large-multimodal-models omniparser openai process-automation process-mining python segment-anything transformers ultralytics

Last synced: 18 Jan 2026

https://github.com/showlab/showui

[CVPR 2025] Open-source, End-to-end, Vision-Language-Action model for GUI Agent & Computer Use.

agent computer-use gui-agent vision-language-action vision-language-model

Last synced: 13 Sep 2025

https://github.com/OpenAdaptAI/OpenAdapt

Open Source Generative Process Automation (i.e. Generative RPA). AI-First Process Automation with Large ([Language (LLMs) / Action (LAMs) / Multimodal (LMMs)] / Visual Language (VLMs)) Models

agents ai-agents ai-agents-framework anthropic computer-use generative-process-automation google-gemini gpt4o huggingface large-action-model large-language-models large-multimodal-models omniparser openai process-automation process-mining python segment-anything transformers ultralytics

Last synced: 05 Apr 2025

https://github.com/thudm/cogagent

An open-source, end-to-end, VLM-based GUI Agent

agent computer-use glm gui-agent vlm

Last synced: 15 May 2025

https://github.com/showlab/ShowUI

Open-source, End-to-end, Vision-Language-Action model for GUI Agent & Computer Use.

agent computer-use gui-agent vision-language-action vision-language-model

Last synced: 02 Oct 2025

https://github.com/microsoft/windowsagentarena

Windows Agent Arena (WAA) 🪟 is a scalable OS platform for testing and benchmarking of multi-modal AI agents.

agentic ai ai-agent ai-benchmark ai-research computer computer-use desktop-agent windows

Last synced: 15 May 2025

https://github.com/e2b-dev/open-computer-use

Secure AI computer use powered by E2B Desktop Sandbox

agent ai anthropic claude computer-use llm

Last synced: 18 Jun 2025

https://microsoft.github.io/WindowsAgentArena/

Windows Agent Arena (WAA) 🪟 is a scalable OS platform for testing and benchmarking of multi-modal AI agents.

agentic ai ai-agent ai-benchmark ai-research computer computer-use desktop-agent windows

Last synced: 23 Feb 2025

https://github.com/microsoft/WindowsAgentArena

Windows Agent Arena (WAA) 🪟 is a scalable OS platform for testing and benchmarking of multi-modal AI agents.

agentic ai ai-agent ai-benchmark ai-research computer computer-use desktop-agent windows

Last synced: 11 Sep 2025

https://github.com/bandarlabs/clickclickclick

A framework to enable autonomous Android and computer use with any LLM (local or remote)

agents ai-agents-framework android-automation antrophic computer-use framework gemini generative-ai molmo ollama openai operator python

Last synced: 16 May 2025

https://github.com/inclusionAI/AWorld

Build, evaluate and run General Multi-Agent Assistance with ease

agent-swarm agentic-ai computer-use gym-environment mcp mcp-server phone-use world-model

Last synced: 02 May 2025

https://github.com/baryhuang/mcp-remote-macos-use

The only general AI agent that does NOT require an extra API key, giving you full control of your local and remote macOS machines from the Claude Desktop app

claude-desktop computer-use general-agent macos macos-use mcp-server

Last synced: 12 Jan 2026

https://github.com/inclusionai/aworld

Build, evaluate and run General Multi-Agent Assistance with ease

agent-swarm agentic-ai computer-use gym-environment mcp mcp-server phone-use world-model

Last synced: 03 Feb 2026

https://github.com/jeffrey-zang/opus

for when your fingers are greasy 🪄

computer-use electron opus react tailwind

Last synced: 30 Jun 2025

https://github.com/open-compass/mmbench-gui

Official repo of "MMBench-GUI: Hierarchical Multi-Platform Evaluation Framework for GUI Agents". It can be used to evaluate a GUI agent in a hierarchical manner across multiple platforms, including Windows, Linux, macOS, iOS, Android and Web.

benchmark-framework computer-use gui-agent vision-language-model

Last synced: 15 Sep 2025

https://github.com/bytebot-ai/bytebot

A containerized framework for computer use agents with a virtual desktop environment.

ai-agents anthropic computer-use docker llm openai qemu

Last synced: 01 Apr 2025

https://github.com/lvqq/intelli-browser

✨ Use natural language to control your browser, powered by LLM and playwright

anthropic claude claude-3-5-sonnet computer-use e2e e2e-tests playwright

Last synced: 12 Nov 2025

https://github.com/cyberdesk-hq/cyberdesk

Open source virtual desktops for AI agents

ai-agents computer-use fastapi hono kubernetes nextjs terraform virtual-machine

Last synced: 10 Sep 2025

https://github.com/browser-use/contact-use

✉️ Use the power of browser-use to contact any person or organization... by any means necessary

ai browser-agent browser-use computer-use contact-info-scraper operator sales sales-automation scraping scraping-bot

Last synced: 04 Oct 2025

https://github.com/pnmartinez/simple-computer-use

Open-source implementation of computer use, using lightweight OCR models and LLMs. Get the Android app at the link below.

automation computer-use ocr ollama

Last synced: 21 Jun 2025

https://github.com/SALT-NLP/PopupAttack

Code repo for the paper: Attacking Vision-Language Computer Agents via Pop-ups

attack claude-3-5-sonnet computer-use llm-agent pop-up vision-language-model

Last synced: 23 Feb 2025

https://github.com/philfung/awesome-computer-use

Curated resources about automated GUI computer-use via LLMs. Highly opinionated, focus is on quality vs quantity.

anthropic anthropic-claude computer-use computer-vision gpt-4-vision gui-agents llm rpa rpa-robotic-process-automation tool-use vision

Last synced: 27 Jan 2026

https://github.com/iris-networks/iris

This is the crud backend for our QA test application

ai automation computer-use qa-automation-test

Last synced: 10 Oct 2025

https://github.com/sawyerhood/computer-use-extension

This is OpenAI's computer use hooked up to a Chrome extension.

ai chrome-extension computer-use llm openai

Last synced: 16 Jun 2025

https://github.com/zubax/bro

An LLM computer-using agent (CUA) designed to autonomously perform mundane tasks related to business operations and administration, such as doing accounting, filing paperwork, and submitting applications. The accountant is not your bro, but Bro is.

agent agentic-ai automation computer-use computer-use-agent llm no-code-automation nocode office-automation

Last synced: 10 Oct 2025

https://github.com/ab498/computer-control-mcp

MCP server that provides computer control capabilities, like mouse, keyboard, OCR, etc. using PyAutoGUI, RapidOCR, ONNXRuntime. Similar to 'computer-use' by Anthropic. With Zero External Dependencies.

automation computer-use mcp

Last synced: 29 Jun 2025

https://github.com/webhiveos/webhive

Meet WebHive, the AI-powered browser that takes care of tasks for you. No more endless clicks: tell it what you need, and it gets it done.

agent agent-framework assistant chagpt chatgpt-app chatgpt-operator claude computer-use gca gpt gpt-4o langchain llms mcp model-context-protocol openai

Last synced: 13 Apr 2025

https://github.com/archivebox/abx-spec-behaviors

🧩 Proposal to allow user scripts like "expand comments", "hide popups", "fill out this form", etc. to be reusable across pure browser environments, puppeteer, playwright, extensions, AI tools, and many other contexts with minimal adjustment.

abx archivebox automation behaviors browser browsertrix-behaviors claude computer-use crawling digipres ecosystem greasemonkey playwright plugins puppeteer rfp scraping specification tampermonkey tool-use

Last synced: 01 Sep 2025

https://github.com/justmalhar/claude-ubuntu-os

Claude Computer Use API with Ubuntu that enables Claude to interact with and automate desktop environments. It allows seamless command execution through VNC or noVNC, enhancing productivity with secure, containerized workflows on GitHub Codespaces.

agents ai anthropic claude computer-use github-codespaces-cde large-language-models ubuntu vnc-viewer

Last synced: 13 Apr 2025

https://github.com/e2b-dev/computer-use-app

A web playground for secure, open-source computer use. Powered by E2B.

ai computer-use llama3 open-source qwen

Last synced: 15 Apr 2025

https://github.com/AB498/computer-control-mcp

MCP server that provides computer control capabilities, like mouse, keyboard, OCR, etc. using PyAutoGUI, RapidOCR, ONNXRuntime. Similar to 'computer-use' by Anthropic. With Zero External Dependencies.

automation computer-use mcp

Last synced: 17 Jun 2025

https://github.com/lx-0/computer-use-nodejs-demo

🤖 LLM-powered computer control through local and Docker environments. Features VNC integration, automated interactions, and a chat interface for natural language system control.

ai computer-use docker function-calling llm

Last synced: 16 Aug 2025

https://github.com/iris-networks/gpt-agent

Fully self-hosted ChatGPT agent alternative

browser-use chatgpt-agent computer-use cua

Last synced: 26 Jul 2025

https://github.com/haltakov/browsafex

Web interface for the Gemini 2.5 Computer Use model

agent ai computer-use computer-use-agent

Last synced: 16 Jan 2026

https://github.com/nicholasoxford/computer-use-mac-demo

Anthropic's computer use controlling a MacBook

anthropic claude computer-use

Last synced: 08 May 2025

https://github.com/anonymitaet/gacua_preview

The World's First Out-of-the-Box Computer Use Agent Powered by Gemini-CLI @openmule

agent ai computer-use gacua mcp

Last synced: 10 Oct 2025

https://github.com/pnmartinez/computer-use-android-app

🎤📱 Control your desktop PC by voice from an Android app! This is an Android client for Simple Computer Use. Install Simple Computer Use from the link below.

automation computer-use ocr ollama voice

Last synced: 30 Aug 2025

https://github.com/presidio-oss/factif-ai

AI-powered computer control for automated testing. FactifAI uses vision models (Claude, GPT-4o, Gemini) to interact with applications naturally - clicking, typing, and verifying results just like a human would.

anthropic automated-testing automation bedrock claude computer-use docker-vnc factif-ai gpt-4o hai human-ai omniparser puppeteer testing

Last synced: 26 Sep 2025

https://github.com/nottelabs/open-operator-evals

Open-source benchmark evaluating the performance of web operators/agents

ai-agents ai-tools browser-automation browser-use computer-use cua llm notte web-agent

Last synced: 24 Dec 2025

https://github.com/vcaesar/robotgo-pro

RobotGo-Pro: multi-language, native, cross-platform RPA, GUI automation, automated testing, and computer use

ai auto-test automation computer-use javascript js lua opencv python robot rpa

Last synced: 13 Jan 2026

https://github.com/rajaniraiyn/ccu

Anthropic's Computer Use tools within VSCode

ai anthropic anthropic-claude claude computer-use llm vscode vscodeextension

Last synced: 27 Mar 2025

https://github.com/agent-sandbox/agent-sandbox

Agent-sandbox is an enterprise-grade, AI-first, cloud-native runtime environment for AI agents. It allows agents to securely run untrusted LLM-generated code, browser use, computer use, shell commands, etc., with support for stateful, long-running, multi-session, and multi-tenant operation.

agent agent-sandbox ai-infra ai-sandbox browser-use code-executor computer-use container mcp sandbox

Last synced: 13 Jan 2026

https://github.com/samestrin/chromium-screenshots

Vision AI "Cortex" for Agents. A Playwright-based MCP Server & API that captures screenshots with ground-truth DOM extraction and full auth state injection. Containerized.

ai-agents automation computer-use docker-image dom-extraction headless-chrome llm-tools mcp-server ocr playwright-python python-fastapi scraping screenshot-api vision-ai zero-drift

Last synced: 13 Jan 2026

https://github.com/phact/agentsitter

A babysitter for your AI agents

agents ai browser-use computer-use

Last synced: 29 Jul 2025

https://github.com/mubashir1osmani/m4

Build custom ASICs and FPGAs using LLMs.

ai chip-design computer-use gpu hardware-designs llm

Last synced: 22 Aug 2025

https://github.com/osmandkitay/odk-shell

ODK: An open-source AI shell to control your computer with natural language.

ai computer-use local-models python rust tauri

Last synced: 21 Aug 2025

https://github.com/mihonarium/food_ordering_agent

Use an LLM agent to automate ordering food and other items from Deliveroo, Uber Eats, DoorDash, etc.

agent-based agents amazon api assistant assistant-chat-bots assistants-api computer-use computer-using-agent computeruse deliveroo doordash home-assistant llm llm-agents ubereats

Last synced: 29 Jul 2025