{"id":21633888,"url":"https://github.com/philfung/awesome-computer-use","last_synced_at":"2026-01-27T23:51:31.469Z","repository":{"id":260517112,"uuid":"881515035","full_name":"philfung/awesome-computer-use","owner":"philfung","description":"Curated resources about automated GUI computer-use via LLMs.  Highly opinionated, focus is on quality vs quantity.","archived":false,"fork":false,"pushed_at":"2024-11-19T21:36:05.000Z","size":25,"stargazers_count":19,"open_issues_count":1,"forks_count":2,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-03-16T04:01:43.188Z","etag":null,"topics":["anthropic","anthropic-claude","computer-use","computer-vision","gpt-4-vision","gui-agents","llm","rpa","rpa-robotic-process-automation","tool-use","vision"],"latest_commit_sha":null,"homepage":"","language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/philfung.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-10-31T18:17:22.000Z","updated_at":"2025-03-09T16:47:14.000Z","dependencies_parsed_at":null,"dependency_job_id":"795af250-a913-4a46-82de-01afe807b583","html_url":"https://github.com/philfung/awesome-computer-use","commit_stats":null,"previous_names":["philfung/awesome-computer-use"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/philfung%2Fawesome-computer-use","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/philfung%2Fawesome-computer-use/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositor
ies/philfung%2Fawesome-computer-use/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/philfung%2Fawesome-computer-use/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/philfung","download_url":"https://codeload.github.com/philfung/awesome-computer-use/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":244320265,"owners_count":20434088,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["anthropic","anthropic-claude","computer-use","computer-vision","gpt-4-vision","gui-agents","llm","rpa","rpa-robotic-process-automation","tool-use","vision"],"created_at":"2024-11-25T03:14:26.647Z","updated_at":"2026-01-27T23:51:31.464Z","avatar_url":"https://github.com/philfung.png","language":null,"funding_links":[],"categories":[],"sub_categories":[],"readme":"# \u003cimg src=\"https://github.com/user-attachments/assets/d167261f-b73f-42af-bddf-0ec2ed1b21d9\" height=\"150\"/\u003e Awesome Computer Use \nCurated list of papers + libraries related to computer GUI use via LLMs.\\\nHighly opinionated, focus on quality vs quantity.\n\n## Demos\n* Try [computer use on your Mac](https://github.com/philfung/computer-use) in one click.\n\n## Frameworks\n* [Openwork](https://github.com/accomplish-ai/openwork) - MIT-licensed, open alternative to Anthropic's Cowork with multi-LLM support for browser automation.\n  \n## Papers\n* [WebRL: Training LLM Web Agents via Self-Evolving Online Curriculum Reinforcement Learning](https://arxiv.org/html/2411.02337v1) (*Tsinghua U*) (11/24)\n* 
[Anthropic Claude Computer Use API](https://docs.anthropic.com/en/docs/build-with-claude/computer-use) (*Anthropic*) (10/24)\n* [OmniParser for Pure Vision Based GUI Agent](https://microsoft.github.io/OmniParser/) ([code](https://github.com/microsoft/OmniParser)) (*Microsoft*) (08/24)\n* [ECLAIR: Enterprise sCaLe AI for woRkflows](https://hazyresearch.stanford.edu/blog/2024-05-18-eclair) ([code](https://github.com/HazyResearch/eclair-agents)) (*Stanford U*) (05/24)\n* [OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments](https://os-world.github.io/) ([code](https://github.com/xlang-ai/OSWorld)) (*HKU*) (05/24)\n* [Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs](https://arxiv.org/abs/2404.05719) ([code](https://github.com/apple/ml-ferret/tree/main/ferretui)) (*Apple*) (04/24)\n* [SeeAct: GPT-4V(ision) is a Generalist Web Agent, if Grounded](https://osu-nlp-group.github.io/SeeAct/) ([code](https://github.com/OSU-NLP-Group/SeeAct)) (*OSU*) (01/24)\n* [CogAgent: A Visual Language Model for GUI Agents](https://github.com/THUDM/CogVLM2) (*Zhipu*) (12/23)\n* [AppAgent: Multimodal Agents as Smartphone Users](https://appagent-official.github.io/) ([code](https://github.com/mnotgod96/AppAgent)) (*Tencent*) (12/23)\n* [SoM: Set-of-Mark Prompting Unleashes Extraordinary Visual Grounding in GPT-4V](https://som-gpt4v.github.io/) ([code](https://github.com/microsoft/SoM)) (*Microsoft*) (10/23)\n\n## Talks\n* [LLMs as Computer Users: An Overview](https://www.figma.com/deck/rsWK4sRl0dOahG59bfMhql)\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fphilfung%2Fawesome-computer-use","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fphilfung%2Fawesome-computer-use","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fphilfung%2Fawesome-computer-use/lists"}