{"id":20156244,"url":"https://github.com/web-infra-dev/midscene","last_synced_at":"2026-04-08T04:02:46.699Z","repository":{"id":251833002,"uuid":"832452447","full_name":"web-infra-dev/midscene","owner":"web-infra-dev","description":"AI-powered, vision-driven UI automation for every platform.","archived":false,"fork":false,"pushed_at":"2026-04-02T12:07:18.000Z","size":437487,"stargazers_count":12488,"open_issues_count":87,"forks_count":925,"subscribers_count":66,"default_branch":"main","last_synced_at":"2026-04-03T01:15:27.047Z","etag":null,"topics":["ai","ai-test","browser-use","computer-use","gpt-operator","javascript","phone-use","testing"],"latest_commit_sha":null,"homepage":"https://midscenejs.com","language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/web-infra-dev.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":"AGENTS.md","dco":null,"cla":null}},"created_at":"2024-07-23T04:03:50.000Z","updated_at":"2026-04-02T21:19:04.000Z","dependencies_parsed_at":"2024-09-18T11:17:29.668Z","dependency_job_id":"aa86cd47-9218-40ea-9d34-4148f6d0a293","html_url":"https://github.com/web-infra-dev/midscene","commit_stats":{"total_commits":196,"total_committers":16,"mean_commits":12.25,"dds":0.6071428571428572,"last_synced_commit":"1e76905b5f57b0230f760090f7b59ef45a7d7bd7"},"previous_names":["web-infra-dev/midscene"],"tags_count":206,"template":false,"template_full_name":null,"purl":"pkg:github/web-infra-dev/midscene","repository_url":"https://repos.ecosyste.ms
/api/v1/hosts/GitHub/repositories/web-infra-dev%2Fmidscene","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/web-infra-dev%2Fmidscene/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/web-infra-dev%2Fmidscene/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/web-infra-dev%2Fmidscene/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/web-infra-dev","download_url":"https://codeload.github.com/web-infra-dev/midscene/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/web-infra-dev%2Fmidscene/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31539230,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-07T16:28:08.000Z","status":"online","status_checked_at":"2026-04-08T02:00:06.127Z","response_time":54,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","ai-test","browser-use","computer-use","gpt-operator","javascript","phone-use","testing"],"created_at":"2024-11-13T23:38:09.675Z","updated_at":"2026-04-08T04:02:46.658Z","avatar_url":"https://github.com/web-infra-dev.png","language":"TypeScript","readme":"\u003cp align=\"center\"\u003e\n  \u003cimg alt=\"Midscene.js\"  width=\"260\" src=\"https://github.com/user-attachments/assets/f60de3c1-dd6f-4213-97a1-85bf7c6e79e4\"\u003e\n\u003c/p\u003e\n\n\u003ch1 
align=\"center\"\u003eMidscene.js\u003c/h1\u003e\n\u003cdiv align=\"center\"\u003e\n\nEnglish | [简体中文](./README.zh.md)\n\n\u003cstrong\u003eOfficial Website\u003c/strong\u003e: \u003ca href=\"https://midscenejs.com/\"\u003ehttps://midscenejs.com/\u003c/a\u003e\n\n\u003ca href=\"https://trendshift.io/repositories/12524\" target=\"_blank\"\u003e\u003cimg src=\"https://trendshift.io/api/badge/repositories/12524\" alt=\"web-infra-dev%2Fmidscene | Trendshift\" style=\"width: 250px; height: 55px;\" width=\"250\" height=\"55\"/\u003e\u003c/a\u003e\n\n\u003c/div\u003e\n\n\u003cp align=\"center\"\u003e\n  AI-powered, vision-driven UI automation for every platform.\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003ca href=\"https://www.npmjs.com/package/@midscene/web\"\u003e\u003cimg src=\"https://img.shields.io/npm/v/@midscene/web?style=flat-square\u0026color=00a8f0\" alt=\"npm version\" /\u003e\u003c/a\u003e\n  \u003ca href=\"https://huggingface.co/ByteDance-Seed/UI-TARS-1.5-7B\"\u003e\u003cimg src=\"https://img.shields.io/badge/UI%20TARS%20Models-yellow\" alt=\"hugging face model\" /\u003e\u003c/a\u003e\n  \u003ca href=\"https://npm-compare.com/@midscene/web/#timeRange=THREE_YEARS\"\u003e\u003cimg src=\"https://img.shields.io/npm/dm/@midscene/web.svg?style=flat-square\u0026color=00a8f0\" alt=\"downloads\" /\u003e\u003c/a\u003e\n  \u003ca href=\"https://github.com/web-infra-dev/midscene/blob/main/LICENSE\"\u003e\u003cimg src=\"https://img.shields.io/badge/License-MIT-blue.svg?style=flat-square\u0026color=00a8f0\" alt=\"License\" /\u003e\n  \u003ca href=\"https://discord.gg/2JyBHxszE4\"\u003e\u003cimg src=\"https://img.shields.io/discord/1328277792730779648?style=flat-square\u0026color=7289DA\u0026label=Discord\u0026logo=discord\u0026logoColor=white\" alt=\"discord\" /\u003e\u003c/a\u003e\n  \u003ca href=\"https://x.com/midscene_ai\"\u003e\u003cimg src=\"https://img.shields.io/twitter/follow/midscene_ai?style=flat-square\" alt=\"twitter\" /\u003e\u003c/a\u003e\n  
\u003ca href=\"https://deepwiki.com/web-infra-dev/midscene\"\u003e\n    \u003cimg alt=\"Ask DeepWiki.com\" src=\"https://devin.ai/assets/deepwiki-badge.png\" style=\"height: 18px; vertical-align: middle;\" /\u003e\n  \u003c/a\u003e\n\u003c/p\u003e\n\n## 📣 Midscene Skills is here!\n\nUse [Midscene Skills](https://github.com/web-infra-dev/midscene-skills) to control any platform with [OpenClaw](https://github.com/OpenClaw/OpenClaw) \n\n## Showcases\n\n* [Web Automation - Automatically register the GitHub form in a web browser and pass all field validations](https://midscenejs.com/showcases#web)\n* [iOS Automation - Meituan coffee order](https://midscenejs.com/showcases#ios)\n* [iOS Automation - Auto-like the first @midscene_ai tweet](https://midscenejs.com/showcases#ios)\n* [Android Automation - DCar: Xiaomi SU7 specs](https://midscenejs.com/showcases#android)\n* [Android Automation - Booking a hotel for Christmas](https://midscenejs.com/showcases#android)\n* [MCP Integration - Midscene MCP UI prepatch release](https://midscenejs.com/showcases#mcp)\n* [robotic arm + vision + voice for in-vehicle testing](https://midscenejs.com/showcases#community-showcases)\n\n## 💡 Features\n\n### Write Automation with Natural Language\n- Describe your goals and steps, and Midscene will plan and operate the user interface for you.\n- Use Javascript SDK or YAML to write your automation script.\n\n### Web \u0026 Mobile App \u0026 Any Interface\n- **Web Automation**: Either integrate with [Puppeteer](https://midscenejs.com/integrate-with-puppeteer), [Playwright](https://midscenejs.com/integrate-with-playwright) or use [Bridge Mode](https://midscenejs.com/bridge-mode) to control your desktop browser.\n- **Android Automation**: Use [Javascript SDK](https://midscenejs.com/android-getting-started) with adb to control your local Android device.\n- **iOS Automation**: Use [Javascript SDK](https://midscenejs.com/ios-getting-started) with WebDriverAgent to control your local iOS devices and 
simulators.\n- **Any Interface Automation**: Use [Javascript SDK](https://midscenejs.com/integrate-with-any-interface) to control your own interface.\n\n### For Developers\n- **Three kinds of APIs**:\n  - [Interaction API](https://midscenejs.com/api#interaction-methods): interact with the user interface.\n  - [Data Extraction API](https://midscenejs.com/api#data-extraction): extract data from the user interface and dom.\n  - [Utility API](https://midscenejs.com/api#more-apis): utility functions like `aiAssert()`, `aiLocate()`, `aiWaitFor()`.\n- **MCP**: Midscene provides MCP services that expose atomic Midscene Agent actions as MCP tools so upper-layer agents can inspect and operate UIs with natural language. [Docs](https://midscenejs.com/mcp)\n- [**Caching for Efficiency**](https://midscenejs.com/caching): Replay your script with cache and get the result faster.\n- **Debugging Experience**: Midscene.js offers a visualized replay back report file, a built-in playground, and a Chrome Extension to simplify the debugging process. These are the tools most developers truly need.\n\n\n## 👉 Zero-code Quick Experience\n\n- **[Chrome Extension](https://midscenejs.com/quick-experience)**: Start in-browser experience immediately through [the Chrome Extension](https://midscenejs.com/quick-experience), without writing any code.\n- **[Android Playground](https://midscenejs.com/android-getting-started)**: There is also a built-in Android playground to control your local Android device.\n- **[iOS Playground](https://midscenejs.com/ios-getting-started)**: There is also a built-in iOS playground to control your local iOS device.\n\n## ✨ Driven by Visual Language Model\n\nMidscene.js is all-in on the pure-vision route for UI actions: element localization and interactions are based on screenshots only. It supports visual-language models like `Qwen3-VL`, `Doubao-1.6-vision`, `gemini-3-pro`, and `UI-TARS`. 
For data extraction and page understanding, you can still opt in to include DOM when needed.\n\n* Pure-vision localization for UI actions; the DOM extraction mode is removed.\n* Works across web, mobile, desktop, and even `\u003ccanvas\u003e` surfaces.\n* Far fewer tokens by skipping DOM for actions, which cuts cost and speeds up runs.\n* DOM can still be included for data extraction and page understanding when needed.\n* Strong open-source options for self-hosting.\n\nRead more about [Model Strategy](https://midscenejs.com/model-strategy)\n\n\n\n## 📄 Resources \n\n* Official Website: [https://midscenejs.com](https://midscenejs.com/)\n* Documentation: [https://midscenejs.com](https://midscenejs.com/)\n* Sample Projects: [https://github.com/web-infra-dev/midscene-example](https://github.com/web-infra-dev/midscene-example)\n* API Reference: [https://midscenejs.com/api](https://midscenejs.com/api)\n* GitHub: [https://github.com/web-infra-dev/midscene](https://github.com/web-infra-dev/midscene)\n\n## 🤝 Community\n\n* [Discord](https://discord.gg/2JyBHxszE4)\n* [Follow us on X](https://x.com/midscene_ai)\n* [Lark Group(飞书交流群)](https://applink.larkoffice.com/client/chat/chatter/add_by_link?link_token=693v0991-a6bb-4b44-b2e1-365ca0d199ba)\n\n## 🌟 Awesome Midscene\n\nCommunity projects that extend Midscene.js capabilities:\n\n* [midscene-ios](https://github.com/lhuanyu/midscene-ios) - iOS Mirror automation support for Midscene\n* [midscene-pc](https://github.com/Mofangbao/midscene-pc) - PC operation device for Windows, macOS, and Linux\n* [midscene-pc-docker](https://github.com/Mofangbao/midscene-pc-docker) - Docker image with Midscene-PC server pre-installed\n* [Midscene-Python](https://github.com/Python51888/Midscene-Python) - Python SDK for Midscene automation\n* [midscene-java](https://github.com/Master-Frank/midscene-java) by @Master-Frank - Java SDK for Midscene automation\n* [midscene-java](https://github.com/alstafeev/midscene-java) by @alstafeev - Java SDK for 
Midscene automation\n\n\n## 📝 Credits\n\nWe would like to thank the following projects:\n\n- [Rsbuild](https://github.com/web-infra-dev/rsbuild) and [Rslib](https://github.com/web-infra-dev/rslib) for the build tools.\n- [UI-TARS](https://github.com/bytedance/ui-tars) for the open-source agent model UI-TARS.\n- [Qwen-VL](https://github.com/QwenLM/Qwen-VL) for the open-source VL model Qwen-VL.\n- [scrcpy](https://github.com/Genymobile/scrcpy) and [yume-chan](https://github.com/yume-chan) for letting us control Android devices from the browser.\n- [appium-adb](https://github.com/appium/appium-adb) for the JavaScript bridge to adb.\n- [appium-webdriveragent](https://github.com/appium/WebDriverAgent) for driving XCTest from JavaScript.\n- [YADB](https://github.com/ysbing/YADB) for the yadb tool which improves the performance of text input.\n- [libnut-core](https://github.com/nut-tree/libnut-core) for the cross-platform native keyboard and mouse control.\n- [Puppeteer](https://github.com/puppeteer/puppeteer) for browser automation and control.\n- [Playwright](https://github.com/microsoft/playwright) for browser automation, control, and testing.\n\n## 📖 Citation\n\nIf you use Midscene.js in your research or project, please cite:\n\n```bibtex\n@software{Midscene.js,\n  author = {Xiao Zhou, Tao Yu, YiBing Lin},\n  title = {Midscene.js: Your AI Operator for Web, Android, iOS, Automation \u0026 Testing.},\n  year = {2025},\n  publisher = {GitHub},\n  url = {https://github.com/web-infra-dev/midscene}\n}\n```\n\n## ✨ Star History\n\n[![Star History Chart](https://api.star-history.com/svg?repos=web-infra-dev/midscene\u0026type=Date)](https://www.star-history.com/#web-infra-dev/midscene\u0026Date)\n\n\n## 📝 License\n\nMidscene.js is [MIT licensed](https://github.com/web-infra-dev/midscene/blob/main/LICENSE).\n\n---\n\n\u003cdiv align=\"center\"\u003e\n  If this project helps you or inspires you, please give us a 
star\n\u003c/div\u003e\n","funding_links":[],"categories":["A01_文本生成_文本对话","TypeScript","HarmonyOS","Repos","\u003ca name=\"TypeScript\"\u003e\u003c/a\u003eTypeScript","Uncategorized","HTML"],"sub_categories":["大语言对话模型及数据","Windows Manager","Uncategorized"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fweb-infra-dev%2Fmidscene","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fweb-infra-dev%2Fmidscene","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fweb-infra-dev%2Fmidscene/lists"}