{"id":47725903,"url":"https://github.com/aadya940/orbit","last_synced_at":"2026-04-23T03:04:44.972Z","repository":{"id":344538374,"uuid":"1181549470","full_name":"aadya940/orbit","owner":"aadya940","description":"Building Blocks to automate desktop workflows end-to-end using AI","archived":false,"fork":false,"pushed_at":"2026-04-17T01:47:06.000Z","size":8867,"stargazers_count":4,"open_issues_count":1,"forks_count":0,"subscribers_count":0,"default_branch":"master","last_synced_at":"2026-04-17T02:40:53.896Z","etag":null,"topics":["agentic-ai","ai","ai-agent","automation","browser-automation","computer-use","cua","desktop-automation","operating-system","python"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/aadya940.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-03-14T09:43:31.000Z","updated_at":"2026-04-17T01:46:43.000Z","dependencies_parsed_at":"2026-04-14T07:01:13.891Z","dependency_job_id":null,"html_url":"https://github.com/aadya940/orbit","commit_stats":null,"previous_names":["aadya940/orbit"],"tags_count":31,"template":false,"template_full_name":null,"purl":"pkg:github/aadya940/orbit","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aadya940%2Forbit","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aadya940%2Forbit/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aadya940%2Forb
it/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aadya940%2Forbit/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/aadya940","download_url":"https://codeload.github.com/aadya940/orbit/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aadya940%2Forbit/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32163853,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-23T02:19:40.750Z","status":"ssl_error","status_checked_at":"2026-04-23T02:17:55.737Z","response_time":53,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["agentic-ai","ai","ai-agent","automation","browser-automation","computer-use","cua","desktop-automation","operating-system","python"],"created_at":"2026-04-02T20:26:01.732Z","updated_at":"2026-04-23T03:04:44.933Z","avatar_url":"https://github.com/aadya940.png","language":"Python","readme":"\u003cp align=\"center\"\u003e\n  \u003cimg src=\"logo.png\" alt=\"Orbit logo\"\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003cstrong\u003eAutonomous agents are demos. 
Controlled agents are products.\u003c/strong\u003e\n\u003c/p\u003e\n\n---\n\n\u003cp align=\"center\"\u003e\n  \u003ca href=\"https://youtu.be/nll7Mmzwh00\"\u003e\n    \u003cimg src=\"demo_preview.svg\" width=\"720\" alt=\"Watch Orbit in action\"\u003e\n  \u003c/a\u003e\n\u003c/p\u003e\n\n---\n\n## The problem\n\nAI agents can use computers now.\n\nBut in practice:\n- they loop\n- they click the wrong thing\n- they get stuck on simple steps\n- they're impossible to steer mid-task\n\nMost frameworks either hide everything in a black box, or hand you raw tools with no structure.\n\nNeither works in production.\n\n\n## Orbit\n\nNatural language controls the screen.  \nPython controls the flow.\n\nInstead of one monolithic agent, Orbit breaks execution into **independent steps**:\n\n`Do` · `Read` · `Check` · `Navigate` · `Fill`\n\nEach step runs its own model, has its own budget, and returns typed output. All steps share context.\n\n\n## Why this matters\n\n- Use a cheap model for simple clicks, a powerful one for complex reasoning\n- Cap LLM calls per step, so nothing runs forever\n- Inject guidance mid-execution when the agent is struggling\n- Extract structured data directly into Pydantic models\n- Toggle `planner=False` for low-latency direct execution\n\nThis turns agents from **demos into usable systems**.\n\n\n## Key difference\n\nMost agents see pixels.\n\n**Orbit sees the UI.**\n\nIt reads the OS accessibility tree, takes screenshots only when needed, and needs no DOM hacks. 
Works across desktop apps and browsers with lower token usage.\n\n\n## Quickstart\n\n```bash\npip install orbit-cua\n```\n\n```python\nfrom dotenv import load_dotenv\nload_dotenv()\n\nfrom orbit import Agent\nimport asyncio\n\nasync def main():\n    result = await Agent(\n        task=\"Open Chrome and go to Wikipedia\",\n        llm=\"gemini-3-pro-preview\",\n        verbose=True,\n    ).run()\n    print(result.status)\n\nasyncio.run(main())\n```\n\nSet your API key (Orbit supports any model via [LiteLLM](https://docs.litellm.ai/)):\n\n```bash\nexport GEMINI_API_KEY=\"your-key\"   # or OPENAI_API_KEY / ANTHROPIC_API_KEY\n```\n\n\n## Composable SDK\n\nWhen you need precision, drop to the SDK:\n\n```python\nfrom dotenv import load_dotenv\nload_dotenv()\n\n\nfrom orbit import Do, Read, Check, Navigate, session\nfrom pydantic import BaseModel\nimport asyncio\n\nclass Product(BaseModel):\n    name: str\n    price: float\n    in_stock: bool\n\nclass ProductList(BaseModel):\n    products: list[Product]\n\nasync def main():\n    action_model = \"gemini-3-flash-preview\"\n\n    async with session() as s:\n        await Navigate(\n            \"https://www.amazon.com/s?k=mechanical+keyboard\",\n            session=s, llm=action_model, max_steps=30, planner=False,\n            extra_info=\"Avoid bookmark bar links; use direct navigation tools first.\",\n            verbose=True,\n        ).run()\n\n        if await Check(\n            \"The current page is a Captcha page and `Continue Shopping` button is visible\",\n            session=s, llm=action_model, max_steps=30, planner=False,\n        ).check():\n            await Do(\n                \"Click `Continue Shopping`, then solve the Captcha.\",\n                session=s, llm=action_model, max_steps=30,\n            ).run()\n\n        products = await Read(\n            \"All search results\",\n            schema=ProductList,\n            session=s, llm=action_model, max_steps=30, verbose=True,\n        ).run()\n\n       
 cheapest = min(products.output.products, key=lambda p: p.price)\n\n        await Do(f\"click on '{cheapest.name}'\", session=s, llm=action_model, max_steps=30).run()\n\n        if await Check(\"Add to Cart button is visible\", session=s, llm=action_model, max_steps=30).check():\n            await Do(\"click Add to Cart\", session=s, llm=action_model, max_steps=30).run()\n\nasyncio.run(main())\n```\n\n\n## The idea\n\nAgents shouldn't be one giant prompt.\n\nThey should be composable systems.\n\nOrbit gives you:\n- **verbs** instead of prompts\n- **steps** instead of guesswork\n- **control** instead of hope\n\n\n## Custom actions\n\nBuild reusable, domain-specific actions by subclassing `BaseActionAgent`:\n\n```python\nfrom dotenv import load_dotenv\nload_dotenv()\n\nfrom orbit import BaseActionAgent, Navigate, session\nfrom pydantic import BaseModel\nimport asyncio\n\nclass ProductList(BaseModel):\n    products: list[dict]\n\nclass ReadTopProducts(BaseActionAgent):\n    def __init__(self, category: str, **kw):\n        super().__init__(max_steps=12, planner=False, **kw)\n        self.category = category\n\n    def task_prompt(self) -\u003e str:\n        return (\n            f\"Read top products for '{self.category}' from the current page. \"\n            \"Extract name, price, and stock status only. 
Do not click or navigate.\"\n        )\n\n    def output_schema(self):\n        return ProductList\n\nasync def main():\n    async with session() as s:\n        await Navigate(\"https://www.amazon.com/s?k=mechanical+keyboard\", session=s).run()\n        result = await ReadTopProducts(\n            category=\"mechanical keyboard\",\n            session=s, llm=\"gemini-3-flash-preview\", verbose=True,\n        ).run()\n        print(result.output.products[:3])\n\nasyncio.run(main())\n```\n\n## Examples\n\nSee [`examples/`](examples/) for full end-to-end scripts, including a LinkedIn Easy Apply bot that applies to jobs autonomously.\n\n## Installation\n\n### Install from source\n\n\u003cdetails\u003e\n\u003csummary\u003eBuild from source (requires Rust)\u003c/summary\u003e\n\n```bash\ngit clone --recurse-submodules https://github.com/aadya940/orbit.git\ncd orbit\n\ncd oculos \u0026\u0026 cargo build --release \u0026\u0026 cd ..\nmkdir -p orbit/_bin\n\n# Linux/macOS\ncp oculos/target/release/oculos orbit/_bin/oculos\n\n# Windows\ncopy oculos\\target\\release\\oculos.exe orbit\\_bin\\oculos.exe\n\npip install .\n```\n\nAlso requires the Tk GUI toolkit for Python's `tkinter`. On Debian/Ubuntu:\n\n```bash\nsudo apt install python3-tk\n```\n\nmacOS users: grant accessibility permissions as described [here](https://github.com/huseyinstif/oculos?tab=readme-ov-file#macos-grant-accessibility-permission).\n\n\u003c/details\u003e\n\n\n## Support matrix\n\n| OS | Architectures |\n|---|---|\n| **Windows** | x86-64 (`win_amd64`) |\n| **Linux** | x86-64 (`manylinux`) |\n| **macOS** | Intel + Apple Silicon (`universal2`) |\n\n| Python | 3.10 · 3.11 · 3.12 · 3.13 |\n|---|---|\n\n\n## Safety\n\nNo permanent file deletion: destructive operations go to Trash/Recycle Bin. 
Disk writes require explicit human approval via a configurable callback.\n\n\n## License\n\nApache 2.0. Special thanks to [OculOS](https://github.com/huseyinstif/oculos) and the open-source packages that make this possible.\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faadya940%2Forbit","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Faadya940%2Forbit","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faadya940%2Forbit/lists"}