https://github.com/onkernel/ts-stagehand-google-cua-agent
A Kernel application that demonstrates Computer Use Agent (CUA) capabilities using Google's Gemini 2.5 model with Stagehand for browser automation.
- Host: GitHub
- URL: https://github.com/onkernel/ts-stagehand-google-cua-agent
- Owner: onkernel
- Created: 2025-10-08T03:09:30.000Z (4 months ago)
- Default Branch: main
- Last Pushed: 2025-10-08T03:14:46.000Z (4 months ago)
- Last Synced: 2025-10-08T05:34:36.214Z (4 months ago)
- Language: TypeScript
- Size: 8.79 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Kernel TypeScript SDK + Stagehand + Gemini Computer Use Agent
A Kernel application that demonstrates Computer Use Agent (CUA) capabilities using Google's Gemini 2.5 model with Stagehand for browser automation.
https://github.com/user-attachments/assets/d683f527-be61-4551-9745-1144db088127
## What It Does
This app uses [Gemini 2.5's computer use model](https://blog.google/technology/google-deepmind/gemini-computer-use-model/) to autonomously navigate websites and complete tasks. The example task finds Kernel's company page on Y Combinator and writes a blog post about its product.
## Setup
1. **Copy the environment file:**
   ```bash
   cp .env-example .env
   ```
2. **Add your API keys to `.env`:**
   - `KERNEL_API_KEY` - Get from [Kernel dashboard](https://dashboard.onkernel.com/sign-in)
   - `GOOGLE_API_KEY` - Get from [Google AI Studio](https://aistudio.google.com/apikey)
   - `OPENAI_API_KEY` - Get from [OpenAI platform](https://platform.openai.com/api-keys)
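
A missing or empty key usually shows up only as an authentication error partway through a run, so it can help to check the three variables up front. The following is a minimal sketch, not code from this repository: `REQUIRED_KEYS` and `missingKeys` are hypothetical names, and the helper takes the environment as a parameter so it makes no assumption about how `.env` gets loaded.

```typescript
// Hypothetical helper (not part of this repo): the key names come from the
// Setup list above; everything else is illustrative.
const REQUIRED_KEYS = ["KERNEL_API_KEY", "GOOGLE_API_KEY", "OPENAI_API_KEY"] as const;

// Returns the names of required keys that are unset or blank in `env`.
function missingKeys(env: Record<string, string | undefined>): string[] {
  return REQUIRED_KEYS.filter((key) => !env[key]?.trim());
}
```

Calling `missingKeys(process.env)` at the top of the script and exiting with a message when the result is non-empty would surface a misconfigured `.env` before any browser is created.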
## Running Locally
Execute the script directly with tsx:
```bash
npx tsx index.ts
```
This runs the agent without a Kernel invocation context and provides the browser live view URL for debugging.
## Deploying to Kernel
1. **Deploy the application:**
   ```bash
   kernel deploy index.ts --env-file .env
   ```
2. **Invoke the action:**
   ```bash
   kernel invoke ts-stagehand-google-cua-agent google-cua-agent-task
   ```
The action creates a Kernel-managed browser and associates it with the invocation for tracking and monitoring.
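
The difference between the local run and the deployed action comes down to whether an invocation exists to attach the browser to. The sketch below shows one way a single entry point could cover both cases. It is an illustration only: `BrowserCreateParams` and `buildBrowserParams` are hypothetical, and the `invocation_id` field name is an assumption about what the Kernel SDK accepts, not something this README confirms.

```typescript
// Hypothetical parameter shape; `invocation_id` is an assumed field name.
interface BrowserCreateParams {
  invocation_id?: string;
}

// Local runs (`npx tsx index.ts`) have no invocation, so the field is omitted.
// Deployed runs pass the invocation ID so the browser can be tracked with it.
function buildBrowserParams(invocationId?: string): BrowserCreateParams {
  return invocationId ? { invocation_id: invocationId } : {};
}
```

In a real app the resulting object would be passed to whichever browser-creation call the Kernel SDK provides; the Kernel Stagehand Guide linked under Documentation is the place to check the actual API.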
## Documentation
- [Kernel Documentation](https://docs.onkernel.com/quickstart)
- [Kernel Stagehand Guide](https://www.onkernel.com/docs/integrations/stagehand)
- [Gemini 2.5 CUA Stagehand Example](https://github.com/browserbase/stagehand/blob/main/examples/cua-example.ts)