An open API service indexing awesome lists of open source software.

https://github.com/kyungw00k/stealth-wright

Silent browser automation CLI with stealth capabilities
https://github.com/kyungw00k/stealth-wright

crawler go playwright stealth-automation

Last synced: 24 days ago
JSON representation

Silent browser automation CLI with stealth capabilities

Awesome Lists containing this project

README

          

# sw (Stealth Wright)

> Silent browser automation CLI with stealth capabilities

## Overview

**sw** (Stealth Wright) is a browser automation CLI that provides:

- **Playwright CLI-compatible UX** - Familiar commands and output format
- **Stealth Mode** - Built-in bot detection evasion via seleniumbase-go
- **Element References** - ARIA snapshot with inline `[ref=eN]` addressing
- **Session Management** - Multiple isolated browser sessions via daemon
- **AI-Friendly** - Skill system for coding agents

## Installation

### Homebrew

```bash
brew install kyungw00k/cli/sw
```

### From Source

```bash
# Clone the repository
git clone https://github.com/kyungw00k/stealth-wright.git
cd stealth-wright

# Build
go build -o build/sw ./cmd/sw

# Install to ~/.local/bin
make install
```

## Quick Start

```bash
# Open browser
sw open https://example.com

# Take snapshot (shows ARIA tree with element references)
sw snapshot

# Interact using [ref=eN] references
sw click e5
sw fill e1 "user@example.com"

# Or use semantic locators (no snapshot needed)
sw click --role button --text "Sign In"
sw fill --label "Email" "user@example.com"

# Annotated screenshot (overlays ref numbers on elements)
sw screenshot --annotate

# Close browser
sw close
```

## Semantic Locators

sw supports two ways to target elements:

### 1. Ref-based (via snapshot)

Run `sw snapshot` first to assign `[ref=eN]` references to all visible elements, then use them in commands:

```bash
sw snapshot
# → - textbox "Email" [ref=e1]
# → - textbox "Password" [ref=e2]
# → - button "Sign In" [ref=e3]

sw fill e1 "user@example.com"
sw fill e2 "secret"
sw click e3
```

### 2. Semantic (no snapshot needed)

Target elements directly by ARIA role, visible text, label, or placeholder:

```bash
sw fill --label "Email" "user@example.com"
sw fill --placeholder "Password" "secret"
sw click --role button --text "Sign In"
```

Use `sw find` to discover elements before interacting:

```bash
sw find --role button
# → ### Found 3 element(s)
# → - [ref=e3] "Sign In"
# → - [ref=e7] "Register"
# → - [ref=e12] "Forgot password"
```

### Annotated Screenshot

Capture a screenshot with `[eN]` ref numbers overlaid on every element — useful for multimodal AI models:

```bash
sw screenshot --annotate
```

---

## Snapshot Format

Snapshots use Playwright-compatible ARIA tree format with inline `[ref=eN]` element references:

```yaml
- heading "Example Domain" [level=1] [ref=e1]
- paragraph [ref=e2]: This domain is for use in illustrative examples.
- link "More information..." [ref=e3]
```

Use the ref value to target elements in subsequent commands:

```bash
sw click e3
sw fill e1 "search query"
```

## Commands

### Navigation

```bash
sw open [url] # Open browser and navigate
sw close # Close browser session
sw goto # Navigate to URL
sw go-back # Go back
sw go-forward # Go forward
sw reload # Reload page
```

### Snapshot & Screenshot

```bash
sw snapshot # Generate ARIA snapshot with element refs
sw screenshot # Take screenshot
sw screenshot --annotate # Screenshot with ref number overlays on elements
sw screenshot --full-page # Full page screenshot
sw screenshot --filename out.png # Save to specific file
sw pdf [filename] # Save page as PDF
```

### Interaction

Ref-based (requires prior `sw snapshot`):
```bash
sw click [button] # Click element (button: left/right/middle)
sw dblclick # Double-click element
sw hover # Hover over element
sw fill # Fill input field
sw check # Check checkbox/radio
sw uncheck # Uncheck checkbox
```

Semantic (no snapshot needed — target by ARIA role, text, or label):
```bash
sw click --role button --text "Submit"
sw click --label "Close dialog"
sw fill --label "Email" "user@example.com"
sw fill --placeholder "Search..." "query"
sw hover --role link --text "About"
sw check --label "Remember me"

# Find elements matching criteria
sw find --role button
sw find --text "Sign In"
sw find --label "Username"
sw find --placeholder "Password"
```

Flags for semantic targeting:
```
--role ARIA role: button, link, textbox, checkbox, heading, etc.
--text Visible text content (partial match)
--label aria-label attribute (partial match)
--placeholder Input placeholder (partial match)
--exact Require exact string match
```

Other interaction commands:
```bash
sw type # Type into focused element
sw press # Press key (e.g. Enter, Tab)
sw keydown # Key down event
sw keyup # Key up event
sw select # Select dropdown option
sw upload # Upload file
sw drag # Drag and drop
sw eval # Evaluate JavaScript
```

### Mouse

```bash
sw mousemove <x> <y> # Move mouse
sw mousedown <x> <y> # Mouse button down
sw mouseup <x> <y> # Mouse button up
sw mousewheel <x> <y> # Mouse wheel scroll
```

### Dialogs

```bash
sw dialog-accept [text] # Accept dialog (with optional input)
sw dialog-dismiss # Dismiss dialog
```

### Tabs

```bash
sw tab-list # List open tabs
sw tab-new [url] # Open new tab
sw tab-close [index] # Close tab
sw tab-select <index> # Switch to tab
```

### Cookies

```bash
sw cookie-list # List all cookies
sw cookie-get <name> # Get cookie by name
sw cookie-set <name> <value> [--domain] [--path] [--expires] [--httpOnly] [--secure] [--sameSite]
sw cookie-delete <name> # Delete cookie
sw cookie-clear # Clear all cookies
```

### Local Storage

```bash
sw localstorage-list # List all entries
sw localstorage-get <key> # Get value
sw localstorage-set <key> <value> # Set value
sw localstorage-delete <key> # Delete entry
sw localstorage-clear # Clear all entries
```

### Session Storage

```bash
sw sessionstorage-list # List all entries
sw sessionstorage-get <key> # Get value
sw sessionstorage-set <key> <value> # Set value
sw sessionstorage-delete <key> # Delete entry
sw sessionstorage-clear # Clear all entries
```

### State & Data

```bash
sw state-save [path] # Save browser state to file
sw state-load [path] # Load browser state from file
sw delete-data # Clear all browser data
sw resize <width> <height> # Resize browser window
```

### Video Recording

```bash
sw video-start # Start recording (saves WebM to current directory)
sw video-stop # Stop recording and save file
```

### Developer Tools

```bash
sw install-browser [--browser <type>] # Install browser binaries (chromium/firefox/webkit)
sw devtools-start # Open browser DevTools (headed+Chromium only)
```

### Session Management

```bash
sw list # List all sessions
sw kill-all # Kill all browser sessions
sw close-all # Close all browser sessions
sw daemon start # Start daemon manually
sw daemon stop # Stop daemon
sw daemon status # Check daemon status
```

## Global Options

```bash
-s, --session <name> Session name (default: "default")
-b, --browser <type> Browser: chromium, firefox, webkit
--headed Run in headed mode
--persistent Persist profile to disk
--profile <path> Custom profile directory
--config <file> Config file path
--stealth Enable stealth mode (default: true)
--no-stealth Disable stealth mode
```

## Stealth Features

Stealth mode is powered by [seleniumbase-go](https://github.com/kyungw00k/seleniumbase-go) which launches Chrome with fingerprint evasion arguments:

- Hide WebDriver automation markers
- Randomized browser fingerprints
- Custom User-Agent strings
- WebGL and Canvas fingerprint spoofing
- WebRTC leak prevention

## Architecture

```
sw (Stealth Wright)

├── CLI (cobra) # Command-line interface
├── Daemon (Unix socket) # Background browser process
├── Browser Abstraction Layer # Pluggable browser backends
│ └── seleniumbase-go (current) # Playwright + stealth via seleniumbase-go
└── Snapshot Generator # ARIA tree with [ref=eN] annotations
```

### Protocol

Communication between CLI and Daemon uses JSON-RPC 2.0 over Unix sockets.

```json
{
"jsonrpc": "2.0",
"id": 1,
"method": "click",
"params": {"ref": "e5"}
}
```

## AI Skill System

sw supports the [Agent Skills](https://vercel.com/docs/agent-resources/skills) open standard — a single skill works across 40+ AI coding agents (Claude Code, Cursor, Codex, opencode, GitHub Copilot, and more).

### Install via npx skills (recommended)

```bash
npx skills add kyungw00k/stealth-wright
```

This installs the `sw` skill to all detected AI tools on your system (Claude Code, Codex, opencode, etc.) using symlinks to a shared canonical location.

### Install manually (no Node.js required)

```bash
# Install sw skill files to .claude/skills/sw/ in the current project
sw install --skills
```

### Example Usage with Claude Code

```
User: Use sw to open example.com and click the first link

Claude:
1. sw open https://example.com
2. sw snapshot
→ - link "More information..." [ref=e3]
3. sw click e3
```

## Comparison

| Feature | Playwright CLI | sw |
|---------|---------------|-----|
| Daemon Sessions | ✅ | ✅ |
| Element References (`[ref=eN]`) | ✅ | ✅ |
| ARIA Snapshot Format | ✅ | ✅ |
| Semantic Locators (`--role/--text/--label`) | ✅ | ✅ |
| Annotated Screenshots | ❌ | ✅ |
| Stealth Mode | ❌ | ✅ |
| Video Recording | ✅ | ✅ |
| AI Skills | ✅ | ✅ |
| Go Native Binary | ❌ | ✅ |

## Development

```bash
# Build
go build -o build/sw ./cmd/sw

# Run unit tests
go test ./...

# Run integration tests (requires built binary)
go build -o build/sw ./cmd/sw
go test -tags integration -v ./test/...

# Lint
go vet ./...

# Format
go fmt ./...
```

## Project Structure

```
sw/
├── cmd/sw/main.go # CLI entry point (cobra commands)
├── internal/
│ ├── browser/ # Browser interface abstraction
│ ├── client/ # Daemon JSON-RPC client
│ ├── daemon/ # Daemon server + command handlers
│ ├── drivers/seleniumbase/ # seleniumbase-go driver implementation
│ ├── session/ # Session lifecycle management
│ └── snapshot/ # ARIA snapshot + [ref=eN] annotation
├── pkg/protocol/ # JSON-RPC request/response types
├── skills/sw/SKILL.md # AI agent skill file
├── test/ # Integration tests
└── README.md
```

## Claude Code Skill

A Claude Code skill is available for AI agents to use stealth-wright automatically.

```
/plugin marketplace add kyungw00k/skills
/plugin install cli-tools@kyungw00k-skills
```

## License

MIT