https://github.com/browsemake/browser-cli
AI agent browser automation tool that just works
https://github.com/browsemake/browser-cli
ai ai-agents browser-automation cli command-line-tool llm playwright
Last synced: 6 days ago
JSON representation
AI agent browser automation tool that just works
- Host: GitHub
- URL: https://github.com/browsemake/browser-cli
- Owner: browsemake
- License: mit
- Created: 2025-07-01T21:08:55.000Z (12 months ago)
- Default Branch: main
- Last Pushed: 2026-04-12T02:31:32.000Z (2 months ago)
- Last Synced: 2026-04-12T04:24:17.382Z (2 months ago)
- Topics: ai, ai-agents, browser-automation, cli, command-line-tool, llm, playwright
- Language: JavaScript
- Homepage:
- Size: 83 KB
- Stars: 78
- Watchers: 0
- Forks: 4
- Open Issues: 2
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-cli-apps-in-a-csv - Browser CLI - AI agent browser automation tool. (<a name="ai"></a>AI / ChatGPT)
- awesome-cli-apps - Browser CLI - AI agent browser automation tool. (<a name="ai"></a>AI / ChatGPT)
README
Browser CLI
[](https://discord.gg/N7crMvEX)
[](https://x.com/intent/user?screen_name=browse_make)
`br` is a command line tool used by any capable LLM agent, like ChatGPT, [Claude Code](https://github.com/anthropics/claude-code) or [Gemini CLI](https://github.com/google-gemini/gemini-cli).
https://www.npmjs.com/package/@browsemake/browser-cli
## Why Broswer CLI?
- **Just works**: simply browser automation, coding not required, leave the rest workflow to the most powerful LLM agent
- **AI first**: designed for LLM agent, readable view from HTML, and error hint
- **Secure**: can be run locally, no credential passed to LLM
- **Robust**: browser persisted progress across session, and track history action for replay
## Install
```bash
npm install -g @browsemake/browser-cli
```
## Usage
Type instruction to AI agent (Gemini CLI / Claude Code / ChatGPT):
```
> You have browser automation tool 'br', use it to go to amazon to buy me a basketball
```
Use command line directly by human:
```bash
br start
br goto https://github.com/
```
## Demos
Grocery (Go to Amazon and buy me a basketball)
Navigate to GitHub repo:
Print invoice
Download bank account statement
Search for job posting
## Features
- **Browser Action**: Comprehensive action for browser automation (navigation, click, etc.)
- **LLM friendly output**: LLM friendly command output with error correction hint
- **Daemon mode**: Always-on daemon mode so it lives across multiple LLM sessions
- **Structured web page view**: Accessibility tree view for easier LLM interpretation than HTML
- **Secret management**: Secret management to isolate password from LLM
- **History tracking**: History tracking for replay and scripting
## Command
### Start the daemon
```bash
br start
```
If starting the daemon fails (for example due to missing Playwright browsers),
the CLI prints the error output so you can diagnose the issue.
### Navigate to a URL
```bash
br goto https://example.com
```
### Click an element
```bash
br click "button.submit"
```
Commands that accept a CSS selector (like `click`, `fill`, `scrollIntoView`, `type`) can also accept a numeric ID. These IDs are displayed in the output of `br view-tree` and allow for direct interaction with elements identified in the tree.
### Scroll element into view
```bash
br scrollIntoView "#footer"
```
### Scroll to percentage of page
```bash
br scrollTo 50
```
### Fill an input field
```bash
br fill "input[name='q']" "search text"
```
### Fill an input field with a secret
```bash
MY_SECRET="top-secret" br fill-secret "input[name='password']" MY_SECRET
```
When retrieving page HTML with `br view-html`, any text provided via
`fill-secret` is masked to avoid exposing secrets.
### Type text into an input
```bash
br type "input[name='q']" "search text"
```
### Press a key
```bash
br press Enter
```
### Scroll next/previous chunk
```bash
br nextChunk
br prevChunk
```
### View page HTML
```bash
br view-html
```
### View action history
```bash
br history
```
### Clear action history
```bash
br clear-history
```
### Capture a screenshot
```bash
br screenshot
```
### View accessibility and DOM tree
```bash
br view-tree
```
Outputs a hierarchical tree combining accessibility roles with DOM element
information. It also builds an ID-to-XPath map for quick element lookup.
### List open tabs
```bash
br tabs
```
### Switch to a tab by index
```bash
br switch-tab 1
```
### Stop the daemon
```bash
br stop
```
The daemon runs a headless Chromium browser and exposes a small HTTP API. The CLI communicates with it to perform actions like navigation and clicking elements.