https://github.com/auto-browse/auto-browse-ts

Auto-Browse: AI Enabled Browser Automation
ai ai-test-generator ai-testing-tool auto-browser automation browser-agent browser-automation browser-use computer-use langchain llm mcp openai playwright test-automation testing

Host: GitHub
URL: https://github.com/auto-browse/auto-browse-ts
Owner: auto-browse
Created: 2025-03-28T16:23:47.000Z (3 months ago)
Default Branch: main
Last Pushed: 2025-04-09T02:50:31.000Z (3 months ago)
Last Synced: 2025-05-07T21:46:57.295Z (about 2 months ago)
Topics: ai, ai-test-generator, ai-testing-tool, auto-browser, automation, browser-agent, browser-automation, browser-use, computer-use, langchain, llm, mcp, openai, playwright, test-automation, testing
Language: TypeScript
Homepage: https://auto-browse.com/
Size: 750 KB
Stars: 5
Watchers: 0
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

README

# Auto-Browse: AI Enabled Browser Automation

**Auto Browse** is the easiest way to connect your AI agents with the browser using natural language.

[![Auto-Browse Launch Video](https://img.youtube.com/vi/VxJg3RRShoY/maxresdefault.jpg)](https://youtu.be/VxJg3RRShoY)

🎥 [Watch the launch video](https://youtu.be/VxJg3RRShoY)

## Quick start

Auto Browse is an AI-powered browser automation agent. It automates browser tasks and writes Playwright tests, letting you interact with web pages in natural language.

## Examples

Check out our [TypeScript BDD Example Repository](https://github.com/auto-browse/auto-browse-typescript-bdd-example) to see a complete implementation using Auto Browse with BDD testing patterns.

## Installation

```bash
npm install @auto-browse/auto-browse
```

## ⚠️ Important: Playwright Version Requirements

> **Note:** Auto Browse currently requires specific versions of Playwright. This requirement will be relaxed in future versions.

### Required Versions

```json
"@playwright/test": "1.52.0-alpha-1743011787000",
"playwright": "1.52.0-alpha-1743011787000"
```

### Version Conflicts

If you're using Auto Browse alongside an existing Playwright setup, you must upgrade to these specific versions. Here's how to handle common issues:

1. **Installation Conflicts**

   ```bash
   npm install --legacy-peer-deps
   ```

   This flag helps resolve peer dependency conflicts during installation.

2. **Multiple Playwright Versions**

   - Remove existing Playwright installations

   - Clear npm cache if needed: `npm cache clean --force`

   - Reinstall with the required versions

3. **Project Compatibility**

   - Update your project's Playwright configuration

   - Ensure your existing tests are compatible with the alpha version

   - Consider using a separate test environment if needed

> 🔄 Future releases will support a wider range of Playwright versions. Subscribe to our GitHub repository for updates.
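Before reinstalling, it can help to confirm which Playwright versions your `package.json` actually declares. The following is a minimal sketch, not part of Auto Browse: `findVersionMismatches` is a hypothetical helper, and it assumes you pass it the merged `dependencies` and `devDependencies` of your project. It compares exact strings, so a range such as `^1.52.0` is reported as a mismatch.

```typescript
// Version required by Auto Browse, copied from the "Required Versions" section.
const REQUIRED_PLAYWRIGHT_VERSION = "1.52.0-alpha-1743011787000";
const PLAYWRIGHT_PACKAGES = ["@playwright/test", "playwright"];

// Hypothetical helper: returns one message per package whose declared
// version is missing or differs from the required one (exact string match).
function findVersionMismatches(deps: Record<string, string | undefined>): string[] {
	const problems: string[] = [];
	for (const name of PLAYWRIGHT_PACKAGES) {
		const declared = deps[name];
		if (declared !== REQUIRED_PLAYWRIGHT_VERSION) {
			const found = declared === undefined ? "nothing" : declared;
			problems.push(`${name}: expected ${REQUIRED_PLAYWRIGHT_VERSION}, found ${found}`);
		}
	}
	return problems;
}

// Example input: one package is on an older release, the other is correct.
const mismatches = findVersionMismatches({
	"@playwright/test": "1.51.0",
	playwright: "1.52.0-alpha-1743011787000"
});
console.log(mismatches);
```

An empty result means both packages are pinned to the required alpha version.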

## Configuration

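Auto Browse reads its LLM (large language model) configuration from environment variables. Create a `.env` file in your project root. The block below lists the variables for every supported provider; set only those for the provider you use: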

```env
# OpenAI (default)
OPENAI_API_KEY=your_openai_api_key_here
LLM_PROVIDER=openai  # Optional, defaults to openai
AUTOBROWSE_LLM_MODEL=gpt-4o-mini  # Optional, defaults to gpt-4o-mini

# Google AI
GOOGLE_API_KEY=your_google_key_here
LLM_PROVIDER=google
AUTOBROWSE_LLM_MODEL=gemini-2.0-flash-lite

# Azure OpenAI
AZURE_OPENAI_API_KEY=your_azure_key_here
AZURE_OPENAI_ENDPOINT=https://your-endpoint.openai.azure.com/
AZURE_OPENAI_API_VERSION=2024-12-01-preview
AZURE_OPENAI_API_DEPLOYMENT_NAME=your-deployment-name
LLM_PROVIDER=azure

# Anthropic
ANTHROPIC_API_KEY=your_anthropic_key_here
LLM_PROVIDER=anthropic
AUTOBROWSE_LLM_MODEL=claude-3

# Google Vertex AI
GOOGLE_APPLICATION_CREDENTIALS=path/to/credentials.json
LLM_PROVIDER=vertex

# Ollama
BASE_URL=http://localhost:11434  # Optional, defaults to this value
LLM_PROVIDER=ollama
AUTOBROWSE_LLM_MODEL=llama3.1
```

You can find an example configuration in `example.env`.

### Environment Variables

| Variable                           | Description                            | Default                  | Required For |
| ---------------------------------- | -------------------------------------- | ------------------------ | ------------ |
| `LLM_PROVIDER`                     | LLM provider to use                    | `openai`                 | No           |
| `AUTOBROWSE_LLM_MODEL`             | The LLM model to use                   | `gpt-4o-mini`            | No           |
| `OPENAI_API_KEY`                   | OpenAI API key                         | -                        | OpenAI       |
| `GOOGLE_API_KEY`                   | Google AI API key                      | -                        | Google AI    |
| `AZURE_OPENAI_API_KEY`             | Azure OpenAI API key                   | -                        | Azure        |
| `AZURE_OPENAI_ENDPOINT`            | Azure OpenAI endpoint URL              | -                        | Azure        |
| `AZURE_OPENAI_API_VERSION`         | Azure OpenAI API version               | `2024-12-01-preview`     | Azure        |
| `AZURE_OPENAI_API_DEPLOYMENT_NAME` | Azure OpenAI deployment name           | -                        | Azure        |
| `ANTHROPIC_API_KEY`                | Anthropic API key                      | -                        | Anthropic    |
| `GOOGLE_APPLICATION_CREDENTIALS`   | Path to Google Vertex credentials file | -                        | Vertex AI    |
| `BASE_URL`                         | Ollama API endpoint                    | `http://localhost:11434` | No           |
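A quick pre-flight check can catch a missing key before any browser starts. The sketch below is not part of Auto Browse: `missingVariables` and `REQUIRED_BY_PROVIDER` are hypothetical names, the mapping is read off the "Required For" column, and `AZURE_OPENAI_API_VERSION` is treated as optional on the assumption that its listed default applies when it is unset.

```typescript
type Env = Record<string, string | undefined>;

// Variables without a default, grouped by the LLM_PROVIDER value that needs them.
// Assumption: AZURE_OPENAI_API_VERSION is left out because the table lists a default.
const REQUIRED_BY_PROVIDER: Record<string, string[]> = {
	openai: ["OPENAI_API_KEY"],
	google: ["GOOGLE_API_KEY"],
	azure: ["AZURE_OPENAI_API_KEY", "AZURE_OPENAI_ENDPOINT", "AZURE_OPENAI_API_DEPLOYMENT_NAME"],
	anthropic: ["ANTHROPIC_API_KEY"],
	vertex: ["GOOGLE_APPLICATION_CREDENTIALS"],
	ollama: []
};

// Hypothetical helper: lists the required variables that are unset or empty
// for the selected provider. Falls back to "openai", the documented default.
function missingVariables(env: Env): string[] {
	const provider = env["LLM_PROVIDER"] || "openai";
	const required = REQUIRED_BY_PROVIDER[provider];
	if (!required) {
		throw new Error(`Unknown LLM_PROVIDER: ${provider}`);
	}
	return required.filter((name) => !env[name]);
}

// Example: Azure selected, but only the API key is set.
console.log(missingVariables({ LLM_PROVIDER: "azure", AZURE_OPENAI_API_KEY: "key" }));
```

In a Node.js project you would pass `process.env` after loading your `.env` file.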

## Supported LLM Providers

Auto Browse supports multiple LLM providers:

- OpenAI (default) - GPT-4 and compatible models

- Google AI - Gemini models

- Azure OpenAI - GPT models on Azure

- Anthropic - Claude models

- Google Vertex AI - PaLM and Gemini models

- Ollama - Run models locally

## Usage

### Standalone Mode (Without Playwright Test)

Auto Browse can be used outside of a Playwright test context. Here's a complete form automation example:

```typescript
import { auto } from "@auto-browse/auto-browse";

async function main() {
	try {
		// Navigate to the form
		await auto("go to https://httpbin.org/forms/post");

		// Take a snapshot to analyze the page structure
		await auto("take a snapshot");

		// Fill out the form
		await auto('type "John Doe" in the customer name field');
		await auto('select "Large" for size');
		await auto('select "Mushroom" for topping');
		await auto('check "cheese" in extras');

		// Submit the form
		await auto("click the Order button");

		// Take a snapshot of the response page
		await auto("take a snapshot of the response page");
	} catch (error) {
		console.error("Error:", error);
	}
}

// Run the script
main().catch(console.error);
```

In standalone mode, Auto Browse automatically:

- Manages browser lifecycle

- Creates and configures pages

- Handles cleanup

To run standalone scripts:

```bash
npx ts-node your-script.ts
```
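For longer scripts, it can be useful to keep the instructions in a list and record which one failed. The following is a sketch under stated assumptions, not part of the Auto Browse API: `runSteps`, `StepResult`, and `fakeRunner` are hypothetical names, and the `Runner` type assumes `auto` accepts an instruction string and returns a promise, as in the example above. A stand-in runner is used so the snippet runs without a browser.

```typescript
// Assumed shape of `auto` in standalone mode: instruction in, promise out.
type Runner = (instruction: string) => Promise<unknown>;

interface StepResult {
	instruction: string;
	ok: boolean;
	error?: string;
}

// Hypothetical helper: runs instructions in order and stops at the first failure,
// so later steps never act on a page that is in an unexpected state.
async function runSteps(steps: string[], run: Runner): Promise<StepResult[]> {
	const results: StepResult[] = [];
	for (const instruction of steps) {
		try {
			await run(instruction);
			results.push({ instruction, ok: true });
		} catch (err) {
			const message = err instanceof Error ? err.message : String(err);
			results.push({ instruction, ok: false, error: message });
			break;
		}
	}
	return results;
}

// Stand-in runner for demonstration only; it fails on any instruction mentioning "Order".
const fakeRunner: Runner = async (instruction) => {
	if (instruction.indexOf("Order") !== -1) {
		throw new Error("button not found");
	}
};

const demoSteps = [
	"go to https://httpbin.org/forms/post",
	"click the Order button",
	"take a snapshot of the response page"
];

runSteps(demoSteps, fakeRunner).then((results) => console.log(results));
```

To drive a real browser, pass `auto` imported from `@auto-browse/auto-browse` in place of `fakeRunner`.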

### Playwright Test Mode

```typescript
import { test, expect } from "@playwright/test";
import { auto } from "@auto-browse/auto-browse";

test("example test", async ({ page }) => {
	await page.goto("https://example.com");

	// Get text using natural language
	const headerText = await auto("get the header text", { page });

	// Type in an input using natural language
	await auto('type "Hello World" in the search box', { page });

	// Click elements using natural language
	await auto("click the login button", { page });
});
```

### Auto-Detection Mode

The package automatically detects the current page context, so you can skip passing the page parameter:

```typescript
import { test, expect } from "@playwright/test";
import { auto } from "@auto-browse/auto-browse";

test("simplified example", async ({ page }) => {
	await page.goto("https://example.com");

	// No need to pass page parameter
	const headerText = await auto("get the header text");
	await auto('type "Hello World" in the search box');
	await auto("click the login button");
});
```

### BDD Mode with Playwright-BDD

Auto Browse seamlessly integrates with [playwright-bdd](https://github.com/vitalets/playwright-bdd) for behavior-driven development. This allows you to write expressive feature files and implement steps using natural language commands.

#### Example Feature File

```gherkin
# features/homepage.feature
Feature: Playwright Home Page

  Scenario: Check title
    Given navigate to https://playwright.dev
    When click link "Get started"
    Then assert title "Installation"
```

#### Step Definitions

```typescript
import { auto } from "@auto-browse/auto-browse";
import { Given, When as aistep, Then } from "./fixtures";

// Generic step that handles any natural language action
aistep(/^(.*)$/, async ({ page }, action: string) => {
	await auto(action, { page });
});
```
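The single step definition covers every step because the pattern `/^(.*)$/` captures the whole step text as one group, and that group is what gets forwarded to `auto`. The sketch below shows this in plain TypeScript, independent of playwright-bdd. It assumes, as in Cucumber-style matching, that the `Given`/`When`/`Then` keyword is stripped before the pattern is applied.

```typescript
// The catch-all pattern from the step definition above.
const catchAll = /^(.*)$/;

// Step texts from the example feature file, without their Gherkin keywords.
const stepTexts = [
	"navigate to https://playwright.dev",
	'click link "Get started"',
	'assert title "Installation"'
];

// For each step, extract the captured group: the string that would be passed as `action`.
const forwarded = stepTexts.map((text) => {
	const match = catchAll.exec(text);
	return match ? match[1] : null;
});

console.log(forwarded);
```

Because the pattern matches any text, a more specific step definition added later could conflict with it, so this approach fits best when all steps are handled by natural language.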

#### Setup Requirements

1. Install dependencies:

```bash
npm install --save-dev @playwright/test @cucumber/cucumber playwright-bdd
```

2. Configure `playwright.config.ts`:

```typescript
import { PlaywrightTestConfig } from "@playwright/test";

const config: PlaywrightTestConfig = {
	testDir: "./features",
	use: {
		baseURL: "https://playwright.dev"
	}
};

export default config;
```

This integration enables:

- Natural language test scenarios

- Reusable step definitions

- Cucumber reporter integration

- Built-in Playwright context management

### Supported Actions

1. **Clicking Elements**

   ```typescript
   await auto("click the submit button");
   await auto("click the link that says Learn More");
   ```

2. **Typing Text**

   ```typescript
   await auto('type "username" in the email field');
   await auto('enter "password123" in the password input');
   ```

## Features

Core Features:

- Natural language commands for browser automation

- AI-powered computer and browser agent

- Automate any browser task

- Automatic page/context detection

- TypeScript support

- Playwright test integration

- Zero configuration required

Supported Operations:

- Page Navigation (goto URL, back, forward)

- Element Interactions (click, type, hover, drag-and-drop)

- Form Handling (select options, file uploads, form submission)

- Visual Verification (snapshots, screenshots, PDF export)

- Keyboard Control (key press, text input)

- Wait and Timing Control

- Assertions and Validation

## Best Practices

1. **Be Descriptive**

   ```typescript
   // Good
   await auto("click the submit button in the login form");

   // Less Clear
   await auto("click submit");
   ```

2. **Use Quotes for Input Values**

   ```typescript
   // Good
   await auto('type "John Doe" in the name field');

   // Not Recommended
   await auto("type John Doe in the name field");
   ```

3. **Leverage Existing Labels**

   - Use actual labels and text from your UI in commands

   - Maintain good accessibility practices in your app for better automation
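When input values come from test data rather than literals, a small helper keeps the quoting consistent with practice 2. This is a minimal sketch: `typeCommand` is a hypothetical helper, not part of Auto Browse, and it does not handle values that themselves contain double quotes.

```typescript
// Hypothetical helper: builds a "type" instruction with the value wrapped in
// double quotes, so the model can tell the value apart from the field description.
function typeCommand(value: string, field: string): string {
	return `type "${value}" in the ${field}`;
}

// Example: produces the same instruction as the "Good" example above.
const command = typeCommand("John Doe", "name field");
console.log(command);
```

The resulting string can be passed to `auto` directly, for example `await auto(typeCommand(user.name, "name field"))`, where `user` is your own test data.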

## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Thanks to the Playwright team for creating Playwright MCP, and to the playwright-bdd maintainers for playwright-bdd.

## License

MIT
