https://github.com/lila-team/ai-locators
AI Locators for Playwright. Built by the Lila team.
- Host: GitHub
- URL: https://github.com/lila-team/ai-locators
- Owner: lila-team
- License: mit
- Created: 2025-03-04T14:28:49.000Z (4 months ago)
- Default Branch: main
- Last Pushed: 2025-03-10T14:26:04.000Z (4 months ago)
- Last Synced: 2025-03-27T02:21:43.896Z (3 months ago)
- Topics: ai, llm, locators, node, playwright, python
- Language: JavaScript
- Homepage: https://lila.dev
- Size: 81.1 KB
- Stars: 8
- Watchers: 0
- Forks: 1
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
# AI Locators for Playwright
By [Lila](https://github.com/lila-team/lila)

[npm](https://www.npmjs.com/package/ai-locators) · [PyPI](https://pypi.org/project/ai-locators/) · [Twitter](https://twitter.com/lila__dev) · [Discord](https://discord.gg/kZ7TEmxH)
AI-powered selectors for Playwright, available for both Python and Node.js. These packages allow you to use natural language descriptions to locate elements on a webpage using LLM (Large Language Model) technology.
```javascript
// 👎 Complex XPath with multiple conditions
page.locator("//div[contains(@class, 'header')]//button[contains(@class, 'login') and not(@disabled) and contains(text(), 'Sign In')]");

// 😎 Using ai-locators
page.locator("ai=the login button in the header that says Sign In");
```

## Why?
* Locators written with `ai-locators` do not require maintenance when the page structure changes
* Native integration with Playwright

⚠️ **Warning**: This package is currently experimental and not intended for production use. It may have:
- Unpredictable behavior
- Performance overhead from LLM calls
- Potential security implications

We recommend using it for prototyping and testing purposes only.
## Supported Models
`ai-locators` works with flagship models for now. Smaller models proved not to be powerful enough for the selector generation task.
The following models have been tested:

- Sonnet 3.5
- Sonnet 3.7
- GPT-4o
- Google Gemini 2.0 Flash 001
- Meta LLaMA 3.3 70B Instruct

Any model served behind a compatible API can be used with ai-locators, but the models listed above have been thoroughly tested and are known to work well with the package.
## Node.js Package
### Installation
```bash
npm install ai-locators
```### Usage
```javascript
const { chromium } = require('playwright');
const { registerAISelector } = require('ai-locators');

const apiKey = process.env.OPENAI_API_KEY;
const baseUrl = process.env.OPENAI_BASE_URL;
const model = "gpt-4o";

(async () => {
  const browser = await chromium.launch({
    headless: false,
    args: ["--disable-web-security"] // Disable CORS so the LLM request works. Use at your own risk.
  });
  const page = await browser.newPage();

  await registerAISelector({
    apiKey: apiKey,
    model: model,
    baseUrl: baseUrl,
  });
  console.log("Registered AI selector");

  // Navigate to a page
  await page.goto("https://playwright.dev/");

  // Use the AI selector with natural language
  const element = page.locator("ai=get started button");
  await element.click();
  console.log("Clicked get started button");

  await browser.close();
})();
```

## Python Package
### Installation
```bash
pip install ai-locators
```### Usage
```python
import os

from playwright.sync_api import sync_playwright
from ai_locators import register_ai_selector

api_key = os.getenv("OPENAI_API_KEY")
base_url = os.getenv("OPENAI_BASE_URL")
model = "gpt-4o"

with sync_playwright() as p:
    # Disable web security so the LLM request works. Use at your own risk.
    browser = p.chromium.launch(headless=False, args=["--disable-web-security"])
    page = browser.new_page()

    # Register the AI selector
    register_ai_selector(p, api_key, base_url, model)

    # Navigate to a page
    page.goto("https://playwright.dev/")

    # Use the AI selector with natural language
    element = page.locator("ai=get started button")
    element.click()

    browser.close()
```

## Custom Prefix
You can customize the prefix used for AI selectors. By default, it's `ai=`, but you can change it to anything you prefer.
### In Node.js
```javascript
await registerAISelector({
  apiKey: "...",
  baseUrl: "...",
  model: "...",
  selectorPrefix: "find" // Now you can use "find=the login button"
});
```

### In Python
```python
register_ai_selector(
    p,
    api_key="...",
    base_url="...",
    model="...",
    selector_prefix="find"  # Now you can use "find=the login button"
)
```

## Plug in your LLM
The packages work with any OpenAI-compatible LLM endpoint. You just need to pass `model`, `api_key` and `base_url` when registering the selector.
For example:
### In Node.js
```javascript
// OpenAI
await registerAISelector({
  apiKey: "sk-...",
  baseUrl: "https://api.openai.com/v1",
  model: "gpt-4"
});

// Anthropic
await registerAISelector({
  apiKey: "sk-ant-...",
  baseUrl: "https://api.anthropic.com/v1",
  model: "claude-3-sonnet-20240229"
});

// Ollama
await registerAISelector({
  apiKey: "ollama", // not used but required
  baseUrl: "http://localhost:11434/v1",
  model: "llama2"
});

// Basically any OpenAI-compatible endpoint works
```

### In Python
```python
# OpenAI
register_ai_selector(
    p,
    api_key="sk-...",
    base_url="https://api.openai.com/v1",
    model="gpt-4"
)

# Anthropic
register_ai_selector(
    p,
    api_key="sk-ant-...",
    base_url="https://api.anthropic.com/v1",
    model="claude-3-sonnet-20240229"
)

# Ollama
register_ai_selector(
    p,
    api_key="ollama",  # not used but required
    base_url="http://localhost:11434/v1",
    model="llama2"
)

# Basically any OpenAI-compatible endpoint works
```

## How it works
`ai-locators` uses the custom selector engine feature from Playwright: https://playwright.dev/docs/extensibility
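A custom selector engine is a plain object with `query` and `queryAll` functions that Playwright evaluates inside the page. As an illustration of the extension point (not ai-locators' actual implementation), here is a toy engine that matches elements by a hypothetical `data-test` attribute:

```javascript
// A Playwright custom selector engine: an object with query/queryAll
// functions that run in the page. This toy engine matches by data-test
// attribute; ai-locators plugs LLM-driven resolution into this same hook.
const createDataTestEngine = () => ({
  // Return the first element in root's subtree whose data-test matches.
  query(root, selector) {
    return root.querySelector(`[data-test="${selector}"]`);
  },
  // Return all matching elements in root's subtree.
  queryAll(root, selector) {
    return Array.from(root.querySelectorAll(`[data-test="${selector}"]`));
  },
});

// Registration is done once, before creating pages:
// const { selectors } = require('playwright');
// await selectors.register('data-test', createDataTestEngine);
// Afterwards: page.locator('data-test=login-button')
```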
Each time a locator needs to be resolved, an LLM call is used to generate the appropriate selector.

## Best practices
### Narrowing Down Selectors
For better performance and reliability, it's recommended to first locate a known container element using standard selectors, then use the AI selector within that container. This approach:
- Reduces the search space for the AI
- Improves accuracy by providing more context
- Reduces LLM token usage
- Results in faster element location