https://github.com/simonw/llm-evals-plugin

Run evals using LLM

Host: GitHub
URL: https://github.com/simonw/llm-evals-plugin
Owner: simonw
Created: 2024-04-20T18:28:49.000Z (about 2 years ago)
Default Branch: main
Last Pushed: 2024-04-21T15:49:00.000Z (about 2 years ago)
Last Synced: 2025-04-06T05:39:47.427Z (about 1 year ago)
Language: Python
Size: 6.84 KB
Stars: 22
Watchers: 6
Forks: 0
Open Issues: 8
Metadata Files:
- Readme: README.md

README

# llm-evals-plugin

[![PyPI](https://img.shields.io/pypi/v/llm-evals-plugin.svg)](https://pypi.org/project/llm-evals-plugin/)

[![Changelog](https://img.shields.io/github/v/release/simonw/llm-evals-plugin?include_prereleases&label=changelog)](https://github.com/simonw/llm-evals-plugin/releases)

[![Tests](https://github.com/simonw/llm-evals-plugin/actions/workflows/test.yml/badge.svg)](https://github.com/simonw/llm-evals-plugin/actions/workflows/test.yml)

[![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/simonw/llm-evals-plugin/blob/main/LICENSE)

Run evals against prompts using LLM.

**Very early alpha**: everything is likely to change.

## Installation

Install this plugin in the same environment as [LLM](https://llm.datasette.io/).

```bash
llm install llm-evals-plugin
```

## Usage

See [this issue comment](https://github.com/simonw/llm-evals-plugin/issues/1#issuecomment-2067916371).

## Development

To set up this plugin locally, first check out the code. Then create a new virtual environment:

```bash
cd llm-evals-plugin
python3 -m venv venv
source venv/bin/activate
```

Now install the dependencies and test dependencies:

```bash
llm install -e '.[test]'
```

To run the tests:

```bash
pytest
```
