https://github.com/dluo96/lawful-good

Benchmark for assessing legal capabilities of LLM agents

Host: GitHub
URL: https://github.com/dluo96/lawful-good
Owner: dluo96
Created: 2024-10-23T22:17:51.000Z (6 months ago)
Default Branch: main
Last Pushed: 2024-12-16T17:30:36.000Z (4 months ago)
Last Synced: 2025-01-30T13:18:13.805Z (3 months ago)
Language: Python
Size: 892 KB
Stars: 0
Watchers: 3
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

Awesome Lists containing this project

awesome_ai_agents - Lawful-Good - Benchmark for assessing legal capabilities of LLM agents (Building / Benchmarks)

README

# Lawful Good

## Get Started
Clone the repo:
```bash
git clone git@github.com:dluo96/lawful-good.git
```

Install the `inspect_ai` and `inspect_evals` Python packages:
```bash
pip install inspect_ai
pip install git+https://github.com/UKGovernmentBEIS/inspect_evals
```

Install the package locally:
```bash
cd lawful-good
pip install -e .
```
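
To confirm the installation steps above worked, a quick import check can help. The snippet below is a minimal sketch, not part of this repo: the helper name `missing_packages` is hypothetical, and it assumes the two packages are importable under the module names `inspect_ai` and `inspect_evals`.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the top-level module names in `names` that cannot be found."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    # Assumed module names for the packages installed above.
    required = ["inspect_ai", "inspect_evals"]
    missing = missing_packages(required)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

If anything is reported as missing, re-run the corresponding `pip install` command in the same Python environment.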

### Example Evaluation with Ollama
- Install [Ollama](https://ollama.com/download). On Linux, run:
```bash
curl -fsSL https://ollama.com/install.sh | sh
```
- Run (say) Llama 3.2 3B:
```bash
ollama run llama3.2
```
This will automatically pull (download) Llama 3.2 3B for you and then run it.
- To run Inspect evals with Ollama, you will need to install the `openai` package:
```bash
pip install openai
```
- Run the example evaluation script:
```bash
inspect eval lg/lawful_good.py --model ollama/llama3.2
```
- NOTE: if you receive the error `PermissionError: [Errno 13] Permission denied: '/run/user/1000'`, try setting the following environment variable:
```bash
export XDG_RUNTIME_DIR=/tmp
```
and re-running.
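
If you prefer to launch the evaluation from Python, the steps above can be sketched as follows. This is an illustration under assumptions, not code from this repo: the helpers `runtime_env` and `eval_command` are hypothetical, and the fallback uses `tempfile.gettempdir()` (usually `/tmp` on Linux) rather than hard-coding `/tmp`.

```python
import os
import tempfile


def runtime_env(base_env=None):
    """Return a copy of the environment with a writable XDG_RUNTIME_DIR.

    If the variable is unset or points at a directory we cannot write to,
    fall back to the system temporary directory (an assumption that mirrors
    the `export XDG_RUNTIME_DIR=/tmp` workaround).
    """
    env = dict(os.environ if base_env is None else base_env)
    runtime_dir = env.get("XDG_RUNTIME_DIR")
    if not runtime_dir or not os.access(runtime_dir, os.W_OK):
        env["XDG_RUNTIME_DIR"] = tempfile.gettempdir()
    return env


def eval_command(task="lg/lawful_good.py", model="ollama/llama3.2"):
    """Build the same `inspect eval` command line as shown above."""
    return ["inspect", "eval", task, "--model", model]


if __name__ == "__main__":
    # Print what would be run; nothing is launched here.
    print(" ".join(eval_command()))
    print("XDG_RUNTIME_DIR =", runtime_env()["XDG_RUNTIME_DIR"])
```

To actually start the run, pass both pieces to `subprocess.run(eval_command(), env=runtime_env(), check=True)` from the repo root, with Ollama already serving the model.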

### Analyse evaluation runs with Inspect view
```shell
inspect view --log-dir lg/logs
```
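
To see which runs are available before opening the viewer, you can list the log directory. The snippet below is a rough sketch: the helper `latest_logs` is hypothetical, and it assumes logs are written to `lg/logs` as `.eval` or `.json` files, which may differ depending on your Inspect version and settings.

```python
from pathlib import Path


def latest_logs(log_dir="lg/logs", limit=5, suffixes=(".eval", ".json")):
    """Return up to `limit` log files from `log_dir`, newest first."""
    directory = Path(log_dir)
    if not directory.is_dir():
        return []
    logs = [
        path
        for path in directory.iterdir()
        if path.is_file() and path.suffix in suffixes
    ]
    # Sort by modification time so the most recent run comes first.
    logs.sort(key=lambda path: path.stat().st_mtime, reverse=True)
    return logs[:limit]


if __name__ == "__main__":
    for path in latest_logs():
        print(path.name)
```

An empty result usually means no evaluation has been run yet, or that the logs were written to a different directory.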
