https://github.com/dluo96/lawful-good
Benchmark for assessing legal capabilities of LLM agents
- Host: GitHub
- URL: https://github.com/dluo96/lawful-good
- Owner: dluo96
- Created: 2024-10-23T22:17:51.000Z (4 months ago)
- Default Branch: main
- Last Pushed: 2024-12-16T17:30:36.000Z (about 2 months ago)
- Last Synced: 2025-01-06T00:31:23.613Z (about 1 month ago)
- Language: Python
- Size: 892 KB
- Stars: 0
- Watchers: 3
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
- awesome_ai_agents - Lawful-Good - Benchmark for assessing legal capabilities of LLM agents (Building / Benchmarks)
README
# Lawful Good
## Get Started
Clone the repo:
```bash
git clone git@github.com:dluo96/lawful-good.git
```

Install the `inspect_ai` and `inspect_evals` Python packages:
```bash
pip install inspect_ai
pip install git+https://github.com/UKGovernmentBEIS/inspect_evals
```

Install the package locally:
```bash
cd lawful-good
pip install -e .
```

### Example Evaluation with Ollama
- Install [ollama](https://ollama.com/download). On Linux, run:
```bash
curl -fsSL https://ollama.com/install.sh | sh
```
- Run (say) Llama 3.2 3B:
```bash
ollama run llama3.2
```
This automatically pulls (downloads) Llama 3.2 3B and then runs it.
- To run Inspect evals with Ollama, you will need to install the `openai` package:
```bash
pip install openai
```
- Run the example evaluation script:
```bash
inspect eval lg/lawful_good.py --model ollama/llama3.2
```
- NOTE: if you receive the error `PermissionError: [Errno 13] Permission denied: '/run/user/1000'`, try setting the following environment variable:
```bash
export XDG_RUNTIME_DIR=/tmp
```
and re-running.

### Analyse evaluation runs with Inspect view
```shell
inspect view --log-dir lg/logs
```
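
The evaluation and viewer commands above, together with the `XDG_RUNTIME_DIR` workaround, could be wrapped in a small Python launcher. The following is only a sketch, not part of this repository. The names `with_runtime_dir_fallback` and `run_workflow` are hypothetical. It also assumes that falling back to the system temporary directory is acceptable whenever `XDG_RUNTIME_DIR` is unset or not writable, which is broader than the note above (that note applies the workaround only after the error appears).

```python
import os
import subprocess
import tempfile

# Commands copied verbatim from the steps in this README.
EVAL_CMD = ["inspect", "eval", "lg/lawful_good.py", "--model", "ollama/llama3.2"]
VIEW_CMD = ["inspect", "view", "--log-dir", "lg/logs"]


def with_runtime_dir_fallback(env, fallback=None):
    """Return a copy of `env`, replacing XDG_RUNTIME_DIR with `fallback`
    when the current value is unset or is not a writable directory.

    Assumption: the fallback defaults to the system temp directory
    (typically /tmp on Linux, matching the workaround in this README).
    """
    env = dict(env)
    fallback = fallback or tempfile.gettempdir()
    current = env.get("XDG_RUNTIME_DIR")
    usable = bool(current) and os.path.isdir(current) and os.access(current, os.W_OK)
    if not usable:
        env["XDG_RUNTIME_DIR"] = fallback
    return env


def run_workflow():
    """Run the example evaluation, then open Inspect view on lg/logs.

    Requires `inspect` on PATH and a running Ollama server with the
    llama3.2 model available; raises CalledProcessError on failure.
    """
    env = with_runtime_dir_fallback(os.environ)
    subprocess.run(EVAL_CMD, env=env, check=True)
    subprocess.run(VIEW_CMD, env=env, check=True)
```

Calling `run_workflow()` from the repository root runs both steps in sequence. The environment is copied rather than modified in place, so the fallback affects only the child processes.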