Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/nkalupahana/cs8395-test-suite
https://github.com/nkalupahana/cs8395-test-suite
Last synced: 3 days ago
JSON representation
- Host: GitHub
- URL: https://github.com/nkalupahana/cs8395-test-suite
- Owner: nkalupahana
- License: agpl-3.0
- Created: 2023-09-24T02:51:40.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2023-12-17T00:08:46.000Z (about 1 year ago)
- Last Synced: 2024-11-10T01:42:24.966Z (2 months ago)
- Language: Python
- Size: 20.5 KB
- Stars: 0
- Watchers: 2
- Forks: 13
- Open Issues: 18
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# Test Suite Runner
*Made for Fall 2023 CS 8395-08*This is a test suite runner for LLM assessments.
## Installation
```
pip3 install .
```## Usage
```
usage: main.py [-h] --repos REPOS [--model MODEL] [--config-view] [--tags TAGS [TAGS ...]]Run LLM test suites
options:
-h, --help show this help message and exit
--repos REPOS .repos file to import
--model MODEL Model to run all test suites against
--config-view Without running tests, get config breakdown of all repos
--tags TAGS [TAGS ...]
Use repos that have one or more of these tags (space separated)
```An example `.repos` file can be viewed at `sample.repos` (vcstool format).
## Test Suite Configuration
When test suite commands are run, they are passed a `--model` parameter, which contains the model they should use. Here is an example usage, which all test suites should implement some version of at a minimum:
```py
import sys
from llm_test_helpers import get_llm, get_argsargs = get_args(sys.argv)
llm = get_llm(args.model)
```Test suites are also required to have a `config.json` file:
```json
{
"name": "name",
"model": "default model to run, see llm_test_helpers/__init__.py for implemented options",
"run_test": "Command that is run first, for generating content (e.g. python3 run_tests.py)",
"run_score": "Command that is run second, for scoring generated content (optional if one command does it all)",
"tags": ["any", "tags", "here"],
...
}
```This file can have optional keys, which can store prompts and the like that can be read like this:
```py
raw_prompt = json.loads(open("config.json").read())["prompt"]
prompt = PromptTemplate.from_template(raw_prompt)
```Using `PromptTemplate` allows for the use of variables in the prompt, while still allowing it to be stored as a string in the config file.
Finally, the commands you provide are required to create an `output.json` file when they're done running. This file should, at a minimum, contain an `output` key with a score from 0 - 100. They can also contain subscores in additional keys. Outputs are summarized on the command line, and written to `output.json` in the repo directory.