Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/CYXup6/ADL-2023-Final
Last synced: 14 days ago
JSON representation
- Host: GitHub
- URL: https://github.com/CYXup6/ADL-2023-Final
- Owner: CYXup6
- Created: 2023-12-27T14:46:49.000Z (11 months ago)
- Default Branch: main
- Last Pushed: 2024-01-03T08:43:19.000Z (11 months ago)
- Last Synced: 2024-08-01T18:28:01.532Z (3 months ago)
- Language: Python
- Size: 14.4 MB
- Stars: 2
- Watchers: 1
- Forks: 2
- Open Issues: 0
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# ADL 2023 final project
This repository provides an overview of the project's structure and serves as a guide to running the experiments in our work *"Can LLMs Do It All? Towards Crafting Multi-Agent LLM Evaluators by LLMs"*. This project is built on top of [ChatEval](https://github.com/thunlp/ChatEval) [1].

- Reference
  1. [ChatEval: Towards Better LLM-based Evaluators through Multi-Agent Debate](https://arxiv.org/abs/2308.07201)

## Project Structure
The main folders and files are structured as follows:
```
.
├─ ...
├─ agentverse
│  ├─ ...
│  └─ tasks
│     ├─ ...
│     └─ llm_eval
│        ├─ data
│        │  ├─ ...
│        │  └─ faireval
│        │     └─ preprocessed_data
│        │        └─ test.json  // This file contains the dataset (Vicuna Bench & responses)
│        └─ multi_role
│           └─ only_static_assign
│              ├─ ...
│              └─ faireval
│                 ├─ ...
│                 └─ two_turns_sequential
│                    └─ two_different_role
│                       └─ calc_score_comparison
│                          └─ gpt_35_0301
│                             └─ config.yaml  // This file contains the personas for the multi-agents and the experimental settings
├─ llm_eval.py  // This is the main entry point for the experiments
└─ scripts
   └─ llm_eval
      ├─ ...
      └─ multi_role
         └─ only_static_assign
            ├─ ...
            └─ faireval
               ├─ ...
               └─ two_turns_sequential
                  └─ two_different_role
                     ├─ calc_score_comparison
                     │  └─ gpt_35_0301.sh  // This is the script for running the experiments
                     └─ calc_score_comparison_reverse
                        └─ gpt_35_0301.sh  // This is the script for running the experiments with reversed responses to conduct position calibration
```
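The two annotated files above are the main artifacts you will interact with. As a quick way to inspect them from the repository root, the following is a minimal sketch: it pretty-prints a sample of `test.json` and shows the current `config.yaml` (the exact JSON schema of `test.json` is not documented here, so the snippet only peeks at the data).

```bash
# Peek at the Vicuna Bench dataset (prints the first entry if the file is a JSON list)
python -c "import json; d = json.load(open('agentverse/tasks/llm_eval/data/faireval/preprocessed_data/test.json')); print(json.dumps(d[0] if isinstance(d, list) else d, indent=2)[:2000])"

# Show the personas and experimental settings used for the gpt_35_0301 runs
cat agentverse/tasks/llm_eval/multi_role/only_static_assign/faireval/two_turns_sequential/two_different_role/calc_score_comparison/gpt_35_0301/config.yaml
```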
## Description of how to run the code

In our work, all experiments are carried out using `gpt_35_0301.sh`. However, we alter the `config.yaml` located under `./agentverse/tasks/llm_eval/multi_role/only_static_assign/faireval/two_turns_sequential/two_different_role/calc_score_comparison/gpt_35_0301` for the various experimental setups and distinct personas.
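For reference, a typical invocation from the repository root might look like the sketch below. This assumes the environment required by ChatEval (e.g. an exported `OPENAI_API_KEY`) is already set up; check the scripts themselves for the exact arguments they pass to `llm_eval.py`.

```bash
# Run the main multi-agent evaluation experiment
bash ./scripts/llm_eval/multi_role/only_static_assign/faireval/two_turns_sequential/two_different_role/calc_score_comparison/gpt_35_0301.sh

# Run the same setup with reversed response order for position calibration
bash ./scripts/llm_eval/multi_role/only_static_assign/faireval/two_turns_sequential/two_different_role/calc_score_comparison_reverse/gpt_35_0301.sh
```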
## Customized Persona

- First, switch to the `customize_persona` directory.
```bash
cd ./customize_persona
```

- Then, run the following command to generate the customized personas.
```bash
bash ./scripts/generate_persona.sh
```

- Finally, run the experiment with the generated personas as follows.
```bash
bash ./scripts/llm_eval.sh
```

- Note that, as in the base experiments, we need to alter the `config.yaml` under the `./agentverse/tasks/llm_eval/multi_role/only_static_assign/faireval/two_turns_sequential/two_different_role/calc_score_comparison/gpt_35_0301` directory for the various experimental setups and distinct personas; a sample of that workflow is sketched below.
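As an illustration of that workflow, the sketch below backs up the current configuration and swaps in an alternative one before re-running. The file name `config_with_generated_personas.yaml` is purely hypothetical and stands for whichever variant of `config.yaml` you prepared; adjust relative paths depending on whether you run from the repository root or from `customize_persona`.

```bash
CFG_DIR=./agentverse/tasks/llm_eval/multi_role/only_static_assign/faireval/two_turns_sequential/two_different_role/calc_score_comparison/gpt_35_0301

# Back up the current configuration, then swap in the variant with the generated personas
cp "$CFG_DIR/config.yaml" "$CFG_DIR/config.yaml.bak"
cp ./config_with_generated_personas.yaml "$CFG_DIR/config.yaml"   # hypothetical file name

# Re-run the experiment with the new personas (same command as in the step above)
bash ./scripts/llm_eval.sh
```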