https://github.com/zhuohaoyu/freeeval


Host: GitHub
URL: https://github.com/zhuohaoyu/freeeval
Owner: zhuohaoyu
Created: 2024-03-16T03:01:59.000Z (about 1 year ago)
Default Branch: main
Last Pushed: 2024-08-01T17:22:55.000Z (10 months ago)
Last Synced: 2024-08-02T09:59:45.875Z (10 months ago)
Language: Python
Size: 3.13 MB
Stars: 5
Watchers: 1
Forks: 2
Open Issues: 1
Metadata Files:
- Readme: README.md

**FreeEval: A Modular Framework for Trustworthy and Efficient Evaluation of Large Language Models**

------

Overview • Quick Start • Docs • Paper • Citation

## Overview

FreeEval is a modular and extensible framework for conducting trustworthy and efficient automatic evaluations of large language models (LLMs). The toolkit unifies various evaluation approaches, including dataset-based evaluators, reference-based metrics, and LLM-based evaluators, within a transparent and reproducible framework. FreeEval incorporates meta-evaluation techniques such as human evaluation and data contamination detection to enhance the reliability of evaluation results. The framework is built on a high-performance infrastructure that enables efficient large-scale evaluations across multi-node, multi-GPU clusters, supporting both open-source and proprietary LLMs. With its focus on modularity, trustworthiness, and efficiency, FreeEval aims to provide researchers with a standardized and comprehensive platform for gaining deeper insights into the capabilities and limitations of LLMs.

## Quick Start

To get started, first clone the repository and set up the environment:

```bash
git clone https://github.com/WisdomShell/FreeEval.git
cd FreeEval
pip install -r requirements.txt
```

All our evaluation pipelines are configured through JSON files, which hold every detail and hyper-parameter of a run.
For example, you can run ARC-Challenge with LLaMA-2 7B Chat using:

```bash
python run.py -c ./config/examples/arcc.json
```
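
Because a pipeline is fully described by its JSON config, it can help to look at a config before launching a run. The snippet below is a minimal sketch using only the Python standard library; `summarize_config` is a hypothetical helper written for this illustration, not part of FreeEval, and it makes no assumption about the config schema beyond the file being valid JSON.

```python
import json
from pathlib import Path


def summarize_config(path):
    """Return the sorted top-level keys of a JSON config.

    Hypothetical helper for illustration only; it is not a FreeEval API.
    If the file does not contain a JSON object, the type name is returned.
    """
    with Path(path).open(encoding="utf-8") as f:
        config = json.load(f)
    if isinstance(config, dict):
        return sorted(config.keys())
    return type(config).__name__


if __name__ == "__main__":
    # Path taken from the example command above; adjust it as needed.
    config_path = Path("./config/examples/arcc.json")
    if config_path.exists():
        print(summarize_config(config_path))
    else:
        print(f"Config not found: {config_path}")
```

Run it from the repository root so the relative path resolves; the printed keys show which sections the example config defines.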

## Docs

For more detailed usage, please refer to our [docs](https://freeeval.readthedocs.io/).

## Citation

✨ If you find our work helpful, please consider citing with:

```bibtex
@article{yu2024freeeval,
  title={FreeEval: A Modular Framework for Trustworthy and Efficient Evaluation of Large Language Models},
  author={Yu, Zhuohao and Gao, Chang and Yao, Wenjin and Wang, Yidong and Zeng, Zhengran and Ye, Wei and Wang, Jindong and Zhang, Yue and Zhang, Shikun},
  journal={arXiv preprint arXiv:2404.06003},
  year={2024}
}
```
