An open API service indexing awesome lists of open source software.

https://github.com/paraskevi-kivroglou/hackathon-llamaeval

LlamaEval is a rapid prototype developed during a hackathon to provide a user-friendly dashboard for evaluating and comparing Llama models using the TogetherAI API.
https://github.com/paraskevi-kivroglou/hackathon-llamaeval

ai-benchmarks evaluation-metrics llama3 llms llms-benchmarking streamlit togetherai

Last synced: 4 months ago
JSON representation

LlamaEval is a rapid prototype developed during a hackathon to provide a user-friendly dashboard for evaluating and comparing Llama models using the TogetherAI API.

Awesome Lists containing this project

README

          

# Llama Impact Hackathon

## LlamaEval: Quick Evaluation Dashboard
LlamaEval is a rapid prototype developed during a hackathon to provide a user-friendly dashboard for evaluating and comparing Llama models using the TogetherAI API.

Features

Model Selection: Choose from various Llama models available through TogetherAI.
Benchmark Tasks: Evaluate models on predefined tasks such as question answering and text summarization.
Performance Metrics: View accuracy, BLEU scores, and other relevant metrics for each model.
User-Friendly Interface: Simple web interface for inputting prompts and viewing results.
Quick Comparison: Easily compare the performance of different Llama models side-by-side.

Note

For the prototype, we kept the size of the benchmark small. In later, steps we plan to iterate on top of real-world datasets.

Installation

1.Clone the repository

2.Install dependencies

bash
pip install -r requirements.txt
3. Set up your TogetherAI API key:

bash
export TOGETHERAI_API_KEY=your_api_key_here

Usage

Run the application:

bash
streamlit run app.py

Open your web browser and navigate to http://localhost:8501.
Select a Llama model, choose a benchmark task, and input your prompt.
View the results and performance metrics on the dashboard.

Future Development

- Custom dataset uploads

- Support for additional AI models

- Advanced visualization of performance metrics

- Integration with other AI model providers

Contributors
- [Paraskevi Kivroglou] [https://www.linkedin.com/in/paraskevi-kivroglou-925881292/]
- [Mayank Varshney] [LinkedIn : https://www.linkedin.com/in/varsh-mayank/]
- [Amina Asif] [LinkedIn : https://www.linkedin.com/in/amina-work/] [GitHub:https://github.com/AminaAsif9]

License

This project is licensed under the MIT License - see the LICENSE file for details.