https://github.com/paraskevi-kivroglou/hackathon-llamaeval
LlamaEval is a rapid prototype developed during a hackathon to provide a user-friendly dashboard for evaluating and comparing Llama models using the TogetherAI API.
ai-benchmarks evaluation-metrics llama3 llms llms-benchmarking streamlit togetherai
- Host: GitHub
- URL: https://github.com/paraskevi-kivroglou/hackathon-llamaeval
- Owner: Paraskevi-KIvroglou
- License: mit
- Created: 2024-11-04T19:09:43.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-11-10T23:02:52.000Z (over 1 year ago)
- Last Synced: 2025-10-04T23:53:18.089Z (9 months ago)
- Topics: ai-benchmarks, evaluation-metrics, llama3, llms, llms-benchmarking, streamlit, togetherai
- Language: Python
- Homepage: https://llama-eval-container-app.niceglacier-ab6723e0.westeurope.azurecontainerapps.io
- Size: 66.8 MB
- Stars: 0
- Watchers: 1
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Llama Impact Hackathon
## LlamaEval: Quick Evaluation Dashboard
LlamaEval is a rapid prototype developed during a hackathon to provide a user-friendly dashboard for evaluating and comparing Llama models using the TogetherAI API.
### Features
- Model Selection: Choose from various Llama models available through TogetherAI.
- Benchmark Tasks: Evaluate models on predefined tasks such as question answering and text summarization.
- Performance Metrics: View accuracy, BLEU scores, and other relevant metrics for each model.
- User-Friendly Interface: Simple web interface for inputting prompts and viewing results.
- Quick Comparison: Easily compare the performance of different Llama models side-by-side.
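
To give a feel for the metrics listed above, here is a minimal sketch of exact-match accuracy and a simplified, unigram-only BLEU score. It is not taken from the LlamaEval code: the function names `exact_match_accuracy` and `bleu1` are hypothetical, and full BLEU combines several n-gram orders, so treat this as an illustration of the idea only.

```python
from collections import Counter
import math


def exact_match_accuracy(predictions, references):
    """Fraction of predictions equal to their reference after light normalization."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not predictions:
        return 0.0
    hits = sum(
        p.strip().lower() == r.strip().lower()
        for p, r in zip(predictions, references)
    )
    return hits / len(predictions)


def bleu1(candidate, reference):
    """Simplified BLEU: clipped unigram precision times the brevity penalty."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0
    ref_counts = Counter(ref)
    # Clip each candidate token count by its count in the reference,
    # so repeating a matching word does not inflate the score.
    overlap = sum(min(n, ref_counts[tok]) for tok, n in Counter(cand).items())
    precision = overlap / len(cand)
    # Penalize candidates that are shorter than the reference.
    brevity = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return precision * brevity
```

Accuracy suits tasks with a single correct answer, such as short-form question answering, while an overlap score like BLEU is more forgiving for free-form outputs such as summaries.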
### Note
For the prototype, we kept the benchmark small. In later steps, we plan to iterate on real-world datasets.
### Installation
1. Clone the repository.
2. Install dependencies:
   ```bash
   pip install -r requirements.txt
   ```
3. Set up your TogetherAI API key:
   ```bash
   export TOGETHERAI_API_KEY=your_api_key_here
   ```
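
On the Python side, the exported key can be read from the environment. The sketch below shows one way to do that and fail early with a readable message when the variable is missing. How `app.py` actually loads the key is not documented here, so the helper name `get_api_key` and its behaviour are assumptions.

```python
import os


def get_api_key(env=None):
    """Return the TogetherAI key, or raise a clear error if it is unset.

    `env` defaults to os.environ; passing a plain dict makes this easy to test.
    """
    env = os.environ if env is None else env
    key = env.get("TOGETHERAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "TOGETHERAI_API_KEY is not set; export it before starting the app."
        )
    return key
```

Failing at startup avoids a confusing authentication error later, when the first model request is sent.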
### Usage
1. Run the application:
   ```bash
   streamlit run app.py
   ```
2. Open your web browser and navigate to http://localhost:8501.
3. Select a Llama model, choose a benchmark task, and input your prompt.
4. View the results and performance metrics on the dashboard.
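
The workflow above (pick models, run a task, compare scores) can be sketched without any network access. In this illustration the models are stub callables with made-up names, and `compare_models` and `format_table` are hypothetical helpers, not functions from this repository. In the real dashboard each callable would wrap a request to a Llama model hosted on TogetherAI.

```python
def compare_models(models, task):
    """Run each model on every prompt and report exact-match accuracy.

    `models` maps a display name to a callable: prompt -> answer.
    `task` is a list of (prompt, expected_answer) pairs.
    """
    results = {}
    for name, generate in models.items():
        correct = 0
        for prompt, expected in task:
            answer = generate(prompt)
            if answer.strip().lower() == expected.strip().lower():
                correct += 1
        results[name] = correct / len(task) if task else 0.0
    return results


def format_table(results):
    """Render results as aligned text rows, best model first."""
    width = max((len(name) for name in results), default=0)
    rows = sorted(results.items(), key=lambda item: item[1], reverse=True)
    return "\n".join(f"{name:<{width}}  {score:.2f}" for name, score in rows)


# Stub models standing in for hosted Llama models (names are hypothetical).
qa_task = [("Capital of France?", "Paris"), ("2 + 2 = ?", "4")]
stub_models = {
    "llama-small-stub": lambda prompt: "Paris",
    "llama-large-stub": lambda prompt: "Paris" if "France" in prompt else "4",
}
scores = compare_models(stub_models, qa_task)
print(format_table(scores))
```

Keeping the model behind a plain callable means the comparison logic can be tested offline and reused when other providers are added.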
### Future Development
- Custom dataset uploads
- Support for additional AI models
- Advanced visualization of performance metrics
- Integration with other AI model providers
### Contributors
- Paraskevi Kivroglou ([LinkedIn](https://www.linkedin.com/in/paraskevi-kivroglou-925881292/))
- Mayank Varshney ([LinkedIn](https://www.linkedin.com/in/varsh-mayank/))
- Amina Asif ([LinkedIn](https://www.linkedin.com/in/amina-work/), [GitHub](https://github.com/AminaAsif9))
### License
This project is licensed under the MIT License. See the LICENSE file for details.