https://github.com/pyladiesams/eval-llm-based-apps-jan2025
Create an evaluation framework for your LLM based app. Incorporate it into your test suite. Lay the monitoring foundation.
- Host: GitHub
- URL: https://github.com/pyladiesams/eval-llm-based-apps-jan2025
- Owner: pyladiesams
- License: mit
- Created: 2024-12-11T15:59:52.000Z (6 months ago)
- Default Branch: main
- Last Pushed: 2025-05-06T09:43:34.000Z (about 1 month ago)
- Last Synced: 2025-05-06T10:51:04.099Z (about 1 month ago)
- Topics: llm, llm-eval, llm-evals, llm-evaluation-framework, llm-evaluation-metrics, llm-monitoring, llm-test, llm-testing, llmops, llms, workshop
- Language: Jupyter Notebook
- Homepage:
- Size: 11.6 MB
- Stars: 7
- Watchers: 1
- Forks: 5
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Evaluating LLM-based applications
## Workshop description
It is so easy and quick to build a shiny PoC using LLMs, and so hard to turn it into a production-grade LLM application. To succeed, you need a robust evaluation framework that you will use during both development and post-deployment of your LLM-based app. This workshop focuses on understanding evaluation-driven development and the architecture of an LLM-based app, building an evaluation framework for it, establishing a test suite with evals, and laying the monitoring foundations. All of it by leveraging Python OSS libraries.
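To give a rough idea of what "a test suite with evals" can look like, here is a minimal sketch. It is not the workshop's actual framework: `fake_llm_app`, `EvalCase`, `keyword_score`, `run_evals`, and the 0.8 threshold are all hypothetical names and values chosen for illustration, and the keyword-overlap metric is a deliberately simple stand-in for whatever metrics you end up using.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class EvalCase:
    """One evaluation example: a prompt and keywords the answer should contain."""
    prompt: str
    expected_keywords: list[str]


def keyword_score(answer: str, expected_keywords: list[str]) -> float:
    """Return the fraction of expected keywords found in the answer (case-insensitive)."""
    if not expected_keywords:
        return 1.0
    answer_lower = answer.lower()
    hits = sum(1 for kw in expected_keywords if kw.lower() in answer_lower)
    return hits / len(expected_keywords)


def run_evals(app: Callable[[str], str], cases: list[EvalCase]) -> float:
    """Run every case through the app and return the mean keyword score."""
    if not cases:
        raise ValueError("at least one eval case is required")
    scores = [keyword_score(app(c.prompt), c.expected_keywords) for c in cases]
    return sum(scores) / len(scores)


def fake_llm_app(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM call, so the sketch runs offline."""
    canned = {
        "What is the capital of France?": "The capital of France is Paris.",
        "Name a Python package manager.": "You could use uv or pip.",
    }
    return canned.get(prompt, "I don't know.")


# Hypothetical eval dataset; in practice this would be much larger.
CASES = [
    EvalCase("What is the capital of France?", ["Paris"]),
    EvalCase("Name a Python package manager.", ["uv", "pip"]),
]


def test_app_meets_quality_threshold():
    """pytest-style test: fail the suite if the mean score drops below the threshold."""
    assert run_evals(fake_llm_app, CASES) >= 0.8
```

Because the eval is an ordinary test function, a test runner such as pytest can pick it up alongside the rest of your tests. The same `run_evals` score could also be logged over time as a starting point for monitoring.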
## Requirements
### General requirements
* basic Python knowledge
* basic understanding of ML testing
* basic understanding of ML monitoring
### Optional requirements
* [uv](https://docs.astral.sh/uv/) for dependency management
* Google account if you want to use [Google Colab](https://colab.research.google.com/)
## Usage
### with uv
Run the following code:
```bash
git clone https://github.com/pyladiesams/eval-llm-based-apps-jan2025.git
cd eval-llm-based-apps-jan2025

# create and activate venv, install dependencies
uv sync
```
### with Google Colab
1. Visit [Google Colab](https://colab.research.google.com/)
2. In the top left corner, select "File" → "Open Notebook"
3. Under "GitHub", enter the URL of the repo of this workshop
4. Select one of the notebooks within the repo.
5. At the top of the notebook, add a Code cell and run the following code:
```bash
!git clone https://github.com/pyladiesams/eval-llm-based-apps-jan2025.git
%cd eval-llm-based-apps-jan2025
!pip install -r requirements.txt
```
## Video recording
Re-watch [this YouTube stream](https://www.youtube.com/live/phpQ5hmC08E?feature=shared)
## Credits
This workshop was set up by @pyladiesams and @una-gal.
## Appendix
### Pre-Commit Hooks
To ensure our code looks beautiful, PyLadies Amsterdam uses pre-commit hooks. You can enable them by running `pre-commit install`.