{"id":23170653,"url":"https://github.com/pyladiesams/eval-llm-based-apps-jan2025","last_synced_at":"2025-05-12T23:11:56.595Z","repository":{"id":267655634,"uuid":"901946734","full_name":"pyladiesams/eval-llm-based-apps-jan2025","owner":"pyladiesams","description":"Create an evaluation framework for your LLM based app. Incorporate it into your test suite. Lay the monitoring foundation.","archived":false,"fork":false,"pushed_at":"2025-05-06T09:43:34.000Z","size":12209,"stargazers_count":7,"open_issues_count":0,"forks_count":5,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-05-06T10:51:04.099Z","etag":null,"topics":["llm","llm-eval","llm-evals","llm-evaluation-framework","llm-evaluation-metrics","llm-monitoring","llm-test","llm-testing","llmops","llms","workshop"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/pyladiesams.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2024-12-11T15:59:52.000Z","updated_at":"2025-05-06T09:43:37.000Z","dependencies_parsed_at":"2024-12-11T17:19:19.787Z","dependency_job_id":"1859fb44-db6f-4fc3-96c9-308bf93f5b22","html_url":"https://github.com/pyladiesams/eval-llm-based-apps-jan2025","commit_stats":null,"previous_names":["pyladiesams/eval-llm-based-apps-jan2025"],"tags_count":0,"template":false,"template_full_name":"pyladiesams/workshoptopic-mmmYYYY","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pyladiesams%2Feval-llm-based-apps-jan2025","tags_url":"https://repos.ecosyste
.ms/api/v1/hosts/GitHub/repositories/pyladiesams%2Feval-llm-based-apps-jan2025/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pyladiesams%2Feval-llm-based-apps-jan2025/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pyladiesams%2Feval-llm-based-apps-jan2025/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/pyladiesams","download_url":"https://codeload.github.com/pyladiesams/eval-llm-based-apps-jan2025/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253837456,"owners_count":21971984,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["llm","llm-eval","llm-evals","llm-evaluation-framework","llm-evaluation-metrics","llm-monitoring","llm-test","llm-testing","llmops","llms","workshop"],"created_at":"2024-12-18T04:14:25.470Z","updated_at":"2025-05-12T23:11:56.587Z","avatar_url":"https://github.com/pyladiesams.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"\n# Evaluating LLM-based applications\n\n## Workshop description\nIt is so easy and quick to build a shiny PoC using LLMs and it is so hard to turn it into a production-grade LLM application. 
To succeed, you need a robust evaluation framework that you will use both during development and after deployment of your LLM-based app.\n\nThis workshop focuses on understanding evaluation-driven development and the architecture of an LLM-based app, building an evaluation framework for an LLM-based app, establishing a test suite with evals, and laying the monitoring foundations for it, all by leveraging open-source Python libraries.\n\n## Requirements\n### General requirements\n* basic Python knowledge\n* basic understanding of ML testing\n* basic understanding of ML monitoring\n\n### Optional requirements\n* [uv](https://docs.astral.sh/uv/) for dependency management\n* Google account if you want to use [Google Colab](https://colab.research.google.com/)\n\n## Usage\n\n### with uv\nRun the following code:\n```bash\ngit clone https://github.com/pyladiesams/eval-llm-based-apps-jan2025.git\ncd eval-llm-based-apps-jan2025\n\n# create the virtual environment (.venv) and install dependencies\nuv sync\n```\n### with Google Colab\n1. Visit [Google Colab](https://colab.research.google.com/)\n2. In the top-left corner, select \"File\" \u0026#8594; \"Open Notebook\"\n3. Under \"GitHub\", enter the URL of this workshop's repo\n4. Select one of the notebooks within the repo.\n5. At the top of the notebook, add a Code cell and run the following code:\n```bash\n!git clone https://github.com/pyladiesams/eval-llm-based-apps-jan2025.git\n%cd eval-llm-based-apps-jan2025\n!pip install -r requirements.txt\n```\n## Video recording\nRe-watch [this YouTube stream](https://www.youtube.com/live/phpQ5hmC08E?feature=shared)\n\n## Credits\nThis workshop was set up by @pyladiesams and @una-gal\n\n## Appendix\n### Pre-Commit Hooks\nTo keep the code consistently formatted, PyLadies Amsterdam uses pre-commit hooks. You can enable them by running `pre-commit install`. 
\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpyladiesams%2Feval-llm-based-apps-jan2025","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpyladiesams%2Feval-llm-based-apps-jan2025","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpyladiesams%2Feval-llm-based-apps-jan2025/lists"}