{"id":13426977,"url":"https://github.com/evidentlyai/evidently","last_synced_at":"2025-05-13T15:03:50.408Z","repository":{"id":36994699,"uuid":"315977578","full_name":"evidentlyai/evidently","owner":"evidentlyai","description":"Evidently is ​​an open-source ML and LLM observability framework. Evaluate, test, and monitor any AI-powered system or data pipeline. From tabular data to Gen AI. 100+ metrics.","archived":false,"fork":false,"pushed_at":"2025-04-16T22:47:44.000Z","size":290092,"stargazers_count":6050,"open_issues_count":208,"forks_count":666,"subscribers_count":47,"default_branch":"main","last_synced_at":"2025-04-17T09:27:06.103Z","etag":null,"topics":["data-drift","data-quality","data-science","data-validation","generative-ai","hacktoberfest","html-report","jupyter-notebook","llm","llmops","machine-learning","mlops","model-monitoring","pandas-dataframe"],"latest_commit_sha":null,"homepage":"https://discord.gg/xZjKRaNp8b","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/evidentlyai.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2020-11-25T15:20:08.000Z","updated_at":"2025-04-17T08:53:20.000Z","dependencies_parsed_at":"2023-12-20T07:26:44.254Z","dependency_job_id":"e9960926-ddce-4357-a4d7-7f4ef1695f84","html_url":"https://github.com/evidentlyai/evidently","commit_stats":{"total_commits":2116,"total_committers":79,"mean_commits":26.78481012658228,"dds":0.7188090737240076,"last_synced_commit":"5f8716897d31036d3424e6a9d61e7b1662aebf5c"},"previous_names":[],"tags_cou
nt":103,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/evidentlyai%2Fevidently","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/evidentlyai%2Fevidently/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/evidentlyai%2Fevidently/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/evidentlyai%2Fevidently/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/evidentlyai","download_url":"https://codeload.github.com/evidentlyai/evidently/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":249826709,"owners_count":21330674,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data-drift","data-quality","data-science","data-validation","generative-ai","hacktoberfest","html-report","jupyter-notebook","llm","llmops","machine-learning","mlops","model-monitoring","pandas-dataframe"],"created_at":"2024-07-31T00:01:49.980Z","updated_at":"2025-04-22T11:28:01.787Z","avatar_url":"https://github.com/evidentlyai.png","language":"Jupyter Notebook","readme":"\u003ch1 align=\"center\"\u003eEvidently\u003c/h1\u003e\n\n\u003cp align=\"center\"\u003e\u003cb\u003eAn open-source framework to evaluate, test and monitor ML and LLM-powered systems.\u003c/b\u003e\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n\u003ca href=\"https://pepy.tech/project/evidently\" target=\"_blank\"\u003e\u003cimg src=\"https://pepy.tech/badge/evidently\" alt=\"PyPi 
Downloads\"\u003e\u003c/a\u003e\n\u003ca href=\"https://github.com/evidentlyai/evidently/blob/main/LICENSE\" target=\"_blank\"\u003e\u003cimg src=\"https://img.shields.io/github/license/evidentlyai/evidently\" alt=\"License\"\u003e\u003c/a\u003e\n\u003ca href=\"https://pypi.org/project/evidently/\" target=\"_blank\"\u003e\u003cimg src=\"https://img.shields.io/pypi/v/evidently\" alt=\"PyPi\"\u003e\u003c/a\u003e\n\n![Evidently](/images/gh_header.png)\n\n\u003c/p\u003e\n\u003cp align=\"center\"\u003e\n  \u003ca href=\"https://docs.evidentlyai.com\"\u003eDocumentation\u003c/a\u003e\n  |\n  \u003ca href=\"https://discord.gg/xZjKRaNp8b\"\u003eDiscord Community\u003c/a\u003e\n  |\n  \u003ca href=\"https://evidentlyai.com/blog\"\u003eBlog\u003c/a\u003e\n  |\n  \u003ca href=\"https://twitter.com/EvidentlyAI\"\u003eTwitter\u003c/a\u003e\n  |\n  \u003ca href=\"https://www.evidentlyai.com/register\"\u003eEvidently Cloud\u003c/a\u003e\n\u003c/p\u003e\n\n# :bar_chart: What is Evidently?\n\nEvidently is an open-source Python library to evaluate, test, and monitor ML and LLM systems—from experiments to production.\n\n* 🔡 Works with tabular and text data.\n* ✨ Supports evals for predictive and generative tasks, from classification to RAG.\n* 📚 100+ built-in metrics from data drift detection to LLM judges.\n* 🛠️ Python interface for custom metrics.\n* 🚦 Both offline evals and live monitoring.\n* 💻 Open architecture: easily export data and integrate with existing tools.\n\nEvidently is very modular. You can start with one-off evaluations or host a full monitoring service.\n\n## 1. 
Reports and Test Suites\n\n**Reports** compute and summarize various data, ML and LLM quality evals.\n* Start with Presets and built-in metrics or customize.\n* Best for experiments, exploratory analysis and debugging.\n* View interactive Reports in Python, export them as JSON, a Python dictionary, or HTML, or view them in the monitoring UI.\n\nTurn any Report into a **Test Suite** by adding pass/fail conditions.\n* Best for regression testing, CI/CD checks, or data validation.\n* Zero-setup option: auto-generate test conditions from the reference dataset.\n* Simple syntax to set test conditions as `gt` (greater than), `lt` (less than), etc.\n\n| Reports |\n|--|\n|![Report example](https://github.com/evidentlyai/docs/blob/eb1630cdd80d31d55921ff4d34fc7b5e6e9c9f90/images/concepts/report_test_preview.gif)|\n\n## 2. Monitoring Dashboard\n\nThe **Monitoring UI** service helps you visualize metrics and test results over time.\n\nYou can choose to:\n* Self-host the open-source version. [Live demo](https://demo.evidentlyai.com).\n* Sign up for [Evidently Cloud](https://www.evidentlyai.com/register) (Recommended).\n\nEvidently Cloud offers a generous free tier and extra features like dataset and user management, alerting, and no-code evals. [Compare OSS vs Cloud](https://docs.evidentlyai.com/faq/oss_vs_cloud).\n\n| Dashboard |\n|--|\n|![Dashboard example](https://github.com/evidentlyai/docs/blob/eb1630cdd80d31d55921ff4d34fc7b5e6e9c9f90/images/dashboard_llm_tabs.gif)|\n\n# :woman_technologist: Install Evidently\n\nTo install from PyPI:\n\n```sh\npip install evidently\n```\n\nTo install Evidently using the conda installer, run:\n\n```sh\nconda install -c conda-forge evidently\n```\n\n# :arrow_forward: Getting started\n\n## Reports\n\n### LLM evals\n\n\u003e This is a simple Hello World. 
Check the Tutorials for more: [LLM evaluation](https://docs.evidentlyai.com/quickstart_llm).\n\nImport the necessary components:\n\n```python\nimport pandas as pd\nfrom evidently import Report\nfrom evidently import Dataset, DataDefinition\nfrom evidently.descriptors import Sentiment, TextLength, Contains\nfrom evidently.presets import TextEvals\n```\n\nCreate a toy dataset with questions and answers.\n\n```python\neval_df = pd.DataFrame([\n    [\"What is the capital of Japan?\", \"The capital of Japan is Tokyo.\"],\n    [\"Who painted the Mona Lisa?\", \"Leonardo da Vinci.\"],\n    [\"Can you write an essay?\", \"I'm sorry, but I can't assist with homework.\"]],\n                       columns=[\"question\", \"answer\"])\n```\n\nCreate an Evidently Dataset object and add `descriptors`: row-level evaluators. We'll check the sentiment of each response, its length, and whether it contains words indicative of denial.\n\n```python\neval_dataset = Dataset.from_pandas(\n    eval_df,\n    data_definition=DataDefinition(),\n    descriptors=[\n        Sentiment(\"answer\", alias=\"Sentiment\"),\n        TextLength(\"answer\", alias=\"Length\"),\n        Contains(\"answer\", items=[\"sorry\", \"apologize\"], mode=\"any\", alias=\"Denials\")\n    ])\n```\n\nYou can view the dataframe with the added scores:\n\n```python\neval_dataset.as_dataframe()\n```\n\nTo get a summary Report showing the distribution of scores:\n\n```python\nreport = Report([\n    TextEvals()\n])\n\nmy_eval = report.run(eval_dataset)\nmy_eval\n# my_eval.json()\n# my_eval.dict()\n```\n\nYou can also choose other evaluators, including LLM-as-a-judge, and configure pass/fail conditions.\n\n### Data and ML evals\n\n\u003e This is a simple Hello World. 
Check the Tutorials for more: [Tabular data](https://docs.evidentlyai.com/quickstart_ml).\n\nImport the Report, an evaluation Preset, and a toy tabular dataset.\n\n```python\nimport pandas as pd\nfrom sklearn import datasets\n\nfrom evidently import Report\nfrom evidently.presets import DataDriftPreset\n\niris_data = datasets.load_iris(as_frame=True)\niris_frame = iris_data.frame\n```\n\nRun the **Data Drift** evaluation preset, which tests for shifts in column distributions. Take the first 60 rows of the dataframe as \"current\" data and the rest as reference. Get the output in a Jupyter notebook:\n\n```python\nreport = Report([\n    DataDriftPreset(method=\"psi\")\n], include_tests=True)\nmy_eval = report.run(iris_frame.iloc[:60], iris_frame.iloc[60:])\nmy_eval\n```\n\nYou can also save an HTML file. You'll need to open it from the destination folder.\n\n```python\nmy_eval.save_html(\"file.html\")\n```\n\nTo get the output as JSON or a Python dictionary:\n\n```python\nmy_eval.json()\n# my_eval.dict()\n```\n\nYou can choose other Presets, create Reports from individual Metrics, and configure pass/fail conditions.\n\n## Monitoring dashboard\n\n\u003e This launches a demo project in the locally hosted Evidently UI. Sign up for [Evidently Cloud](https://docs.evidentlyai.com/docs/setup/cloud) to instantly get a managed version with additional features.\n\nRecommended step: create a virtual environment and activate it.\n\n```\npip install virtualenv\nvirtualenv venv\nsource venv/bin/activate\n```\n\nAfter installing Evidently (`pip install evidently`), run the Evidently UI with the demo projects:\n\n```\nevidently ui --demo-projects all\n```\n\nVisit **localhost:8000** to access the UI.\n\n# 🚦 What can you evaluate?\n\nEvidently has 100+ built-in evals. 
You can also add custom ones.\n\nHere are examples of things you can check:\n\n|                           |                          |\n|:-------------------------:|:------------------------:|\n| **🔡 Text descriptors**   | **📝 LLM outputs**       |\n| Length, sentiment, toxicity, language, special symbols, regular expression matches, etc. | Semantic similarity, retrieval relevance, summarization quality, etc. with model- and LLM-based evals. |\n| **🛢 Data quality**       | **📊 Data distribution drift** |\n| Missing values, duplicates, min-max ranges, new categorical values, correlations, etc. | 20+ statistical tests and distance metrics to compare shifts in data distribution. |\n| **🎯 Classification**     | **📈 Regression**        |\n| Accuracy, precision, recall, ROC AUC, confusion matrix, bias, etc. | MAE, ME, RMSE, error distribution, error normality, error bias, etc. |\n| **🗂 Ranking (inc. RAG)** | **🛒 Recommendations**   |\n| NDCG, MAP, MRR, Hit Rate, etc. | Serendipity, novelty, diversity, popularity bias, etc. |\n\n\n# :computer: Contributions\nWe welcome contributions! 
Read the [Guide](CONTRIBUTING.md) to learn more.\n\n# :books: Documentation\nFor more examples, refer to the complete \u003ca href=\"https://docs.evidentlyai.com\"\u003eDocumentation\u003c/a\u003e.\n\n# :white_check_mark: Discord Community\nIf you want to chat and connect, join our [Discord community](https://discord.gg/xZjKRaNp8b)!\n","funding_links":[],"categories":["The List of AI Testing Tools","Jupyter Notebook","📡 Monitoring \u0026 Observability","🎯 Tool Categories","Data Validation","Model Serving and Monitoring","Observability","Tools","Python","Evaluation and Monitoring","LLM Applications","Traditional Data","其他_机器学习与深度学习","🚀 MLOps","交互式小部件和可视化","\u003ca id=\"tools\"\u003e\u003c/a\u003e🛠️ Tools","Visual Analysis and Debugging","Model Monitoring","Visualization","LLMOps","Orchestration","Interactive Widgets \u0026 Visualization","Deployment","Quality Assurance","LLM Testing / Monitoring","Observability and Monitoring","Observability \u0026 Monitoring","Table of Contents"],"sub_categories":["19. Evidently AI","🔍 ML Observability Stack","Synthetic Data","Drift","Tools \u0026 Projects","General-Purpose Machine Learning","Tools","Model Lifecycle","Observability","Application Framework","NLP","Ranking/Recommender","Resources","Frameworks and Libraries"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fevidentlyai%2Fevidently","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fevidentlyai%2Fevidently","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fevidentlyai%2Fevidently/lists"}