https://github.com/evidentlyai/evidently
Evidently is an open-source ML and LLM observability framework. Evaluate, test, and monitor any AI-powered system or data pipeline. From tabular data to Gen AI. 100+ metrics.
- Host: GitHub
- URL: https://github.com/evidentlyai/evidently
- Owner: evidentlyai
- License: apache-2.0
- Created: 2020-11-25T15:20:08.000Z (over 4 years ago)
- Default Branch: main
- Last Pushed: 2025-04-16T22:47:44.000Z (21 days ago)
- Last Synced: 2025-04-17T09:27:06.103Z (21 days ago)
- Topics: data-drift, data-quality, data-science, data-validation, generative-ai, hacktoberfest, html-report, jupyter-notebook, llm, llmops, machine-learning, mlops, model-monitoring, pandas-dataframe
- Language: Jupyter Notebook
- Homepage: https://discord.gg/xZjKRaNp8b
- Size: 277 MB
- Stars: 6,050
- Watchers: 47
- Forks: 666
- Open Issues: 208
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
Awesome Lists containing this project
- awesome-jupyter - Evidently - Interactive reports to analyze machine learning models during validation or production monitoring. (Visualization)
- awesome-data-quality - evidently - analyze and track data and ML model output quality. (Table of Contents / Frameworks and Libraries)
- awesome-open-data-centric-ai - Evidently - open-source framework to evaluate, test and monitor ML models in production. (Observability and Monitoring)
- AwesomeResponsibleAI - Evidently
- awesome-production-machine-learning - Evidently - Evidently is an open-source framework to evaluate, test and monitor ML and LLM-powered systems. (Evaluation and Monitoring)
- Awesome-LLM - Evidently - open-source framework to evaluate, test and monitor ML and LLM-powered systems. (LLM Applications)
- StarryDivineSky - evidentlyai/evidently
- awesome-llmops - Evidently - open-source framework to evaluate, test and monitor ML and LLM-powered systems. (LLMOps / Observability)
- best-of-jupyter - evidentlyai/evidently (Interactive Widgets & Visualization)
- jimsghstars - evidentlyai/evidently - Evidently is an open-source ML and LLM observability framework. Evaluate, test, and monitor any AI-powered system or data pipeline. From tabular data to Gen AI. 100+ metrics. (Jupyter Notebook)
- awesome-jupyter-resources - evidentlyai/evidently (Interactive Widgets & Visualization)
- awesome-safety-critical-ai - `evidentlyai/evidently` - open-source ML and LLM observability framework (🛠️ Tools / Model Lifecycle)
README
Evidently
An open-source framework to evaluate, test and monitor ML and LLM-powered systems.

Documentation | Discord Community | Blog | Evidently Cloud

# :bar_chart: What is Evidently?
Evidently is an open-source Python library to evaluate, test, and monitor ML and LLM systems—from experiments to production.
* 🔡 Works with tabular and text data.
* ✨ Supports evals for predictive and generative tasks, from classification to RAG.
* 📚 100+ built-in metrics from data drift detection to LLM judges.
* 🛠️ Python interface for custom metrics.
* 🚦 Both offline evals and live monitoring.
* 💻 Open architecture: easily export data and integrate with existing tools.

Evidently is very modular. You can start with one-off evaluations or host a full monitoring service.
## 1. Reports and Test Suites
**Reports** compute and summarize various data, ML and LLM quality evals.
* Start with Presets and built-in metrics or customize.
* Best for experiments, exploratory analysis and debugging.
* View interactive Reports in Python, export them as JSON, a Python dictionary, or HTML, or view them in the monitoring UI.

Turn any Report into a **Test Suite** by adding pass/fail conditions.
* Best for regression testing, CI/CD checks, or data validation.
* Zero setup option: auto-generate test conditions from the reference dataset.
* Simple syntax to set test conditions, such as `gt` (greater than), `lt` (less than), etc. See the sketch after this list.
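For example, here is a minimal sketch of the zero-setup option, reusing the tabular `DataDriftPreset` example from the Getting Started section below; passing `include_tests=True` auto-generates the pass/fail conditions from the reference dataset:

```python
from sklearn import datasets

from evidently import Report
from evidently.presets import DataDriftPreset

# Toy data, as in the tabular Getting Started example below.
iris_frame = datasets.load_iris(as_frame=True).frame

# include_tests=True auto-generates pass/fail conditions from the reference data.
report = Report([DataDriftPreset(method="psi")], include_tests=True)
my_eval = report.run(iris_frame.iloc[:60], iris_frame.iloc[60:])  # current, reference
my_eval  # each metric result now carries a pass/fail test outcome
```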
## 2. Monitoring Dashboard
**Monitoring UI** service helps visualize metrics and test results over time.
You can choose:
* Self-host the open-source version. [Live demo](https://demo.evidentlyai.com).
* Sign up for [Evidently Cloud](https://www.evidentlyai.com/register) (Recommended).

Evidently Cloud offers a generous free tier and extra features like dataset and user management, alerting, and no-code evals. [Compare OSS vs Cloud](https://docs.evidentlyai.com/faq/oss_vs_cloud).
# :woman_technologist: Install Evidently
To install from PyPI:
```sh
pip install evidently
```
To install Evidently using conda, run:

```sh
conda install -c conda-forge evidently
```

# :arrow_forward: Getting started
## Reports
### LLM evals
> This is a simple Hello World. Check the Tutorials for more: [LLM evaluation](https://docs.evidentlyai.com/quickstart_llm).
Import the necessary components:
```python
import pandas as pd
from evidently import Report
from evidently import Dataset, DataDefinition
from evidently.descriptors import Sentiment, TextLength, Contains
from evidently.presets import TextEvals
```

Create a toy dataset with questions and answers.
```python
eval_df = pd.DataFrame(
    [
        ["What is the capital of Japan?", "The capital of Japan is Tokyo."],
        ["Who painted the Mona Lisa?", "Leonardo da Vinci."],
        ["Can you write an essay?", "I'm sorry, but I can't assist with homework."],
    ],
    columns=["question", "answer"],
)
```

Create an Evidently Dataset object and add `descriptors`: row-level evaluators. We'll check the sentiment of each response, its length, and whether it contains words indicative of denial.
```python
eval_dataset = Dataset.from_pandas(
    pd.DataFrame(eval_df),
    data_definition=DataDefinition(),
    descriptors=[
        Sentiment("answer", alias="Sentiment"),
        TextLength("answer", alias="Length"),
        Contains("answer", items=["sorry", "apologize"], mode="any", alias="Denials"),
    ],
)
```

You can view the dataframe with the added scores:
```python
eval_dataset.as_dataframe()
```

To get a summary Report to see the distribution of scores:
```python
report = Report([
    TextEvals()
])

my_eval = report.run(eval_dataset)
my_eval
# my_eval.json()
# my_eval.dict()
```
You can also choose other evaluators, including LLM-as-a-judge, and configure pass/fail conditions.

### Data and ML evals
> This is a simple Hello World. Check the Tutorials for more: [Tabular data](https://docs.evidentlyai.com/quickstart_ml).
Import the Report, the evaluation Preset, and a toy tabular dataset.
```python
import pandas as pd
from sklearn import datasets

from evidently import Report
from evidently.presets import DataDriftPreset

iris_data = datasets.load_iris(as_frame=True)
iris_frame = iris_data.frame
```

Run the **Data Drift** evaluation preset that tests for shifts in column distributions. Take the first 60 rows of the dataframe as "current" data and the remaining rows as reference. Get the output in a Jupyter notebook:
```python
report = Report(
    [DataDriftPreset(method="psi")],
    include_tests=True,
)
my_eval = report.run(iris_frame.iloc[:60], iris_frame.iloc[60:])
my_eval
```

You can also save an HTML file. You'll need to open it from the destination folder.
```python
my_eval.save_html("file.html")
```

To get the output as JSON or Python dictionary:
```python
my_eval.json()
# my_eval.dict()
```
You can choose other Presets, create Reports from individual Metrics, and configure pass/fail conditions, as in the sketch below.
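As a hedged sketch of a Report built from individual Metrics with an explicit condition, reusing `iris_frame` from the example above: the `DriftedColumnsCount` and `MissingValueCount` metrics and the `lt` condition helper are assumptions based on recent Evidently versions and are not shown elsewhere in this README; check the documentation for the exact names available in your release.

```python
# Hedged sketch (names assumed, see note above): individual Metrics with an
# explicit pass/fail condition instead of a Preset.
from evidently import Report
from evidently.metrics import DriftedColumnsCount, MissingValueCount  # assumed metric names
from evidently.tests import lt  # assumed condition helper

report = Report([
    DriftedColumnsCount(tests=[lt(3)]),             # fail if 3 or more columns drift
    MissingValueCount(column="sepal length (cm)"),  # no condition: report-only
])
my_eval = report.run(iris_frame.iloc[:60], iris_frame.iloc[60:])
my_eval
```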
## Monitoring dashboard
> This launches a demo project in the locally hosted Evidently UI. Sign up for [Evidently Cloud](https://docs.evidentlyai.com/docs/setup/cloud) to instantly get a managed version with additional features.
Recommended step: create a virtual environment and activate it.
```
pip install virtualenv
virtualenv venv
source venv/bin/activate
```

After installing Evidently (`pip install evidently`), run the Evidently UI with the demo projects:
```
evidently ui --demo-projects all
```

Visit **localhost:8000** to access the UI.
# 🚦 What can you evaluate?
Evidently has 100+ built-in evals. You can also add custom ones.
Here are examples of things you can check:
| | |
|:-------------------------:|:------------------------:|
| **🔡 Text descriptors** | **📝 LLM outputs** |
| Length, sentiment, toxicity, language, special symbols, regular expression matches, etc. | Semantic similarity, retrieval relevance, summarization quality, etc. with model- and LLM-based evals. |
| **🛢 Data quality** | **📊 Data distribution drift** |
| Missing values, duplicates, min-max ranges, new categorical values, correlations, etc. | 20+ statistical tests and distance metrics to compare shifts in data distribution. |
| **🎯 Classification** | **📈 Regression** |
| Accuracy, precision, recall, ROC AUC, confusion matrix, bias, etc. | MAE, ME, RMSE, error distribution, error normality, error bias, etc. |
| **🗂 Ranking (inc. RAG)** | **🛒 Recommendations** |
| NDCG, MAP, MRR, Hit Rate, etc. | Serendipity, novelty, diversity, popularity bias, etc. |

# :computer: Contributions
We welcome contributions! Read the [Guide](CONTRIBUTING.md) to learn more.

# :books: Documentation
For more examples, refer to the complete [Documentation](https://docs.evidentlyai.com).

# :white_check_mark: Discord Community
If you want to chat and connect, join our [Discord community](https://discord.gg/xZjKRaNp8b)!