https://github.com/kbeaugrand/KernelMemory.Evaluation

This repository contains the code for the evaluation of the Knowledge Management (KM) system. The evaluation is based on the following metrics:

Host: GitHub
URL: https://github.com/kbeaugrand/KernelMemory.Evaluation
Owner: kbeaugrand
License: mit
Created: 2024-11-23T11:17:25.000Z (8 months ago)
Default Branch: main
Last Pushed: 2024-12-26T08:45:42.000Z (7 months ago)
Last Synced: 2024-12-26T09:30:46.735Z (7 months ago)
Language: C#
Size: 69.3 KB
Stars: 3
Watchers: 1
Forks: 1
Open Issues: 9
Metadata Files:
- Readme: README.md
- License: LICENSE.txt
- Codeowners: .github/CODEOWNERS

Awesome Lists containing this project

awesome-semantickernel - KernelMemory.Evaluation


## KM Evaluation

[![Build & Test](https://github.com/kbeaugrand/KernelMemory.Evaluation/actions/workflows/build_tests.yml/badge.svg)](https://github.com/kbeaugrand/KernelMemory.Evaluation/actions/workflows/build_test.yml)

[![Create Release](https://github.com/kbeaugrand/KernelMemory.Evaluation/actions/workflows/publish.yml/badge.svg)](https://github.com/kbeaugrand/KernelMemory.Evaluation/actions/workflows/publish.yml)

[![Version](https://img.shields.io/github/v/release/kbeaugrand/KernelMemory.Evaluation)](https://img.shields.io/github/v/release/kbeaugrand/KernelMemory.Evaluation)

[![License](https://img.shields.io/github/license/kbeaugrand/KernelMemory.Evaluation)](https://img.shields.io/github/v/release/kbeaugrand/KernelMemory.Evaluation)

This repository contains the code for the evaluation of the Knowledge Management (KM) system. The evaluation is based on the following metrics:

- **Faithfulness**: Ensuring the generated text accurately represents the source information.

- **Answer Relevancy**: Assessing the pertinence of the answer in relation to the query.

- **Context Recall**: Measuring the proportion of relevant context retrieved.

- **Context Precision**: Evaluating the accuracy of the retrieved context.

- **Context Relevancy**: Determining the relevance of the provided context to the query.

- **Context Entity Recall**: Checking the retrieval of key entities within the context.

- **Answer Semantic Similarity**: Comparing the semantic similarity between the generated answer and the expected answer.

- **Answer Correctness**: Verifying the factual correctness of the generated answers.
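
As an illustration of the last-but-one metric: RAGAS, which this project credits, describes answer semantic similarity as the cosine similarity between the embeddings of the generated answer and the expected answer. The sketch below shows only that final step and assumes both embeddings are already available as `float[]` arrays. `SimilaritySketch` and `CosineSimilarity` are hypothetical names chosen for this example, not part of this library's API.

```csharp
using System;

public static class SimilaritySketch
{
    // Cosine similarity between two embedding vectors of equal length.
    // Hypothetical helper for illustration; not taken from KernelMemory.Evaluation.
    public static double CosineSimilarity(float[] a, float[] b)
    {
        if (a.Length == 0 || a.Length != b.Length)
        {
            throw new ArgumentException("Vectors must be non-empty and of equal length.");
        }

        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += (double)a[i] * b[i];
            normA += (double)a[i] * a[i];
            normB += (double)b[i] * b[i];
        }

        // Convention chosen for this sketch: a zero vector is similar to nothing.
        if (normA == 0 || normB == 0)
        {
            return 0;
        }

        return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
    }
}
```

Identical directions score 1 and orthogonal vectors score 0, so a higher value means the generated answer is semantically closer to the expected one.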

## Usage

### Test set generation

To evaluate the KM, you must first create a test set containing the queries and the expected answers. 

Because this is a manual process, it can be tedious for large datasets.

To help you with this task, we provide a generator that creates a test set from a given KM memory and index. 

```csharp
using Microsoft.KernelMemory.Evaluation;

// `memoryBuilder` and `kernel` are assumed to be configured elsewhere in your application.
var testSetGenerator = new TestSetGeneratorBuilder(memoryBuilder.Services)
                            .AddEvaluatorKernel(kernel)
                            .Build();

// Weights per question type; the four values sum to 1.
var distribution = new Distribution
{
    Simple = .5f,
    Reasoning = .16f,
    MultiContext = .17f,
    Conditioning = .17f
};

var testSet = testSetGenerator.GenerateTestSetsAsync(index: "default", count: 10, retryCount: 3, distribution: distribution);

// Stream the generated test cases and print each question.
await foreach (var test in testSet)
{
    Console.WriteLine(test.Question);
}
```
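
The weights in the example above add up to 1. This README does not say whether the library validates that, so a small pre-check can be useful before starting a long generation run. The following is a minimal sketch under that assumption; `DistributionSketch` and `IsNormalized` are hypothetical names, and the helper takes plain floats rather than the library's `Distribution` type.

```csharp
using System;

public static class DistributionSketch
{
    // Returns true when all four weights are non-negative and sum to 1
    // within a tolerance (float literals such as .16f rarely sum exactly).
    // Hypothetical helper for illustration; not part of KernelMemory.Evaluation.
    public static bool IsNormalized(
        float simple,
        float reasoning,
        float multiContext,
        float conditioning,
        float tolerance = 1e-4f)
    {
        if (simple < 0 || reasoning < 0 || multiContext < 0 || conditioning < 0)
        {
            return false;
        }

        float sum = simple + reasoning + multiContext + conditioning;
        return Math.Abs(sum - 1f) <= tolerance;
    }
}
```

For the values shown earlier, `DistributionSketch.IsNormalized(.5f, .16f, .17f, .17f)` passes the check.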

### Evaluation

To evaluate the KM, you can use the following code:

```csharp
// Reuses `kernel`, `memoryBuilder`, and `testSet` from the test set generation example.
var evaluation = new TestSetEvaluatorBuilder()
                            .AddEvaluatorKernel(kernel)
                            .WithMemory(memoryBuilder.Build())
                            .Build();

var results = evaluation.EvaluateTestSetAsync(index: "default", await testSet.ToArrayAsync());

// Print two of the computed metrics for each evaluated test case.
await foreach (var result in results)
{
    Console.WriteLine($"Faithfulness: {result.Metrics.Faithfulness}, ContextRecall: {result.Metrics.ContextRecall}");
}
```
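
Per-question scores are often easier to compare between runs once they are averaged. The sketch below shows one way to do that for the two metrics printed above. It assumes the scores are floats and copies them into a local record first; `MetricSample` and `SummarySketch` are hypothetical names for this example, not the library's result types.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the two scores printed above;
// not the result type returned by KernelMemory.Evaluation.
public record MetricSample(float Faithfulness, float ContextRecall);

public static class SummarySketch
{
    // Computes the arithmetic mean of each metric across all samples.
    public static MetricSample Average(IReadOnlyCollection<MetricSample> samples)
    {
        if (samples.Count == 0)
        {
            throw new ArgumentException("At least one sample is required.", nameof(samples));
        }

        return new MetricSample(
            samples.Average(s => s.Faithfulness),
            samples.Average(s => s.ContextRecall));
    }
}
```

In practice you would add one `MetricSample` per result inside the `await foreach` loop and call `SummarySketch.Average` once the loop completes.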

## Credits

This project is an implementation of [RAGAS: Evaluation framework for your Retrieval Augmented Generation (RAG) pipelines](https://github.com/explodinggradients/ragas?tab=readme-ov-file).

## License

This project is licensed under the [MIT License](LICENSE.txt).
