https://github.com/explodinggradients/ragas

Supercharge Your LLM Application Evaluations 🚀

Host: GitHub
URL: https://github.com/explodinggradients/ragas
Owner: explodinggradients
License: apache-2.0
Created: 2023-05-08T17:48:04.000Z (about 2 years ago)
Default Branch: main
Last Pushed: 2025-03-27T18:59:05.000Z (4 months ago)
Last Synced: 2025-03-28T10:09:42.336Z (4 months ago)
Topics: evaluation, llm, llmops
Language: Python
Homepage: https://docs.ragas.io
Size: 40.4 MB
Stars: 8,604
Watchers: 42
Forks: 873
Open Issues: 338
Metadata Files:
- Readme: README.md
- License: LICENSE

Awesome Lists containing this project

awesome-gpt - https://github.com/explodinggradients/ragas
awesome-llm - Ragas
awesome-ml-python-packages - ragas
Awesome-LLMs-Datasets - https://github.com/explodinggradients/ragas
ai-game-devtools - Ragas
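StarryDivineSky - explodinggradients/ragas - Measures the factual consistency of the answer with the context, given the question. Context_precision - measures how relevant the retrieved context is to the question, reflecting the quality of the retrieval pipeline. Answer_relevancy - measures how relevant the answer is to the question. Context_recall - measures the retriever's ability to retrieve all the information needed to answer the question. (A01_Text Generation_Text Dialogue / Large Language Dialogue Models and Data)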
Awesome-LLM - Ragas - a framework that helps you evaluate your Retrieval Augmented Generation (RAG) pipelines. (LLM Evaluation:)
awesome-production-machine-learning - Ragas - Ragas is a framework to evaluate RAG pipelines. (Evaluation and Monitoring)
awesome-LLM-resources - ragas
awesome-ai-papers - ragas (NLP / 3. Pretraining)
alan_awesome_llm - ragas
Awesome-LLMOps - ragas (Runtime / Evaluation)
awesome-safety-critical-ai - `explodinggradients/ragas` - Objective metrics, intelligent test generation, and data-driven insights for LLM apps (🛠️ Tools / Model Testing & Validation)

README

Objective metrics, intelligent test generation, and data-driven insights for LLM apps

Ragas is your ultimate toolkit for evaluating and optimizing Large Language Model (LLM) applications. Say goodbye to time-consuming, subjective assessments and hello to data-driven, efficient evaluation workflows.

Don't have a test dataset ready? We also do production-aligned test set generation.

## Key Features

- 🎯 Objective Metrics: Evaluate your LLM applications with precision using both LLM-based and traditional metrics.

- 🧪 Test Data Generation: Automatically create comprehensive test datasets covering a wide range of scenarios.

- 🔗 Seamless Integrations: Works flawlessly with popular LLM frameworks like LangChain and major observability tools.

- 📊 Build feedback loops: Leverage production data to continually improve your LLM applications.

## :shield: Installation

PyPI:

```bash
pip install ragas
```

Alternatively, from source:

```bash
pip install git+https://github.com/explodinggradients/ragas
```

## :fire: Quickstart

### Evaluate your RAG with Ragas metrics

The core of an evaluation is a few lines:

```python
from ragas import evaluate
from ragas.metrics import LLMContextRecall, Faithfulness, FactualCorrectness
from langchain_openai.chat_models import ChatOpenAI
from ragas.llms import LangchainLLMWrapper

# LLM that acts as the judge for the LLM-based metrics
evaluator_llm = LangchainLLMWrapper(ChatOpenAI(model="gpt-4o"))
metrics = [LLMContextRecall(), FactualCorrectness(), Faithfulness()]
# eval_dataset is your evaluation dataset, prepared beforehand (see the full quickstart)
results = evaluate(dataset=eval_dataset, metrics=metrics, llm=evaluator_llm)
```

Find the complete RAG Evaluation Quickstart here: [https://docs.ragas.io/en/latest/getstarted/rag_evaluation/](https://docs.ragas.io/en/latest/getstarted/rag_evaluation/)
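
The quickstart snippet uses an `eval_dataset` that it does not define. As a rough sketch of what such a dataset might contain, the records below use the same field names as the columns in the results preview (`user_input`, `retrieved_contexts`, `response`, `reference`). The sample record, the `REQUIRED_FIELDS` tuple, and the helpers `missing_fields` and `to_eval_dataset` are hypothetical names introduced for illustration. The call to `EvaluationDataset.from_list` is assumed to be available in recent Ragas releases, so check it against your installed version.

```python
# Illustrative records only; field names follow the columns of the results preview.
REQUIRED_FIELDS = ("user_input", "retrieved_contexts", "response", "reference")

records = [
    {
        "user_input": "What does Ragas evaluate?",
        "retrieved_contexts": ["Ragas is a toolkit for evaluating LLM applications."],
        "response": "Ragas evaluates LLM applications such as RAG pipelines.",
        "reference": "Ragas is a toolkit for evaluating and optimizing LLM applications.",
    },
]


def missing_fields(record):
    """Return the required fields that are absent from one record."""
    return [name for name in REQUIRED_FIELDS if name not in record]


def to_eval_dataset(rows):
    """Wrap validated rows in a Ragas dataset (needs `pip install ragas`)."""
    # Imported lazily so the validation above also runs without ragas installed.
    from ragas import EvaluationDataset  # assumed API, see the note above

    return EvaluationDataset.from_list(rows)


# Map row index -> missing field names; an empty dict means every record is complete.
problems = {i: missing_fields(r) for i, r in enumerate(records) if missing_fields(r)}
print(problems)
```

Once the check comes back empty, `to_eval_dataset(records)` would produce the object to pass as `dataset=` to `evaluate`.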

Preview of the results:

| user_input | retrieved_contexts | response | reference | context_recall | factual_correctness | faithfulness |
|------------|---------------------|----------|-----------|-----------------|---------------------|---------------|
| What are the global implications of the USA Supreme Court ruling on abortion? | "- In 2022, the USA Supreme Court ... - The ruling has created a chilling effect ..." | The global implications ... Here are some potential implications: | The global implications ... Additionally, the ruling has had an impact beyond national borders ... | 1 | 0.47 | 0.516129 |
| Which companies are the main contributors to GHG emissions ... ? | "- Fossil fuel companies ... - Between 2010 and 2020, human mortality ..." | According to the Carbon Majors database ... Here are the top contributors: | According to the Carbon Majors database ... Additionally, between 2010 and 2020, human mortality ... | 1 | 0.11 | 0.172414 |
| Which private companies in the Americas are the largest GHG emitters ... ? | "The private companies responsible ... The largest emitter amongst state-owned companies ..." | According to the Carbon Majors database, the largest private companies ... | The largest private companies in the Americas ... | 1 | 0.26 | 0 |
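
Per-row scores like those in the preview are often easier to compare once they are averaged per metric. The sketch below does that in plain Python for the three preview rows. It is not Ragas's own aggregation, and `mean_scores` is a hypothetical helper introduced for illustration.

```python
from statistics import fmean

# Scores copied from the three preview rows in this README.
rows = [
    {"context_recall": 1, "factual_correctness": 0.47, "faithfulness": 0.516129},
    {"context_recall": 1, "factual_correctness": 0.11, "faithfulness": 0.172414},
    {"context_recall": 1, "factual_correctness": 0.26, "faithfulness": 0},
]


def mean_scores(score_rows):
    """Average each metric column across all rows."""
    metric_names = score_rows[0].keys()
    return {name: fmean(row[name] for row in score_rows) for name in metric_names}


summary = mean_scores(rows)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

For these rows the averages come out to about 1.00 for context recall, 0.28 for factual correctness, and 0.23 for faithfulness, which suggests that retrieval is finding the needed information while the generated answers drift from it.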

### Generate a test dataset for comprehensive RAG evaluation

What if you don't yet have data on the questions users ask when they interact with your RAG system?

Ragas can help with [synthetic test set generation](https://docs.ragas.io/en/latest/getstarted/rag_testset_generation/): you seed it with your own data and control the difficulty, variety, and complexity of the generated questions.

## 🫂 Community

If you want to get more involved with Ragas, check out our [Discord server](https://discord.gg/5qGUJ6mh7C). It's a fun community where we geek out about LLMs, retrieval, production issues, and more.

## Contributors

```yml
+----------------------------------------------------------------------------+
|     +----------------------------------------------------------------+     |
|     | Developers: Those who built with `ragas`.                      |     |
|     | (You have `import ragas` somewhere in your project)            |     |
|     |     +----------------------------------------------------+     |     |
|     |     | Contributors: Those who make `ragas` better.       |     |     |
|     |     | (You make PR to this repo)                         |     |     |
|     |     +----------------------------------------------------+     |     |
|     +----------------------------------------------------------------+     |
+----------------------------------------------------------------------------+
```

We welcome contributions from the community! Whether it's bug fixes, feature additions, or documentation improvements, your input is valuable.

1. Fork the repository
2. Create your feature branch (`git checkout -b feature/AmazingFeature`)
3. Commit your changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request

## 🔍 Open Analytics

At Ragas, we believe in transparency. We collect minimal, anonymized usage data to improve our product and guide our development efforts.

✅ No personal or company-identifying information

✅ Open-source data collection [code](./src/ragas/_analytics.py)

✅ Publicly available aggregated [data](https://github.com/explodinggradients/ragas/issues/49)

To opt-out, set the `RAGAS_DO_NOT_TRACK` environment variable to `true`.
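
If you prefer to opt out from inside a script rather than from the shell, one option is to set the variable through `os.environ`. This is a minimal sketch; it assumes the variable has to be in place before Ragas is imported or used, which is the conservative choice because this README does not say when the variable is read.

```python
import os

# Set the opt-out flag first; do this before any `import ragas` in the same process
# (an assumption about when the variable is read, see the note above).
os.environ["RAGAS_DO_NOT_TRACK"] = "true"

# Sanity check that the flag is visible to code running in this process.
tracking_disabled = os.environ.get("RAGAS_DO_NOT_TRACK", "").lower() == "true"
print(tracking_disabled)
```

Setting the variable in the shell or in your deployment configuration has the same effect and covers every process that inherits that environment.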
