Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/langchain-ai/langchain-benchmarks
🦜💯 Flex those feathers!
- Host: GitHub
- URL: https://github.com/langchain-ai/langchain-benchmarks
- Owner: langchain-ai
- License: MIT
- Created: 2023-08-10T21:31:11.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-10-21T20:47:16.000Z (2 months ago)
- Last Synced: 2024-12-13T17:23:34.435Z (8 days ago)
- Topics: benchmark-framework, benchmarking, langchain, langchain-python, llm, llms
- Language: Python
- Homepage: https://langchain-ai.github.io/langchain-benchmarks/
- Size: 16.2 MB
- Stars: 234
- Watchers: 8
- Forks: 47
- Open Issues: 16
- Metadata Files:
- Readme: README.md
- License: LICENSE
- Security: security.md
Awesome Lists containing this project
- StarryDivineSky - langchain-ai/langchain-benchmarks
README
# 🦜💯 LangChain Benchmarks
[![Release Notes](https://img.shields.io/github/release/langchain-ai/langchain-benchmarks)](https://github.com/langchain-ai/langchain-benchmarks/releases)
[![CI](https://github.com/langchain-ai/langchain-benchmarks/actions/workflows/ci.yml/badge.svg)](https://github.com/langchain-ai/langchain-benchmarks/actions/workflows/ci.yml)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Twitter](https://img.shields.io/twitter/url/https/twitter.com/langchainai.svg?style=social&label=Follow%20%40LangChainAI)](https://twitter.com/langchainai)
[![](https://dcbadge.vercel.app/api/server/6adMQxSpJS?compact=true&style=flat)](https://discord.gg/6adMQxSpJS)
[![Open Issues](https://img.shields.io/github/issues-raw/langchain-ai/langchain-benchmarks)](https://github.com/langchain-ai/langchain-benchmarks/issues)

[📖 Documentation](https://langchain-ai.github.io/langchain-benchmarks/index.html)
A package to help benchmark various LLM-related tasks.

The benchmarks are organized by end-to-end use cases and utilize [LangSmith](https://smith.langchain.com/) heavily.

We have several goals in open sourcing this:
- Showing how we collect our benchmark datasets for each task
- Showing which benchmark datasets we use for each task
- Showing how we evaluate each task
- Encouraging others to benchmark their solutions on these tasks (we are always looking for better ways of doing things!)

## Benchmarking Results
Read some of the articles about benchmarking results on our blog.
* [Agent Tool Use](https://blog.langchain.dev/benchmarking-agent-tool-use/)
* [Query Analysis in High Cardinality Situations](https://blog.langchain.dev/high-cardinality/)
* [RAG on Tables](https://blog.langchain.dev/benchmarking-rag-on-tables/)
* [Q&A over CSV data](https://blog.langchain.dev/benchmarking-question-answering-over-csv-data/)

### Tool Usage (2024-04-18)
See [tool usage docs](https://langchain-ai.github.io/langchain-benchmarks/notebooks/tool_usage/benchmark_all_tasks.html) to recreate!
![download](https://github.com/langchain-ai/langchain-benchmarks/assets/3205522/0da33de8-e03f-49cf-bd48-e9ff945828a9)
Explore Agent Traces on LangSmith:
* [Relational Data](https://smith.langchain.com/public/22721064-dcf6-4e42-be65-e7c46e6835e7/d)
* [Tool Usage (1-tool)](https://smith.langchain.com/public/ac23cb40-e392-471f-b129-a893a77b6f62/d)
* [Tool Usage (26-tools)](https://smith.langchain.com/public/366bddca-62b3-4b6e-849b-a478abab73db/d)
* [Multiverse Math](https://smith.langchain.com/public/983faff2-54b9-4875-9bf2-c16913e7d489/d)

## Installation
To install the packages, run the following command:
```bash
pip install -U langchain-benchmarks
```

All the benchmarks come with an associated benchmark dataset stored in [LangSmith](https://smith.langchain.com). To take advantage of the eval and debugging experience, [sign up](https://smith.langchain.com), and set your API key in your environment:
```bash
export LANGCHAIN_API_KEY=ls-...
```
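
With the package installed and the API key set, the typical workflow described in the docs is to pick a task from the built-in registry and clone its public dataset into your own LangSmith account before running evaluations. Below is a minimal sketch, assuming the documented `registry` and `clone_public_dataset` helpers; the task name is an example and the available tasks and attributes may vary between versions.

```python
import os

from langchain_benchmarks import clone_public_dataset, registry

# LangSmith credentials; normally set via `export LANGCHAIN_API_KEY=ls-...`.
os.environ.setdefault("LANGCHAIN_API_KEY", "ls-...")

# Look up a benchmark task by name (example name; see the docs for the
# current list of tasks exposed by the registry).
task = registry["Tool Usage - Typewriter (1 tool)"]
print(task.name)
print(task.description)

# Copy the task's public dataset into your own LangSmith account so you
# can run and inspect evaluations against it.
clone_public_dataset(task.dataset_id, dataset_name=task.name)
```

From there, the per-task notebooks in the documentation linked above walk through constructing the agent or chain under test and running the evaluation on the cloned dataset.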
## Repo Structure

The package is located within [langchain_benchmarks](./langchain_benchmarks/). Check out the [docs](https://langchain-ai.github.io/langchain-benchmarks/index.html) for information on how to get started.
The other directories are legacy and may be moved in the future.
## Archived
Below are archived benchmarks that require cloning this repo to run.
- [CSV Question Answering](https://github.com/langchain-ai/langchain-benchmarks/tree/main/archived/csv-qa)
- [Extraction](https://github.com/langchain-ai/langchain-benchmarks/tree/main/archived/extraction)
- [Q&A over the LangChain docs](https://github.com/langchain-ai/langchain-benchmarks/tree/main/archived/langchain-docs-benchmarking)
- [Meta-evaluation of 'correctness' evaluators](https://github.com/langchain-ai/langchain-benchmarks/tree/main/archived/meta-evals)

## Related
- For cookbooks on other ways to test, debug, monitor, and improve your LLM applications, check out the [LangSmith docs](https://docs.smith.langchain.com/)
- For information on building with LangChain, check out the [python documentation](https://python.langchain.com/docs/get_started/introduction) or [JS documentation](https://js.langchain.com/docs/get_started/introduction)