
https://github.com/evilfreelancer/rrr-benchmark

A benchmark for evaluating large language models (LLMs) on Russian dialogue routing tasks. Given a message history and a list of available routes, each model must select the correct route ID and explain its reasoning.



Awesome Lists containing this project