https://github.com/RUCAIBox/R1-Searcher

R1-searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning



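StarryDivineSky - RUCAIBox/R1-Searcher - R1-Searcher is a project that uses reinforcement learning to incentivize the search capability of large language models (LLMs). It aims to improve LLM performance on tasks that require retrieving external knowledge. The core idea is to train the LLM to use a search engine more effectively so that it obtains more accurate and more complete information. Specifically, R1-Searcher uses reinforcement learning to reward the LLM for generating high-quality search queries, and it adjusts the LLM's behavior according to the quality of the search results. According to this listing, the "R1" in the project name stands for the "retrieval first" principle. The project provides a framework into which different LLMs and search engines can be integrated. The listing states that this approach can significantly improve the accuracy and reliability of LLMs on knowledge-intensive tasks, and that it offers a useful tool and method for studying how to strengthen the external knowledge acquisition of LLMs. The project code and related resources are available in the GitHub repository RUCAIBox/R1-Searcher. (A01_Text Generation_Text Dialogue / Large Language Dialogue Models and Data)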
awesome-hacking-lists - RUCAIBox/R1-Searcher - R1-searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning (Python)
