{"id":27818083,"url":"https://github.com/GAIR-NLP/DeepResearcher","last_synced_at":"2025-05-01T15:40:42.624Z","repository":{"id":285893478,"uuid":"959140268","full_name":"GAIR-NLP/DeepResearcher","owner":"GAIR-NLP","description":"Scaling Deep Research via Reinforcement Learning in Real-world Environments.","archived":false,"fork":false,"pushed_at":"2025-04-13T04:38:34.000Z","size":13436,"stargazers_count":184,"open_issues_count":6,"forks_count":17,"subscribers_count":4,"default_branch":"main","last_synced_at":"2025-04-13T05:27:21.715Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/GAIR-NLP.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-04-02T10:31:03.000Z","updated_at":"2025-04-13T04:38:37.000Z","dependencies_parsed_at":"2025-04-03T08:41:14.660Z","dependency_job_id":null,"html_url":"https://github.com/GAIR-NLP/DeepResearcher","commit_stats":null,"previous_names":["gair-nlp/deepresearcher"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/GAIR-NLP%2FDeepResearcher","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/GAIR-NLP%2FDeepResearcher/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/GAIR-NLP%2FDeepResearcher/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/GAIR-NLP%2FDeepResearcher/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/G
AIR-NLP","download_url":"https://codeload.github.com/GAIR-NLP/DeepResearcher/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":251901567,"owners_count":21662406,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-05-01T15:40:36.082Z","updated_at":"2025-05-01T15:40:42.617Z","avatar_url":"https://github.com/GAIR-NLP.png","language":"Python","funding_links":[],"categories":["A01_文本生成_文本对话","🤖 Deep Research Systems"],"sub_categories":["大语言对话模型及数据","🌐 Open-Source Deep Research Implementations"],"readme":"# DeepResearcher: Scaling Deep Research via Reinforcement Learning in Real-world Environments\n\nThis is the official repository for [DeepResearcher](https://arxiv.org/abs/2504.03160).\n## 📝 Introduction\n\nDeepResearcher is the first comprehensive framework for end-to-end training of LLM-based deep research agents through scaling reinforcement learning (RL) in real-world environments with authentic web search interactions. 
Our qualitative analysis reveals emergent **cognitive behaviors** from end-to-end RL training, including the ability to formulate plans, cross-validate information from multiple sources, engage in self-reflection to redirect research, and maintain honesty when unable to find definitive answers.\n\n\n\n\u003cp align=\"center\"\u003e\n    \u003cimg src=\"images/case_1.png\" id=\"framework-icon\" style=\"display:inline-block; width:46.55%; margin-right:5px;\"\u003e\n    \u003cimg src=\"images/case_2.png\" id=\"framework-icon\" style=\"display:inline-block; width:43.45%;\"\u003e\n\u003c/p\u003e\n\n\n## 📋 Table of Contents\n\n- [Introduction](#-introduction)\n- [Model](#-Model)\n- [Performance](#-performance)\n- [Get started](#-get-started)\n- [Acknowledgement](#-Acknowledgement)\n- [Citation](#✍️-citation)\n\n\n\n\n## 🤖 Model\nDeepResearcher is now available on the Hugging Face Hub:\n| Model Name | HF Checkpoint | Size |\n| ---------- | ------------- | :--: |\n| DeepResearcher-7b | [🤗 GAIR/DeepResearcher-7b](https://huggingface.co/GAIR/DeepResearcher-7b) | **7B** |\n\n\n## 🏆 Performance\n\nExtensive experiments on open-domain research tasks demonstrate that DeepResearcher achieves substantial improvements of up to 28.9 points over prompt engineering-based baselines and up to 7.2 points over RAG-based RL agents. Our qualitative analysis reveals emergent cognitive behaviors from end-to-end RL training, including the ability to formulate plans, cross-validate information from multiple sources, engage in self-reflection to redirect research, and maintain honesty when unable to find definitive answers. 
Our results highlight that end-to-end training in real-world web environments is not merely an implementation detail but a fundamental requirement for developing robust research capabilities aligned with real-world applications.\n\n\u003cp align=\"center\"\u003e \u003cimg src=\"images/performance.png\" id=\"performance-icon\"\u003e       \u003c/p\u003e\n\n\u003cp align=\"center\"\u003e \u003cimg src=\"images/scaling.png\" id=\"performance-icon\"\u003e       \u003c/p\u003e\n\n\n## 🚀 Get Started\n\n### Package Installation\n\nTo use this repo, first install the required dependencies by running the following commands:\n\n```bash\ngit clone https://github.com/GAIR-NLP/DeepResearcher.git \nconda create -n deepresearcher python=3.10 \nconda activate deepresearcher\ncd DeepResearcher\npip3 install torch==2.4.0 --index-url https://download.pytorch.org/whl/cu124\npip3 install flash-attn --no-build-isolation\npip3 install -e .\npip3 install -r requirements.txt\n```\n\n### Start Ray before training and inference\nWe use Ray to train the model. Before starting Ray, you must set ```PET_NODE_RANK``` (**this is compulsory even if you have only 1 node**).\nOn the head node, run:\n```bash\nexport PET_NODE_RANK=0\nray start --head\n```\n\n### Run backend handler\n\nFollow these steps to launch the server handler:\n1. Set ```serper_api_key``` or ```azure_bing_search_subscription_key```, and ```search_engine```, in ```./scrl/handler/config.yaml```\n2. Add your ```qwen-plus``` API key in ```./scrl/handler/server_handler.py```\n```python\nclient = OpenAI(\n    api_key=\"sk-xxx\",\n    base_url=\"xxxx\"\n)\n```\n3. 
Start the server handler:\n```bash\n python ./scrl/handler/server_handler.py\n```\n\nAfter launching all server handlers, update ```server_url_list``` in ```./scrl/handler/config.yaml``` on your training host node, then run:\n```bash\n python ./scrl/handler/handler.py\n```\n### Training model\n\nUse the following command to train the model:\n```bash\n bash train_grpo.sh\n```\n\n### Evaluate\nUse the following command to generate rollouts:\n```bash\n bash evaluate.sh\n```\nThe rollout file is written to: ```./outputs/{project_name}/{experiment_name}/rollout/rollout_step_0.json```\nCopy it to ```./evaluate/{experiment_name}_result.json``` (renaming it in the process).\n\nThen, run the following command:\n```bash\n python ./evaluate/cacluate_metrics.py {experiment_name}\n```\nYou can check the score in ```./evaluate/{experiment_name}_score.json```\n\n## 🙏 Acknowledgement\n\nDeepResearcher is inspired by [DeepSeek-R1](https://github.com/deepseek-ai/DeepSeek-R1), and its implementation is based on [veRL](https://github.com/volcengine/verl) and [Search-R1](https://github.com/PeterGriffinJin/Search-R1). We deeply appreciate the contributions of these teams to open-source research and development. 
\n\n## ✍️ Citation\n\nPlease cite this work if the model, code, or conclusions in this repo are helpful to you.\n```\n@misc{zheng2025deepresearcherscalingdeepresearch,\n      title={DeepResearcher: Scaling Deep Research via Reinforcement Learning in Real-world Environments}, \n      author={Yuxiang Zheng and Dayuan Fu and Xiangkun Hu and Xiaojie Cai and Lyumanshan Ye and Pengrui Lu and Pengfei Liu},\n      year={2025},\n      eprint={2504.03160},\n      archivePrefix={arXiv},\n      primaryClass={cs.AI},\n      url={https://arxiv.org/abs/2504.03160}, \n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FGAIR-NLP%2FDeepResearcher","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FGAIR-NLP%2FDeepResearcher","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FGAIR-NLP%2FDeepResearcher/lists"}