{"id":26936577,"url":"https://github.com/tpoisonooo/rograg","last_synced_at":"2025-04-09T14:06:21.350Z","repository":{"id":271233867,"uuid":"905143736","full_name":"tpoisonooo/ROGRAG","owner":"tpoisonooo","description":"ROGRAG: A Robustly Optimized GraphRAG Framework","archived":false,"fork":false,"pushed_at":"2025-04-07T03:47:20.000Z","size":1958,"stargazers_count":102,"open_issues_count":5,"forks_count":9,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-04-07T04:29:51.119Z","etag":null,"topics":["knowledge-base","knowledge-graph","knownledge-augmented-generation","llm","precision","retrieval-augmented-generation"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/tpoisonooo.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-12-18T08:48:58.000Z","updated_at":"2025-04-07T03:47:21.000Z","dependencies_parsed_at":"2025-01-06T13:27:04.924Z","dependency_job_id":"f6182040-849a-4a60-bcbe-1bf56b2fb623","html_url":"https://github.com/tpoisonooo/ROGRAG","commit_stats":null,"previous_names":["tpoisonooo/huixiangdou2","tpoisonooo/rograg"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tpoisonooo%2FROGRAG","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tpoisonooo%2FROGRAG/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tpoisonooo%2FROGRAG/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repos
itories/tpoisonooo%2FROGRAG/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/tpoisonooo","download_url":"https://codeload.github.com/tpoisonooo/ROGRAG/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248054223,"owners_count":21039952,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["knowledge-base","knowledge-graph","knownledge-augmented-generation","llm","precision","retrieval-augmented-generation"],"created_at":"2025-04-02T13:00:42.211Z","updated_at":"2025-04-09T14:06:21.345Z","avatar_url":"https://github.com/tpoisonooo.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"English | [Simplified Chinese](./README_zh_cn.md)\n\n\u003cdiv align=\"center\"\u003e\n  \u003cimg src=\"./resource/logo_3.png\" style=\"zoom:50%;\" /\u003e\n\u003c/div\u003e\n\n\u003cdiv\u003e\n  \u003ca href=\"https://arxiv.org/abs/2503.06474\" target=\"_blank\"\u003e\n    \u003cimg alt=\"Arxiv\" src=\"https://img.shields.io/badge/arxiv-2503.06474%20-darkred?logo=arxiv\u0026logoColor=white\" /\u003e\n  \u003c/a\u003e\n\u003c/div\u003e\n\n## 🔥 Introduction\n\nROGRAG enhances LLM performance on specialized topics using a robust GraphRAG approach. It features a two-stage (dual-level and logic form methods) retrieval mechanism to improve accuracy without extra computation costs. ROGRAG achieves a 15% score boost on [SeedBench](https://github.com/open-sciencelab/SeedBench), outperforming mainstream methods. 
\n\n\u003cdiv align=\"center\"\u003e\n  \u003cimg src=\"https://github.com/user-attachments/assets/5754c247-f6af-44b2-addb-5840ee2ee247\" width=500\u003e\n\u003c/div\u003e\n\n**Key Highlights**:\n\n  - Two-stage retrieval for robustness\n  - Incremental database construction\n  - Enhanced fuzzy matching and structured reasoning\n\n\u003cdiv align=\"center\"\u003e\n\n| Method          | QA-1 (Accuracy) | QA-2 (F1) | QA-3 (Rouge) | QA-4 (Rouge) |\n|-----------------|-----------------|-----------|--------------|--------------|\n| vanilla (w/o RAG) | 0.57            | 0.71      | 0.16         | 0.35         |\n| LangChain        | 0.68            | 0.68      | 0.15         | 0.04         |\n| BM25             | 0.65            | 0.69      | 0.23         | 0.03         |\n| RQ-RAG           | 0.59            | 0.62      | 0.17         | 0.33         |\n| ROGRAG (Ours)    | **0.75**        | **0.79**  | **0.36**     | **0.38**     |\n\n\u003c/div\u003e\n\nDeployed on an online research platform, ROGRAG is ready for integration. [Here](https://arxiv.org/abs/2503.06474) is the technical report.\n\nIf it is useful to you, please star it ⭐\n\n## 📖 Documentation\n- [1. Run from Docker (CMD / Swagger Server API / Gradio)](docs/en/doc_how_to_run_from_docker.md)\n- [2. Run from Source](docs/en/doc_how_to_run.md)\n- [3. Directory Structure and Function](docs/en/doc_architecture.md)\n- [**FAQ** about environment and errors](https://github.com/tpoisonooo/HuixiangDou2/issues/8) \n\n## 🔆 Version Description\n\nCompared to [HuixiangDou](https://github.com/internlm/huixiangdou), this repo improves accuracy:\n1. **Graph Schema**. Dense retrieval is used only to query similar entities and relationships.\n2. Ported/merged multiple open-source implementations, with code differences of nearly 18k lines:\n   - **Data**. Curated a set of real domain knowledge that LLMs have not fully seen, used for testing (GPT accuracy \u003c 0.6)\n   - **Ablation**. 
Confirmed the impact of different stages and parameters on accuracy\n\n3. API remains compatible. The WeChat/Lark/Web integrations from v1 therefore remain accessible.\n   ```text\n   # v1 API https://github.com/InternLM/HuixiangDou/blob/main/huixiangdou/service/parallel_pipeline.py#L290\n   async def generate(self,\n               query: Union[Query, str],\n               history: List[Tuple[str]]=[], \n               language: str='zh', \n               enable_web_search: bool=True,\n               enable_code_search: bool=True):\n   \n   # v2 API https://github.com/tpoisonooo/HuixiangDou2/blob/main/huixiangdou/pipeline/parallel.py#L135\n   async def generate(self,\n                   query: Union[Query, str],\n                   history: List[Pair] = [],\n                   request_id: str = 'default',\n                   language: str = 'zh_cn'):\n   ```\n   \n\n## 🍀 Acknowledgements\n- [SiliconCloud](https://siliconflow.cn) Broad selection of LLM APIs; some models are free\n- [KAG](https://github.com/OpenSPG/KAG) Graph retrieval based on reasoning\n- [DB-GPT](https://github.com/eosphoros-ai/DB-GPT) LLM tool collection\n- [LightRAG](https://github.com/HKUDS/LightRAG) Simple and efficient graph retrieval solution\n- [SeedBench](https://github.com/open-sciencelab/SeedBench) A multi-task benchmark for evaluating LLMs in seed science\n\n## 📝 Citation\n\n!!! The impact of open source varies across fields and industries. 
Due to licensing restrictions, we can **only provide the code and test conclusions; the test data cannot be provided**.\n\n```text\n@misc{kong2024huixiangdou,\n      title={HuiXiangDou: Overcoming Group Chat Scenarios with LLM-based Technical Assistance},\n      author={Huanjun Kong and Songyang Zhang and Jiaying Li and Min Xiao and Jun Xu and Kai Chen},\n      year={2024},\n      eprint={2401.08772},\n      archivePrefix={arXiv},\n      primaryClass={cs.CL}\n}\n\n@misc{kong2024labelingsupervisedfinetuningdata,\n      title={Labeling supervised fine-tuning data with the scaling law}, \n      author={Huanjun Kong},\n      year={2024},\n      eprint={2405.02817},\n      archivePrefix={arXiv},\n      primaryClass={cs.CL},\n      url={https://arxiv.org/abs/2405.02817}, \n}\n\n@misc{kong2025huixiangdou2robustlyoptimizedgraphrag,\n      title={HuixiangDou2: A Robustly Optimized GraphRAG Approach}, \n      author={Huanjun Kong and Zhefan Wang and Chenyang Wang and Zhe Ma and Nanqing Dong},\n      year={2025},\n      eprint={2503.06474},\n      archivePrefix={arXiv},\n      primaryClass={cs.IR},\n      url={https://arxiv.org/abs/2503.06474}, \n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftpoisonooo%2Frograg","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftpoisonooo%2Frograg","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftpoisonooo%2Frograg/lists"}