{"id":28098788,"url":"https://github.com/rmanluo/gfm-rag","last_synced_at":"2026-03-17T06:14:15.550Z","repository":{"id":276068214,"uuid":"925042132","full_name":"RManLuo/gfm-rag","owner":"RManLuo","description":"[NeurIPS'25, ICLR'26] Graph Foundation Model for Retrieval Augmented Generation","archived":false,"fork":false,"pushed_at":"2026-02-25T12:22:50.000Z","size":4487,"stargazers_count":221,"open_issues_count":3,"forks_count":26,"subscribers_count":5,"default_branch":"main","last_synced_at":"2026-03-10T08:46:52.770Z","etag":null,"topics":["gpt","graphrag","knowledge-graph","large-language-models","llm","rag","retrieval-augmented-generation"],"latest_commit_sha":null,"homepage":"https://rmanluo.github.io/gfm-rag/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/RManLuo.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-01-31T05:34:09.000Z","updated_at":"2026-03-04T15:08:34.000Z","dependencies_parsed_at":"2025-02-06T05:29:24.136Z","dependency_job_id":"f16e59ea-ece6-496b-a2ad-de3f83cb126d","html_url":"https://github.com/RManLuo/gfm-rag","commit_stats":null,"previous_names":["rmanluo/gfm-rag"],"tags_count":7,"template":false,"template_full_name":null,"purl":"pkg:github/RManLuo/gfm-rag","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RManLuo%2Fgfm-rag","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RManLuo%2Fgfm-rag/tags","releases_url":"https://repos.ec
osyste.ms/api/v1/hosts/GitHub/repositories/RManLuo%2Fgfm-rag/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RManLuo%2Fgfm-rag/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/RManLuo","download_url":"https://codeload.github.com/RManLuo/gfm-rag/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RManLuo%2Fgfm-rag/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":30614684,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-17T04:46:40.957Z","status":"ssl_error","status_checked_at":"2026-03-17T04:46:32.538Z","response_time":56,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["gpt","graphrag","knowledge-graph","large-language-models","llm","rag","retrieval-augmented-generation"],"created_at":"2025-05-13T17:58:50.202Z","updated_at":"2026-03-17T06:14:15.511Z","avatar_url":"https://github.com/RManLuo.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# GFM-RAG: Graph Foundation Model for Retrieval Augmented Generation\n\u003cdiv align=\"left\"\u003e\n   \u003cp\u003e\n   \u003ca href='https://rmanluo.github.io/gfm-rag/'\u003e\u003cimg src='https://img.shields.io/badge/Project-Page-Green'\u003e\u003c/a\u003e\n   \u003ca 
href='https://www.arxiv.org/abs/2502.01113'\u003e\u003cimg src='https://img.shields.io/badge/arXiv-2502.01113-b31b1b'\u003e\u003c/a\u003e\n   \u003ca href='https://huggingface.co/collections/rmanluo/gfm-rag-67a1ef7bfe097a938d8848dc'\u003e\u003cimg src='https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-GFM--RAG-blue'\u003e\u003c/a\u003e\n  \u003c/p\u003e\n  \u003cp\u003e\n  \u003cimg src='https://img.shields.io/github/stars/RManLuo/gfm-rag?color=green\u0026style=social' /\u003e\n  \u003ca href=\"https://pypi.org/project/gfmrag/\"\u003e\n    \u003cimg alt=\"PyPI - Version\" src=\"https://img.shields.io/pypi/v/gfmrag\"\u003e\n  \u003c/a\u003e\n  \u003ca href=\"https://pypi.org/project/gfmrag/\"\u003e\n    \u003cimg alt=\"PyPI - Downloads\" src=\"https://img.shields.io/pypi/dm/gfmrag\"\u003e\n  \u003c/a\u003e\n  \u003ca href=\"https://github.com/RManLuo/gfm-rag/issues\"\u003e\n    \u003cimg alt=\"GitHub Issues\" src=\"https://img.shields.io/github/issues/RManLuo/gfm-rag\"\u003e\n  \u003c/a\u003e\n  \u003ca href=\"https://github.com/RManLuo/gfm-rag/discussions\"\u003e\n    \u003cimg alt=\"GitHub Discussions\" src=\"https://img.shields.io/github/discussions/RManLuo/gfm-rag\"\u003e\n  \u003c/a\u003e\n  \u003c/p\u003e\n\u003c/div\u003e\n\n[\\[Chinese explainer\\]](https://rman.top/2025/03/01/gfm-rag/)\n\nGFM-RAG is the first graph foundation model-powered RAG pipeline. It uses graph neural networks to reason over knowledge graphs and retrieve relevant documents for question answering.\n\n![](docs/images/intro.png)\n\nWe first build a knowledge graph index (KG-index) from the documents to capture the relationships between pieces of knowledge. Then, we feed the query and the constructed KG-index into the pre-trained graph foundation model (GFM) retriever to obtain relevant documents for LLM generation. 
The GFM retriever is trained at large scale and can be applied directly to unseen datasets without fine-tuning.\n\nFor more details, please refer to our [project page](https://rmanluo.github.io/gfm-rag/) and [paper](https://www.arxiv.org/abs/2502.01113).\n\n## 🎉 News\n- **[2025-02-06]** We have released the GFM-RAG codebase and an [8M pre-trained model](https://huggingface.co/rmanluo/GFM-RAG-8M). 🚀\n\n## Features\n\n- **Graph Foundation Model (GFM)**: A graph neural network-based retriever that can reason over the KG-index.\n- **Knowledge Graph Index**: A knowledge graph index that captures the relationships between pieces of knowledge.\n- **Efficiency**: The GFM-RAG pipeline performs multi-hop reasoning in a single retrieval step.\n- **Generalizability**: GFM-RAG can be applied directly to unseen datasets without fine-tuning.\n- **Transferability**: GFM-RAG can be fine-tuned on your own dataset to improve performance on specific domains.\n- **Compatibility**: GFM-RAG is compatible with any agent-based framework for multi-step reasoning.\n- **Interpretability**: GFM-RAG can show the reasoning paths it captures for better understanding.\n\n## Dependencies\n\n- Python 3.12\n- CUDA 12 or above\n\n## Installation\n\nConda provides an easy way to install the CUDA development toolkit, which GFM-RAG requires.\n\nInstall the packages:\n```bash\nconda create -n gfmrag python=3.12\nconda activate gfmrag\nconda install cuda-toolkit -c nvidia/label/cuda-12.4.1 # Replace with your desired CUDA version\npip install gfmrag\n```\n\n## Quick Start\n\n\u003e [!NOTE]\n\u003e Read the full documentation at: https://rmanluo.github.io/gfm-rag/\n\n### Prepare Data\n\nThe testing split and an example of the training data are available [here](https://drive.google.com/drive/folders/11xuSKD20c1X0bJRZRVvRc8ocW7wgX7Rw?usp=sharing).\n\nYou need to prepare the following files:\n\n- `dataset_corpus.json`: A JSON file containing the entire 
document corpus.\n- `train.json` (optional): A JSON file containing the training data.\n- `test.json` (optional): A JSON file containing the test data.\n\nPlace your files in the following structure:\n```\ndata_name/\n├── raw/\n│   ├── dataset_corpus.json\n│   ├── train.json # (optional)\n│   └── test.json # (optional)\n└── processed/ # Output directory\n```\n\n#### `dataset_corpus.json`\n\nThe `dataset_corpus.json` is a dictionary where each key is the title or unique id of a document and the value is the text of the document.\n\n```json\n{\n    \"Fred Gehrke\":\n        \"Clarence Fred Gehrke (April 24, 1918 – February 9, 2002) was an American football player and executive.  He played in the National Football League (NFL) for the Cleveland / Los Angeles Rams, San Francisco 49ers and Chicago Cardinals from 1940 through 1950.  To boost team morale, Gehrke designed and painted the Los Angeles Rams logo in 1948, which was the first painted on the helmets of an NFL team.  He later served as the general manager of the Denver Broncos from 1977 through 1981.  He is the great-grandfather of Miami Marlin Christian Yelich\"\n    ,\n    \"Manny Machado\":\n        \"Manuel Arturo Machado (] ; born July 6, 1992) is an American professional baseball third baseman and shortstop for the Baltimore Orioles of Major League Baseball (MLB).  He attended Brito High School in Miami and was drafted by the Orioles with the third overall pick in the 2010 Major League Baseball draft.  He bats and throws right-handed.\"\n    ,\n    ...\n }\n```\n\n#### `train.json` and `test.json` (optional)\nIf you want to train and evaluate the model, you need to provide training and testing data in the form of a JSON file. Each entry in the JSON file should contain the following fields:\n\n- `id`: A unique identifier for the example.\n- `question`: The question or query.\n- `supporting_facts`: A list of supporting facts relevant to the question. 
Each supporting fact is the title of a document in the `dataset_corpus.json` file.\n\nEach entry can also contain additional fields depending on the task. For example:\n\n- `answer`: The answer to the question.\n\nAdditional fields are copied through the subsequent steps of the pipeline.\n\nExample:\n```json\n[\n\t{\n\t\t\"id\": \"5adf5e285542992d7e9f9323\",\n\t\t\"question\": \"When was the judge born who made notable contributions to the trial of the man who tortured, raped, and murdered eight student nurses from South Chicago Community Hospital on the night of July 13-14, 1966?\",\n\t\t\"answer\": \"June 4, 1931\",\n\t\t\"supporting_facts\": [\n\t\t\t\"Louis B. Garippo\",\n\t\t\t\"Richard Speck\"\n\t\t]\n\t},\n\t{\n\t\t\"id\": \"5a7f7b365542992097ad2f80\",\n\t\t\"question\": \"Did the Beaulieu Mine or the McIntyre Mines yield gold and copper?\",\n\t\t\"answer\": \"The McIntyre also yielded a considerable amount of copper\",\n\t\t\"supporting_facts\": [\n\t\t\t\"Beaulieu Mine\",\n\t\t\t\"McIntyre Mines\"\n\t\t]\n\t},\n    ...\n]\n```\n\n### Index Dataset\n\nYou need to create a KG-index [configuration file](gfmrag/workflow/config/stage1_index_dataset.yaml).\n\nDetails of the configuration parameters are explained in the [KG-index Configuration](https://rmanluo.github.io/gfm-rag/config/kg_index_config/) page.\n\n```bash\npython -m gfmrag.workflow.stage1_index_dataset\n```\n\nThis command performs two main tasks:\n\n1. Create and save knowledge graph related files (`kg.txt` and `document2entities.json`) from the `dataset_corpus.json` file.\n2. 
Identify the query entities and supporting entities in training and testing data if available in the raw data directory.\n\nFiles created:\n\n- `kg.txt`: Contains knowledge graph triples\n- `document2entities.json`: Maps documents to their entities\n- `train.json`: Processed training data (if raw exists)\n- `test.json`: Processed testing data (if raw exists)\n\nDirectory structure:\n```\ndata_name/\n├── raw/\n│   ├── dataset_corpus.json\n│   ├── train.json (optional)\n│   └── test.json (optional)\n└── processed/\n    └── stage1/\n        ├── kg.txt\n        ├── document2entities.json\n        ├── train.json\n        └── test.json\n```\n\n### GFM-RAG Retrieval\n\nYou need to create a [configuration file](gfmrag/workflow/config/stage3_qa_ircot_inference.yaml) for inference.\n\n\u003e [!NOTE]\n\u003e We have already released the pre-trained model [here](https://huggingface.co/rmanluo/GFM-RAG-8M), which can be used directly for retrieval. The model will be automatically downloaded by specifying it in the configuration.\n\u003e ```yaml\n\u003e graph_retriever:\n\u003e     model_path: rmanluo/GFM-RAG-8M\n\u003e ```\n\nDetails of the configuration parameters are explained in the [GFM-RAG Configuration](https://rmanluo.github.io/gfm-rag/config/gfmrag_retriever_config/) page.\n\n#### Initialize GFMRetriever\n\nYou can initialize the GFMRetriever with the following code. 
It will load the pre-trained GFM-RAG model and the KG-index for retrieval.\n\n```python\nimport logging\nimport os\n\nimport hydra\nfrom hydra.core.hydra_config import HydraConfig\nfrom omegaconf import DictConfig, OmegaConf\n\nfrom gfmrag import GFMRetriever\n\nlogger = logging.getLogger(__name__)\n\n\n@hydra.main(\n    config_path=\"config\", config_name=\"stage3_qa_ircot_inference\", version_base=None\n)\ndef main(cfg: DictConfig) -\u003e None:\n    output_dir = HydraConfig.get().runtime.output_dir\n    logger.info(f\"Config:\\n {OmegaConf.to_yaml(cfg)}\")\n    logger.info(f\"Current working directory: {os.getcwd()}\")\n    logger.info(f\"Output directory: {output_dir}\")\n\n    # Load the pre-trained model and the KG-index named in the config\n    gfmrag_retriever = GFMRetriever.from_config(cfg)\n```\n\n#### Document Retrieval\n\nYou can use the GFM-RAG retriever to reason over the KG-index and obtain documents for a given query.\n```python\n# Retrieve the top 5 documents for the query\ncurrent_query = \"Who is the president of France?\"\nretrieved_docs = gfmrag_retriever.retrieve(current_query, top_k=5)\n```\n\n#### Question Answering\n\n```python\nfrom hydra.utils import instantiate\nfrom gfmrag.llms import BaseLanguageModel\nfrom gfmrag.prompt_builder import QAPromptBuilder\n\n# Build the LLM and the prompt builder from the config\nllm = instantiate(cfg.llm)\nqa_prompt_builder = QAPromptBuilder(cfg.qa_prompt)\n\n# Combine the query and the retrieved documents into a prompt, then generate the answer\nmessage = qa_prompt_builder.build_input_prompt(current_query, retrieved_docs)\nanswer = llm.generate_sentence(message)  # Answer: \"Emmanuel Macron\"\n```\n\n## GFM Fine-tuning\n\nDuring fine-tuning, the GFM is trained on the query-document pairs in `train.json` from the labeled dataset to learn complex relationships for retrieval.\n\nFine-tuning can be run on your own dataset to improve the model's performance on your specific domain.\n\nAn example of the training data:\n\n```json\n[\n\t{\n\t\t\"id\": \"5abc553a554299700f9d7871\",\n\t\t\"question\": \"Kyle Ezell is a professor at what School of Architecture building at Ohio State?\",\n\t\t\"answer\": \"Knowlton Hall\",\n\t\t\"supporting_facts\": [\n\t\t\t\"Knowlton Hall\",\n\t\t\t\"Kyle 
Ezell\"\n\t\t],\n\t\t\"question_entities\": [\n\t\t\t\"kyle ezell\",\n\t\t\t\"architectural association school of architecture\",\n\t\t\t\"ohio state\"\n\t\t],\n\t\t\"supporting_entities\": [\n\t\t\t\"10 million donation\",\n\t\t\t\"2004\",\n\t\t\t\"architecture\",\n\t\t\t\"austin e  knowlton\",\n\t\t\t\"austin e  knowlton school of architecture\",\n\t\t\t\"bachelor s in architectural engineering\",\n\t\t\t\"city and regional planning\",\n\t\t\t\"columbus  ohio  united states\",\n\t\t\t\"ives hall\",\n\t\t\t\"july 2002\",\n\t\t\t\"knowlton hall\",\n\t\t\t\"ksa\"\n\t\t]\n\t},\n    ...\n]\n```\n\nYou need to create a [configuration file](gfmrag/workflow/config/stage2_qa_finetune.yaml) for fine-tuning.\n\n\u003e [!NOTE]\n\u003e We have already released the pre-trained model checkpoint [here](https://huggingface.co/rmanluo/GFM-RAG-8M), which can be used for further fine-tuning. The model is downloaded automatically when you specify it in the configuration:\n\u003e ```yaml\n\u003e checkpoint: rmanluo/GFM-RAG-8M\n\u003e ```\n\nDetails of the configuration parameters are explained in the [GFM-RAG Fine-tuning Configuration](https://rmanluo.github.io/gfm-rag/config/gfmrag_finetune_config/) page.\n\nYou can fine-tune the pre-trained GFM-RAG model on your dataset using the following command:\n\n```bash\npython -m gfmrag.workflow.stage2_qa_finetune\n# Multi-GPU training\ntorchrun --nproc_per_node=4 -m gfmrag.workflow.stage2_qa_finetune\n# Multi-node Multi-GPU training\ntorchrun --nproc_per_node=4 --nnodes=2 -m gfmrag.workflow.stage2_qa_finetune\n```\n\n## Reproduce Results reported in the paper\n\n### Download datasets\n\nWe are working on releasing the full training datasets.\n\nThe testing split and an example of the training data are available [here](https://drive.google.com/drive/folders/11xuSKD20c1X0bJRZRVvRc8ocW7wgX7Rw?usp=sharing).\n\nDownload the datasets and put them under the `data` directory.\n\n```text\ndata/\n├── 2wikimultihopqa_test\n│   ├── processed\n│  
 └── raw\n├── hotpotqa_test\n│   ├── processed\n│   └── raw\n├── hotpotqa_train_example\n│   ├── processed\n│   └── raw\n└── musique_test\n    ├── processed\n    └── raw\n```\n\n\n### Index Dataset\n\nWe have provided the indexed testing datasets in the `data/*/processed/stage1` directory. You can build the index for the testing dataset with the following command:\n\n```bash\n# Build the index for testing dataset\nN_GPU=1\nDATA_ROOT=\"data\"\nDATA_NAME_LIST=\"hotpotqa_test 2wikimultihopqa_test musique_test\"\nfor DATA_NAME in ${DATA_NAME_LIST}; do\n   python -m gfmrag.workflow.stage1_index_dataset \\\n   dataset.root=${DATA_ROOT} \\\n   dataset.data_name=${DATA_NAME}\ndone\n```\n\nFull script is available at [scripts/stage1_data_index.sh](scripts/stage1_data_index.sh).\n\n### GFM Training\n\nUnsupervised training on the constructed KG.\n\n```bash\npython -m gfmrag.workflow.stage2_kg_pretrain\n# Multi-GPU training\ntorchrun --nproc_per_node=4 -m gfmrag.workflow.stage2_kg_pretrain\n```\n\nFull script is available at [scripts/stage2_pretrain.sh](scripts/stage2_pretrain.sh).\n\nSupervised training on the QA dataset.\n\n```bash\npython -m gfmrag.workflow.stage2_qa_finetune\n# Multi-GPU training\ntorchrun --nproc_per_node=4 -m gfmrag.workflow.stage2_qa_finetune\n```\n\nFull script is available at [scripts/stage2_finetune.sh](scripts/stage2_finetune.sh).\n\n### Retrieval Evaluation\n\n```bash\nN_GPU=4\nDATA_ROOT=\"data\"\ncheckpoints=rmanluo/GFM-RAG-8M # Or the path to your checkpoints\ntorchrun --nproc_per_node=${N_GPU} -m gfmrag.workflow.stage2_qa_finetune \\\n    train.checkpoint=${checkpoints} \\\n    datasets.cfgs.root=${DATA_ROOT} \\\n    datasets.train_names=[] \\\n    train.num_epoch=0\n```\n\n### QA Reasoning\n\n#### Single Step QA Reasoning\n```bash\n# Batch inference for QA on the test set.\nN_GPU=4\nDATA_ROOT=\"data\"\nDATA_NAME=\"hotpotqa\" # hotpotqa musique 2wikimultihopqa\nLLM=\"gpt-4o-mini\"\nDOC_TOP_K=5\nN_THREAD=10\ntorchrun --nproc_per_node=${N_GPU} -m 
gfmrag.workflow.stage3_qa_inference \\\n    dataset.root=${DATA_ROOT} \\\n    qa_prompt=${DATA_NAME} \\\n    qa_evaluator=${DATA_NAME} \\\n    llm.model_name_or_path=${LLM} \\\n    test.n_threads=${N_THREAD} \\\n    test.top_k=${DOC_TOP_K} \\\n    dataset.data_name=${DATA_NAME}_test\n```\n\nhotpotqa\n```bash\ntorchrun --nproc_per_node=4 -m gfmrag.workflow.stage3_qa_inference dataset.data_name=hotpotqa_test qa_prompt=hotpotqa qa_evaluator=hotpotqa\n```\n\nmusique\n```bash\ntorchrun --nproc_per_node=4 -m gfmrag.workflow.stage3_qa_inference dataset.data_name=musique_test qa_prompt=musique qa_evaluator=musique\n```\n\n2Wikimultihopqa\n```bash\ntorchrun --nproc_per_node=4 -m gfmrag.workflow.stage3_qa_inference dataset.data_name=2wikimultihopqa_test qa_prompt=2wikimultihopqa qa_evaluator=2wikimultihopqa\n```\n#### Multi Step IRCOT QA Reasoning\n```bash\n# IRCoT + GFM-RAG inference on QA tasks\nN_GPU=1\nDATA_ROOT=\"data\"\nDATA_NAME=\"hotpotqa\" # hotpotqa musique 2wikimultihopqa\nLLM=\"gpt-4o-mini\"\nMAX_STEPS=3\nMAX_SAMPLE=10\npython -m gfmrag.workflow.stage3_qa_ircot_inference \\\n    dataset.root=${DATA_ROOT} \\\n    llm.model_name_or_path=${LLM} \\\n    qa_prompt=${DATA_NAME} \\\n    qa_evaluator=${DATA_NAME} \\\n    agent_prompt=${DATA_NAME}_ircot \\\n    test.max_steps=${MAX_STEPS} \\\n    test.max_test_samples=${MAX_SAMPLE} \\\n    dataset.data_name=${DATA_NAME}_test\n```\n\nhotpotqa\n```bash\npython -m gfmrag.workflow.stage3_qa_ircot_inference qa_prompt=hotpotqa qa_evaluator=hotpotqa agent_prompt=hotpotqa_ircot dataset.data_name=hotpotqa_test test.max_steps=2\n```\n\nmusique\n```bash\npython -m gfmrag.workflow.stage3_qa_ircot_inference qa_prompt=musique qa_evaluator=musique agent_prompt=musique_ircot dataset.data_name=musique_test test.max_steps=4\n```\n\n2Wikimultihopqa\n```bash\npython -m gfmrag.workflow.stage3_qa_ircot_inference qa_prompt=2wikimultihopqa qa_evaluator=2wikimultihopqa agent_prompt=2wikimultihopqa_ircot dataset.data_name=2wikimultihopqa_test 
test.max_steps=2\n```\n\n### Path Interpretations\n```bash\npython -m gfmrag.workflow.experiments.visualize_path dataset.data_name=hotpotqa_test\n```\n\n## Acknowledgements\n\nWe are grateful to the following repositories for their help with this project:\n\n* [DeepGraphLearning/ULTRA](https://github.com/DeepGraphLearning/ULTRA): The ULTRA model is used as the base GNN model for the GFM retriever.\n* [OSU-NLP-Group/HippoRAG](https://github.com/OSU-NLP-Group/HippoRAG): We drew great inspiration from the KG construction process of HippoRAG.\n* [microsoft/graphrag](https://github.com/microsoft/graphrag): We drew great inspiration from the project design of GraphRAG.\n\n## Citation\n\nIf you find this repository helpful, please consider citing our paper:\n\n```bibtex\n@article{luo2025gfmrag,\n  title={GFM-RAG: Graph Foundation Model for Retrieval Augmented Generation},\n  author={Luo, Linhao and Zhao, Zicheng and Haffari, Gholamreza and Phung, Dinh and Gong, Chen and Pan, Shirui},\n  journal={arXiv preprint arXiv:2502.01113},\n  year={2025}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frmanluo%2Fgfm-rag","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frmanluo%2Fgfm-rag","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frmanluo%2Fgfm-rag/lists"}