{"id":25812181,"url":"https://github.com/aghoshpro/graphrag","last_synced_at":"2026-04-12T15:40:12.370Z","repository":{"id":278407371,"uuid":"314205119","full_name":"aghoshpro/GraphRAG","owner":"aghoshpro","description":"Building knowledge graph (KG) from raw text, creating a community hierarchy, generating summaries, and using these structures for RAG-based tasks.","archived":false,"fork":false,"pushed_at":"2025-03-13T17:29:09.000Z","size":5521,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-06-22T18:06:51.436Z","etag":null,"topics":["anthropic-claude","embeddings","knowledge-graph","langchain","neo4j","ollama","openai","rag","txt"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/aghoshpro.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2020-11-19T09:56:11.000Z","updated_at":"2025-03-13T17:29:14.000Z","dependencies_parsed_at":"2025-03-13T18:41:14.671Z","dependency_job_id":null,"html_url":"https://github.com/aghoshpro/GraphRAG","commit_stats":null,"previous_names":["aghoshpro/graphragx"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/aghoshpro/GraphRAG","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aghoshpro%2FGraphRAG","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aghoshpro%2FGraphRAG/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aghoshpro%2FGraphRAG/release
s","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aghoshpro%2FGraphRAG/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/aghoshpro","download_url":"https://codeload.github.com/aghoshpro/GraphRAG/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aghoshpro%2FGraphRAG/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":261338952,"owners_count":23143892,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["anthropic-claude","embeddings","knowledge-graph","langchain","neo4j","ollama","openai","rag","txt"],"created_at":"2025-02-28T01:53:06.189Z","updated_at":"2026-04-12T15:40:12.358Z","avatar_url":"https://github.com/aghoshpro.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Knowledge Graph RAG\n\nKnowledge Graph RAG is a special type of RAG system that combines the benefits of knowledge graphs with large language models (LLMs). 
In GraphRAG, the knowledge graph serves as an organised data structure of factual information, while the LLM functions as a reasoning engine: it processes user questions, obtains the relevant knowledge from the graph, and delivers logical responses.\n\nRecent research [[1](https://arxiv.org/abs/2405.02048), [2](https://arxiv.org/pdf/2404.16130)] reports that GraphRAG can outperform traditional RAG systems backed by a vector store, giving more accurate responses in a cost-effective and scalable way.\n\n\u003cimg src=\"assets\\grapgragg.jpg\"\u003e\n\nConsider the following scenario to demonstrate GraphRAG's effectiveness: both a traditional RAG and GraphRAG are tasked with identifying the top five themes in a dataset. The baseline RAG retrieves unrelated text and struggles to represent the essential themes accurately. In contrast, GraphRAG provides a concise and meaningful response, identifying the main ideas and supporting them with references to the source material.\n\n## Traditional RAG vs GraphRAG\n\nTraditional RAG treats each document as an isolated unit, while GraphRAG captures the relationships between entities mentioned in the documents. This structural understanding enables more sophisticated querying and better answers.\n\n\u003cimg src=\"assets\\Vs.png\"\u003e\n\n- Source: [Colab](https://colab.research.google.com/drive/1MnZ6CeUNiVTrJGwYpJaduQBbCsNEVrbD?usp=sharing#scrollTo=iXmdiUY7NlQA)\n\n## Environment Setup\n\n### 1. Clone Repository\n\n- Open `cmd` or a terminal and clone the repository:\n\n  ```bash\n  git clone https://github.com/aghoshpro/GraphRAG.git\n  ```\n\n- Navigate into the cloned directory:\n\n  ```bash\n  cd GraphRAG\n  ```\n\n### 2. 
**Set Up Local Environment**\n\n- Create a virtual environment `myvenv` inside the `./GraphRAG` folder and activate it:\n\n  ```bash\n  python -m venv myvenv\n  ```\n\n  ```bash\n  # Windows\n  .\\myvenv\\Scripts\\activate\n  # Linux or macOS: run this instead\n  # source myvenv/bin/activate\n  ```\n\n- Install dependencies:\n\n  ```bash\n  pip install --upgrade -r requirements.txt\n  ```\n\u003c!-- ### Get API Keys\n\n- Anthropic: \u003chttps://console.anthropic.com/settings/keys\u003e\n- OpenAI: \u003chttps://platform.openai.com/settings/proj_D0EtqGQ3jNT0h8LnOHnLAVkO/api-keys\u003e\n\n- Put them in a `.env` file and add it to `.gitignore` so it will not be shared during git commit --\u003e\n\n### 3. Start Neo4J Docker\n\n  ```sh\n  docker compose up\n  ```\n\n## 🕹️ Run\n\n```bash\nstreamlit run app.py\n```\n\n- Select `llama3.2` as the model and start chatting.\n\n### 🧪 Or Experiment with the Code\n\n  ```sh\n  jupyter notebook\n  ```\n\n## References\n\n1. [pip](https://pip.pypa.io/en/stable/installation/)\n2. [PythonNotes](https://note.nkmk.me/en/)\n3. [LangChain ChatModels](https://python.langchain.com/docs/integrations/chat/)\n4. [LangChain Neo4J](https://neo4j.com/labs/genai-ecosystem/langchain/)\n5. [LangChain Community](https://api.python.langchain.com/en/latest/community_api_reference.html)\n6. [LangChain SQL](https://python.langchain.com/docs/how_to/sql_prompting/)\n\n### Knowledge Graph\n\n1. [Neo4J KG](https://neo4j.com/blog/genai/what-is-knowledge-graph/)\n2. [RDFvsPropKG](https://neo4j.com/blog/knowledge-graph/rdf-vs-property-graphs-knowledge-graphs/)\n3. [NLPGraph](https://journalofbigdata.springeropen.com/articles/10.1186/s40537-020-00383-w/metrics)\n\n### Graph RAG\n\n1. [KDnuggets](https://www.kdnuggets.com/an-introduction-to-graph-rag)\n2. [Ontotext](https://www.ontotext.com/blog/matching-skills-and-candidates-with-graph-rag/)\n3. [FalkorDB](https://www.falkordb.com/blogs/what-is-graphrag/)\n4. 
[PG Vector](https://supabase.com/blog/openai-embeddings-postgres-vector)\n\n### Colab\n\n1. [Colab](https://colab.research.google.com/drive/1MnZ6CeUNiVTrJGwYpJaduQBbCsNEVrbD?usp=sharing#scrollTo=iXmdiUY7NlQA)\n\n\u003c!-- To find all districts of Bolzano with an elevation between 510-520 meters, you can use the following SPARQL query:\\n\\n\n\n```sparql\nSELECT ?distName ?elevation WHERE {\n  ?region rdfs:label ?distName .\n  ?region geo:asWKT ?distWkt .\n  FILTER (CONTAINS(?distWkt, 'Bolzano') AND CONTAINS(?distWkt,'510-520'))\n}\n```\n\n\\n\\nThis query filters the districts where the elevation is between 510-520 meters and contains the label \"Bolzano\".' --\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faghoshpro%2Fgraphrag","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Faghoshpro%2Fgraphrag","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faghoshpro%2Fgraphrag/lists"}