{"id":30208520,"url":"https://github.com/264gaurav/graph_rag","last_synced_at":"2025-09-13T07:12:55.579Z","repository":{"id":309195543,"uuid":"1035430948","full_name":"264Gaurav/Graph_RAG","owner":"264Gaurav","description":"Graph RAG system using Neo4j, Gemma, and Groq — Imports documents, converts them into nodes/relationships via Cypher, and stores them in Neo4j. User queries retrieve relevant graph data, enabling multi-hop reasoning and accurate, context-aware answers powered by the Gemma model on Groq.","archived":false,"fork":false,"pushed_at":"2025-08-10T13:32:35.000Z","size":1049,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-08-10T14:25:19.597Z","etag":null,"topics":["gemma","google-colab-notebook","graph-databases","graphdb","groq","knowledge-graph","langchain","neo4j","rag"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/264Gaurav.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-08-10T11:39:33.000Z","updated_at":"2025-08-10T13:32:38.000Z","dependencies_parsed_at":"2025-08-10T14:26:22.593Z","dependency_job_id":"d8fcaa36-6278-4efd-826c-4b9c9f342a54","html_url":"https://github.com/264Gaurav/Graph_RAG","commit_stats":null,"previous_names":["264gaurav/graph_rag"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/264Gaurav/Graph_RAG","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/264Gaurav%2FGraph_RAG",
"tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/264Gaurav%2FGraph_RAG/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/264Gaurav%2FGraph_RAG/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/264Gaurav%2FGraph_RAG/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/264Gaurav","download_url":"https://codeload.github.com/264Gaurav/Graph_RAG/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/264Gaurav%2FGraph_RAG/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":274931771,"owners_count":25375990,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-09-13T02:00:10.085Z","response_time":70,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["gemma","google-colab-notebook","graph-databases","graphdb","groq","knowledge-graph","langchain","neo4j","rag"],"created_at":"2025-08-13T18:01:08.032Z","updated_at":"2025-09-13T07:12:55.564Z","avatar_url":"https://github.com/264Gaurav.png","language":"Jupyter Notebook","readme":"# 🚀 Graph RAG — Neo4j + Gemma (Groq) + Langchain\n\nA **Graph-based Retrieval-Augmented Generation** system that ingests documents, builds a **Neo4j knowledge graph** with Cypher, and uses **Gemma** on the **Groq platform** for fast, accurate, relationship-aware question 
answering.\n\n---\n\n## 📌 Overview\n\nTraditional **RAG** retrieves chunks of unstructured text using search techniques such as **dense vector similarity**, **sparse/lexical search** (e.g., keyword or BM25), or **hybrid search** that combines both approaches.\n**Graph RAG** goes further — it stores data as **entities** (nodes) and **relationships** (edges) in a **graph database**, enabling **multi-hop reasoning** (the ability to connect and traverse multiple linked facts to answer complex queries) and delivering **explainable answers**.\n\nThis project:\n\n1. Ingests documents.\n2. Extracts entities and relationships.\n3. Stores them in **Neo4j**.\n4. Uses **LangChain’s GraphCypherQAChain** to query the graph.\n5. Passes relevant context to **Gemma** (via Groq) for final answer generation.\n\n---\n\n## 🧠 Key Concepts\n\n- **Graph Database (Neo4j):** Stores and queries data as nodes \u0026 edges for connected insights.\n- **Knowledge Graph:** Structured network of facts linking entities and relationships.\n- **RAG:** Retrieval-Augmented Generation — retrieve external data, feed to an LLM.\n- **Graph RAG:** RAG enhanced with graph queries for deeper, relationship-aware reasoning.\n\n---\n\n## ⚙️ Architecture\n\n```mermaid\nflowchart LR\n    A[Document Ingestion] --\u003e B[Entity \u0026 Relationship Extraction]\n    B --\u003e C[Cypher Query Generation]\n    C --\u003e D[Neo4j Knowledge Graph]\n    E[User Query] --\u003e F[GraphCypherQAChain]\n    D --\u003e F\n    F --\u003e G[Gemma LLM via Groq]\n    G --\u003e H[Context-Aware Answer]\n```\n\n---\n\n## ▶️ Quickstart\n\n```bash\npip install --upgrade langchain langchain-community langchain-groq neo4j\n\nexport NEO4J_URI=\"bolt://\u003chost\u003e:7687\"\nexport NEO4J_USERNAME=\"neo4j\"\nexport NEO4J_PASSWORD=\"\u003cpassword\u003e\"\nexport GROQ_API_KEY=\"\u003cgroq-api-key\u003e\"\n```\n\n---\n\n## 💻 Example Usage\n\n```python\nfrom langchain_community.graphs import Neo4jGraph\nfrom langchain_groq import 
ChatGroq\nfrom langchain.chains import GraphCypherQAChain\nimport os\n\ngraph = Neo4jGraph(url=os.environ[\"NEO4J_URI\"],\n                   username=os.environ[\"NEO4J_USERNAME\"],\n                   password=os.environ[\"NEO4J_PASSWORD\"])\ngraph.refresh_schema()\n\nllm = ChatGroq(groq_api_key=os.environ[\"GROQ_API_KEY\"], model_name=\"Gemma2-9b-It\")\n\nchain = GraphCypherQAChain.from_llm(llm=llm, graph=graph, verbose=True, allow_dangerous_requests=True)\n\nresult = chain.invoke({\"query\": \"Who was the director of the movie GoldenEye?\"})\nprint(result)\n```\n\n---\n\n## 🔧 Example Cypher\n\n```cypher\nLOAD CSV WITH HEADERS FROM 'https://raw.githubusercontent.com/.../movies_small.csv' AS row\nMERGE (m:Movie {id: row.movieId})\nSET m.title = row.title, m.released = date(row.released), m.imdbRating = toFloat(row.imdbRating)\nFOREACH (actor IN split(row.actors, '|') |\n  MERGE (p:Person {name: trim(actor)}) MERGE (p)-[:ACTED_IN]-\u003e(m))\nFOREACH (director IN split(row.directors, '|') |\n  MERGE (p:Person {name: trim(director)}) MERGE (p)-[:DIRECTED]-\u003e(m))\nFOREACH (genre IN split(row.genres, '|') |\n  MERGE (g:Genre {name: trim(genre)}) MERGE (m)-[:IN_GENRE]-\u003e(g));\n```\n\n---\n\n## 📸 Sample Output\n\n**Database visualisation as a graph:** (viewable at `https://console-preview.neo4j.io/tools/query`)\n\n![Sample Output](images/img1.png)\n\n---\n\n**Database visualisation as a table:**\n\n![Sample Output](images/img2.png)\n\n---\n\n\u003e The screenshot above shows the reasoning steps and final answer generated by the **Gemma model** after retrieving relevant nodes and relationships from **Neo4j**.\n\n---\n\n## 🎯 Benefits of Graph RAG\n\n✅ Multi-hop reasoning over connected facts\n✅ More accurate, explainable answers\n✅ Works well in finance, healthcare, research, and legal domains\n\n---\n\n## 📌 Tech Stack\n\n- **Neo4j** — Graph database\n- **Cypher** — Graph query language\n- **Gemma** — Large Language Model\n- **Groq** — High-speed inference\n- 
**LangChain** — Orchestration\n\n---\n\n## ⚠️ Notes\n\n- Use environment variables or secret managers for credentials.\n- `allow_dangerous_requests=True` allows generated Cypher execution — validate queries in production.\n- Enhance ingestion with NLP-based entity/relation extraction for better graph quality.\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2F264gaurav%2Fgraph_rag","html_url":"https://awesome.ecosyste.ms/projects/github.com%2F264gaurav%2Fgraph_rag","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2F264gaurav%2Fgraph_rag/lists"}