https://github.com/intellectronica/battle-of-the-semantics
GraphRag vs Embeddings
ai embeddings graphrag llm rag
- Host: GitHub
- URL: https://github.com/intellectronica/battle-of-the-semantics
- Owner: intellectronica
- Created: 2024-07-08T08:04:21.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-07-14T17:50:49.000Z (about 1 year ago)
- Last Synced: 2024-07-14T19:17:19.039Z (about 1 year ago)
- Topics: ai, embeddings, graphrag, llm, rag
- Language: Jupyter Notebook
- Homepage:
- Size: 672 KB
- Stars: 12
- Watchers: 1
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Battle of the Semantics: GraphRag vs Embeddings Index
Retrieval Augmented Generation (RAG) is often performed by chunking long texts, creating a text embedding for each chunk, and retrieving chunks to include in the LLM generation context based on a similarity search against the query. This approach works well in many scenarios, with compelling speed and cost trade-offs, but doesn't always cope well when a detailed understanding of the text is required.
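As a rough illustration of the chunk, embed, and retrieve loop described above, here is a minimal sketch. It assumes a toy bag-of-words vector (`toy_embed`) as a stand-in for a real embedding model, and the sample document, chunk size, and function names are all hypothetical. It is not the code used in the notebook.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, size: int = 8) -> list[str]:
    """Split text into consecutive chunks of `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def toy_embed(text: str) -> Counter:
    """Assumption: a bag-of-words count vector standing in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query."""
    q = toy_embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, toy_embed(c)), reverse=True)
    return ranked[:k]


# Hypothetical sample text, chunked into six-word pieces.
document = (
    "Alice founded the bakery in 1990. "
    "The bakery sells bread and cakes. "
    "Bob joined the bakery as a baker in 2005."
)
chunks = chunk_text(document, size=6)

# The retrieved chunks would be placed in the LLM prompt as context.
context = retrieve("Who founded the bakery?", chunks, k=1)
print(context)
```

Note how fixed-size chunking splits the sentence about Bob across two chunks. Losing context at chunk boundaries is one reason this approach can struggle when a question depends on details spread through the text.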
GraphRag ([microsoft.github.io/graphrag](https://microsoft.github.io/graphrag/)), a new indexing method released by Microsoft, promises to address this deficiency by using an LLM to analyse the indexed text and construct a knowledge graph of entities from it. This more detailed semantic understanding of the text can result in searches that produce a more accurate and complete context for the LLM to work with in generation.
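To make the idea of retrieving from a knowledge graph concrete, here is a small sketch. It does not use GraphRag's API or reproduce its algorithm, and the real library does considerably more. The triples are hard-coded, hypothetical stand-ins for what an LLM extraction step might produce, and `neighbourhood` is an illustrative helper rather than a library function.

```python
from collections import defaultdict

# Hypothetical (subject, relation, object) triples. In a GraphRag-style
# pipeline an LLM would extract facts like these from the indexed text.
triples = [
    ("Alice", "founded", "Bakery"),
    ("Bob", "works_at", "Bakery"),
    ("Bakery", "located_in", "Springfield"),
    ("Carol", "lives_in", "Shelbyville"),
]

# Index every fact under both of its entities so edges can be followed
# in either direction.
graph = defaultdict(list)
for subj, rel, obj in triples:
    graph[subj].append((subj, rel, obj))
    graph[obj].append((subj, rel, obj))


def neighbourhood(entity: str, hops: int = 1) -> list[tuple[str, str, str]]:
    """Collect the facts reachable within `hops` edges of `entity`."""
    seen = {entity}
    frontier = {entity}
    facts: list[tuple[str, str, str]] = []
    for _ in range(hops):
        next_frontier = set()
        for node in sorted(frontier):
            for fact in graph.get(node, []):
                if fact not in facts:
                    facts.append(fact)
                for other in (fact[0], fact[2]):
                    if other not in seen:
                        seen.add(other)
                        next_frontier.add(other)
        frontier = next_frontier
    return facts


# Facts within two hops of "Alice" form the context for a question about her.
for fact in neighbourhood("Alice", hops=2):
    print(fact)
```

Following edges from "Alice" pulls in the facts about Bob and Springfield even though no single sentence need mention all three, while the unrelated fact about Carol is left out. That connected context is what a purely chunk-based similarity search can miss.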
To compare the two methods, let's see what results we get when indexing and retrieving the text with both techniques.
See [battle-of-the-semantics.ipynb](battle-of-the-semantics.ipynb)
[Watch on YouTube](https://youtu.be/Y2pwIrhboro)