Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/gusye1234/nano-graphrag
A simple, easy-to-hack GraphRAG implementation
gpt gpt-4o graphrag learning-by-doing llm rag
Last synced: about 1 month ago
- Host: GitHub
- URL: https://github.com/gusye1234/nano-graphrag
- Owner: gusye1234
- Created: 2024-07-25T07:53:58.000Z (about 2 months ago)
- Default Branch: main
- Last Pushed: 2024-08-12T14:02:40.000Z (about 1 month ago)
- Last Synced: 2024-08-12T16:12:09.278Z (about 1 month ago)
- Topics: gpt, gpt-4o, graphrag, learning-by-doing, llm, rag
- Language: Python
- Homepage:
- Size: 175 KB
- Stars: 7
- Watchers: 3
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: readme.md
Awesome Lists containing this project
- awesome-LLM-resourses - nano-GraphRAG - A simple, easy-to-hack GraphRAG implementation. (RAG)
README
nano-GraphRAG
A simple, easy-to-hack GraphRAG implementation
⚠️ It's still under development and not ready yet ⚠️
😭 [GraphRAG](https://arxiv.org/pdf/2404.16130) is good and powerful, but the official [implementation](https://github.com/microsoft/graphrag/tree/main) is difficult/painful to **read or hack**.
😊 This project provides a **smaller, faster, cleaner GraphRAG** while retaining the core functionality.
🎁 Excluding `tests` and prompts, `nano-graphrag` is about **700 lines of code**.
👌 Small yet **scalable**, **asynchronous** and **fully typed**
## TODO before publishing
- [x] Index
  - [x] Chunking
  - [x] Entity extraction
  - [x] Entity summary
  - [x] Compute communities
  - [x] Entities Embedding
  - [x] Community Report
- [ ] Query
  - [ ] Global
  - [ ] Local
## Install
**Install from PyPI**
```shell
pip install nano-graphrag
```
**Install from source**
```shell
# clone this repo first
cd nano-graphrag
pip install -e .
```
## Quick Start - Not yet
Download a copy of A Christmas Carol by Charles Dickens:
```shell
curl https://raw.githubusercontent.com/gusye1234/nano-graphrag/main/tests/mock_data.txt > ./book.txt
```
Use the Python snippet below:
```python
from nano_graphrag import GraphRAG

graph_func = GraphRAG(working_dir="./dickens")

# Index the whole book
with open("./book.txt") as f:
    graph_func.insert(f.read())

# Run a query over the indexed graph
print(graph_func.query("What are the top themes in this story?"))
```
The next time you initialize a `GraphRAG` from the same `working_dir`, it will reload all the contexts automatically.
### Async Support
For each method `NAME(...)`, there is a corresponding async method `aNAME(...)`:
```python
await graph_func.ainsert(...)
await graph_func.aquery(...)
...
```
### Available Parameters
In an IDE such as VS Code, hover your cursor over `GraphRAG` to see all the available parameters.
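
Outside an editor, the same information can be read programmatically. The sketch below uses the standard-library `inspect.signature` to list a class's constructor parameters and their defaults. `describe_parameters` is a hypothetical helper name, and `Example` is a stand-in class (not part of `nano-graphrag`) so that the snippet runs without the package installed.

```python
import inspect
from dataclasses import dataclass


def describe_parameters(cls) -> dict:
    """Map each constructor parameter name to its default value.

    Parameters without a default map to the string "<required>".
    """
    described = {}
    for name, param in inspect.signature(cls).parameters.items():
        if param.default is inspect.Parameter.empty:
            described[name] = "<required>"
        else:
            described[name] = param.default
    return described


# Stand-in class for demonstration only; it is not part of nano-graphrag.
@dataclass
class Example:
    working_dir: str
    chunk_size: int = 1200


if __name__ == "__main__":
    for name, default in describe_parameters(Example).items():
        print(f"{name} = {default!r}")
```

With the package installed, calling `describe_parameters(GraphRAG)` after `from nano_graphrag import GraphRAG` should list the real parameters, assuming the class exposes an introspectable constructor signature.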
## Advanced - Prompts
`nano-graphrag` uses the prompts in the `nano_graphrag.prompt.PROMPTS` dict. You can experiment with it and replace any prompt inside.
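
One way to experiment safely is to swap a prompt in temporarily and restore it afterwards. The following is a minimal sketch that works on any plain dict. `override_prompt` is a hypothetical helper, and the key `"summary"` in the demo is made up, since this README does not list the real keys.

```python
from contextlib import contextmanager


@contextmanager
def override_prompt(prompts: dict, key: str, new_text: str):
    """Temporarily replace prompts[key], restoring the original on exit."""
    if key not in prompts:
        raise KeyError(f"unknown prompt key: {key!r}; available: {sorted(prompts)}")
    original = prompts[key]
    prompts[key] = new_text
    try:
        yield prompts
    finally:
        # Runs even if the body raises, so the dict is always restored.
        prompts[key] = original


if __name__ == "__main__":
    # Stand-in dict; the key name "summary" is invented for this demo.
    demo_prompts = {"summary": "Summarize the text."}
    with override_prompt(demo_prompts, "summary", "Summarize in one sentence."):
        print(demo_prompts["summary"])  # overridden value
    print(demo_prompts["summary"])      # original value again
```

To apply this to the real prompts, pass `PROMPTS` imported from `nano_graphrag.prompt` and inspect `PROMPTS.keys()` first to find the actual key names.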
## Advanced - Storage
You can replace any storage-related component with your own implementation. `nano-graphrag` mainly uses three kinds of storage:
- `base.BaseKVStorage` for storing key-JSON pairs of data.
  - By default we use disk file storage as the backend.
  - `GraphRAG(.., key_string_value_json_storage_cls=YOURS,...)`
- `base.BaseVectorStorage` for indexing embeddings.
  - By default we use [`milvus-lite`](https://github.com/milvus-io/milvus-lite) as the backend.
  - `GraphRAG(.., vector_db_storage_cls=YOURS,...)`
- `base.BaseGraphStorage` for storing the knowledge graph.
  - By default we use [`networkx`](https://github.com/networkx/networkx) as the backend.
  - `GraphRAG(.., graph_storage_cls=YOURS,...)`
## Benchmark - Not yet
...