https://github.com/hscspring/nlm

Memory for Knowledge Graph, using Neo4j. Knowledge graph storage and query.

# README

This repo focuses on memory for NLP. Specifically, it memorizes (stores) a node or relationship in a knowledge graph (actually a Neo4j database instance) and recalls (queries) a node or relationship from that memory. It is not only a module but also an RPC service that is easy to set up.

Here are some scenarios:

- When the input is a node or relationship
  - Use partial information about a node or relationship to recall the node or relationship with its full information from the memory.
  - Automatically add the node or relationship when there is nothing to recall.
  - Automatically update the properties of the node or relationship when it has been recalled.
- When the input is a raw string or an NLU output
  - Automatically extract nodes or relationships from the input.
  - Then do the things above.
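
The node scenarios above can be pictured with a toy, in-memory sketch. It is not the repo's implementation: `ToyMemory` and `Node` are hypothetical names, and a plain dict stands in for the Neo4j database. It only illustrates the flow of recalling, adding when nothing is found, and updating when something is found.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Node:
    """Hypothetical stand-in for a graph node: label + name identify it."""
    label: str
    name: str
    props: dict = field(default_factory=dict)


class ToyMemory:
    """In-memory illustration only; the real store is a Neo4j database."""

    def __init__(self) -> None:
        self._nodes: Dict[Tuple[str, str], Node] = {}

    def __call__(self, node: Node, add_inexistence: bool = False,
                 update_props: bool = False) -> List[Node]:
        key = (node.label, node.name)
        found = self._nodes.get(key)
        if found is None:
            # Nothing to recall: optionally store the input, return no result.
            if add_inexistence:
                self._nodes[key] = Node(node.label, node.name, dict(node.props))
            return []
        if update_props:
            # Something was recalled: merge the supplied properties into it.
            found.props.update(node.props)
        return [found]


toy = ToyMemory()
print(toy(Node("Person", "Bob"), add_inexistence=True))
# []
print(toy(Node("Person", "Bob", {"sex": "male"}), update_props=True))
# [Node(label='Person', name='Bob', props={'sex': 'male'})]
```

As in the module examples later in this README, the call that adds a missing node returns an empty list, because nothing was recalled at that moment.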

The extractor is in development.

Furthermore, recalls are based on nodes (label and name) and relationships (start, end, kind), and their properties are mainly used to sort the results.
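
To make the split between identity and ranking concrete, here is a small sketch under stated assumptions: relationships are flat tuples, `recall_relations` is a hypothetical helper, and the scoring rule (count of matching properties) is an assumption chosen for illustration, not the rule used by this repo.

```python
from typing import Any, Dict, List, Optional, Tuple

# (start name, end name, kind, props): a hypothetical flat relationship record
Relation = Tuple[str, str, str, Dict[str, Any]]


def recall_relations(store: List[Relation], start: str, end: str,
                     kind: Optional[str] = None,
                     props: Optional[Dict[str, Any]] = None,
                     topn: int = 1) -> List[Relation]:
    """Filter by (start, end, kind), then rank by matching properties."""
    props = props or {}
    # Step 1: identity match. `kind=None` means "any kind".
    candidates = [r for r in store
                  if r[0] == start and r[1] == end and kind in (None, r[2])]

    # Step 2: properties only influence the order (assumed scoring rule).
    def score(rel: Relation) -> int:
        return sum(1 for k, v in props.items() if rel[3].get(k) == v)

    return sorted(candidates, key=score, reverse=True)[:topn]


store = [
    ("AliceThree", "AliceOne", "LOVES", {"from": 2011, "roles": "husband"}),
    ("AliceThree", "AliceOne", "WORK_WITH", {"from": 2009, "roles": "boss"}),
]
top = recall_relations(store, "AliceThree", "AliceOne",
                       props={"roles": "boss"}, topn=2)
print([r[2] for r in top])
# ['WORK_WITH', 'LOVES']
```

Both relationships pass the identity filter; the supplied property only decides which one comes first.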

Chinese documentation and design rationale: [Natural Language Memory Module (NLM) | Yam](https://yam.gift/2019/12/02/2019-12-02-NLM/).

## Setup

**IMPORTANT**: only Python 3.7+ is supported.

- Step 1: Install dependencies

```bash
# if you do not have pipenv, use a plain virtual environment
$ python3 -m venv env
$ source env/bin/activate
$ pip install -r requirements.txt
```

- Step 2: Set up a Neo4j database

```bash
$ docker run --rm -it -p 7475:7474 -p 7688:7687 neo4j
```

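Here we map two non-default host ports (7475 for HTTP and 7688 for Bolt), so playing and testing do not clash with a Neo4j instance on the default ports.

Once the container is running, open `http://localhost:7475/browser/`, change the port in the connect URL to 7688, log in with the default password `neo4j`, and then change the password to `password`.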
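
Before logging in, it can help to confirm that both mapped ports are reachable. The following is a minimal sketch using only the standard library; `port_open` is a hypothetical helper, not part of this repo, and it only checks that a TCP connection succeeds, not that Neo4j is fully ready.

```python
import socket


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # 7475 -> Neo4j Browser (HTTP), 7688 -> Bolt, as mapped by `docker run`.
    for name, port in (("HTTP", 7475), ("Bolt", 7688)):
        state = "open" if port_open("localhost", port) else "closed"
        print(f"{name} port {port}: {state}")
```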

- Step 3: Run the tests

```bash
# with pipenv; if you created the venv in Step 1, keep it activated instead
$ pipenv shell
$ pytest
```

This step will add 8 nodes and relationships to your Neo4j database.

![](./imgs/graph1.png)

The documentation is under `./docs` and can be generated with [Sphinx](http://www.sphinx-doc.org/en/master/); just run `make html`.

## Usage

### Module

```python
from py2neo.database import Graph
from nlm import NLMLayer, GraphNode, GraphRelation

mem = NLMLayer(graph=Graph(port=7688),
               fuzzy_node=False,
               add_inexistence=False,
               update_props=False)

############ Node ############
# recall
node = GraphNode("Person", "AliceThree")
mem(node)
[GraphNode(label='Person', name='AliceThree', props={'age': 22, 'sex': 'male'})]

# add a nonexistent node; `add_inexistence=True` here overrides the NLMLayer config.
new = GraphNode("Person", "Bob")
mem(new, add_inexistence=True)
[]

# fuzzy recall
node = GraphNode("Person", "AliceT")
mem(node, fuzzy_node=True)
[GraphNode(label='Person', name='AliceTwo', props={'age': 21, 'occupation': 'teacher'})]

# update property
node = GraphNode("Person", "AliceThree", props={"age": 24})
mem(node, update_props=True)
[GraphNode(label='Person', name='AliceThree', props={'age': 24, 'sex': 'male'})]

# topn
node = GraphNode("Person", "AliceT")
mem(node, fuzzy_node=True, topn=2)
[GraphNode(label='Person', name='AliceTwo', props={'age': 21, 'occupation': 'teacher'}),
 GraphNode(label='Person', name='AliceThree', props={'age': 24, 'sex': 'male'})
]

############ Relation ############

# recall
start = GraphNode("Person", "AliceThree")
end = GraphNode("Person", "AliceOne")
relation = GraphRelation(start, end, "LOVES")
mem(relation)
[GraphRelation(
    start=GraphNode(label='Person', name='AliceThree', props={'age': 22, 'sex': 'male'}),
    end=GraphNode(label='Person', name='AliceOne', props={'occupation': 'teacher', 'age': 22, 'sex': 'female'}),
    kind='LOVES',
    props={'from': 2011, 'roles': 'husband'})
]

# add inexistence
start = GraphNode("Person", "AliceThree")
end = GraphNode("Person", "Bob")
relation = GraphRelation(start, end, "KNOWS")
mem(relation, add_inexistence=True)
[]

# fuzzy recall
start = GraphNode("Person", "AliceTh")
end = GraphNode("Person", "AliceO")
relation = GraphRelation(start, end, "LOVES")
mem(relation, fuzzy_node=True)
[GraphRelation(
    start=GraphNode(label='Person', name='AliceThree', props={'age': 24, 'sex': 'male'}),
    end=GraphNode(label='Person', name='AliceOne', props={'occupation': 'teacher', 'age': 22, 'sex': 'female'}),
    kind='LOVES',
    props={'from': 2011, 'roles': 'husband'})
]

# two nodes, topn
start = GraphNode("Person", "AliceThree")
end = GraphNode("Person", "AliceOne")
relation = GraphRelation(start, end)
mem(relation, topn=3)
[GraphRelation(
    start=GraphNode(label='Person', name='AliceThree', props={'age': 24, 'sex': 'male'}),
    end=GraphNode(label='Person', name='AliceOne', props={'occupation': 'teacher', 'age': 22, 'sex': 'female'}),
    kind='WORK_WITH',
    props={'from': 2009, 'roles': 'boss'}),
 GraphRelation(
    start=GraphNode(label='Person', name='AliceThree', props={'age': 24, 'sex': 'male'}),
    end=GraphNode(label='Person', name='AliceOne', props={'occupation': 'teacher', 'age': 22, 'sex': 'female'}),
    kind='LOVES',
    props={'from': 2011, 'roles': 'husband'})
]

# update property (relationship)
start = GraphNode("Person", "AliceThree")
end = GraphNode("Person", "Bob")
relation = GraphRelation(start, end, "KNOWS", {"roles": "classmate"})
mem(relation, update_props=True)
[GraphRelation(
    start=GraphNode(label='Person', name='AliceThree', props={'age': 24, 'sex': 'male'}),
    end=GraphNode(label='Person', name='Bob', props={}),
    kind='KNOWS',
    props={})
]
mem(relation)
[GraphRelation(
    start=GraphNode(label='Person', name='AliceThree', props={'age': 24, 'sex': 'male'}),
    end=GraphNode(label='Person', name='Bob', props={}),
    kind='KNOWS',
    props={'roles': 'classmate'})
]

# update property (node + relationship)
start = GraphNode("Person", "AliceThree")
end = GraphNode("Person", "Bob", {"sex": "male"})
relation = GraphRelation(start, end, "KNOWS", {"roles": "friend"})
mem(relation, update_props=True)
[GraphRelation(
    start=GraphNode(label='Person', name='AliceThree', props={'age': 24, 'sex': 'male'}),
    end=GraphNode(label='Person', name='Bob', props={'sex': 'male'}),
    kind='KNOWS',
    props={'roles': 'friend'})
]

start = GraphNode("Person", "AliceThree")
end = GraphNode("Person", "Bob", {"sex": "male"})
relation = GraphRelation(start, end, "STUDY_WITH", {"roles": "classmate"})
mem(relation, update_props=True)
mem(relation)
[GraphRelation(
    start=GraphNode(label='Person', name='AliceThree', props={'age': 24, 'sex': 'male'}),
    end=GraphNode(label='Person', name='Bob', props={'sex': 'male'}),
    kind='STUDY_WITH',
    props={'roles': 'classmate'})
]

mem(GraphRelation(start, end), topn=2)
[GraphRelation(
    start=GraphNode(label='Person', name='AliceThree', props={'age': 24, 'sex': 'male'}),
    end=GraphNode(label='Person', name='Bob', props={'sex': 'male'}),
    kind='STUDY_WITH',
    props={'roles': 'classmate'}),
 GraphRelation(
    start=GraphNode(label='Person', name='AliceThree', props={'age': 24, 'sex': 'male'}),
    end=GraphNode(label='Person', name='Bob', props={'sex': 'male'}),
    kind='KNOWS',
    props={'roles': 'friend'})
]

############ RawString and NLU Output ############
# will first extract nodes or relationships, then proceed as above.
# coming soon.

############ Graph ############
mem.labels
frozenset({'Person'})

mem.relationship_types
frozenset({'KNOWS', 'LIKES', 'LOVES', 'STUDY_WITH', 'WORK_WITH'})

mem.nodes_num
9

mem.relationships_num
10

mem.nodes
# all nodes generator

mem.relationships
# all relationships generator

mem.query("MATCH (a:Person) RETURN a.age, a.name LIMIT 5")
[{'a.age': 21, 'a.name': 'AliceTwo'},
 {'a.age': 23, 'a.name': 'AliceFour'},
 {'a.age': 22, 'a.name': 'AliceOne'},
 {'a.age': 24, 'a.name': 'AliceFive'},
 {'a.age': None, 'a.name': 'Bob'}
]
```

Since `mem` actually inherits from `py2neo.Graph`, every method of `py2neo.Graph` can be called through `mem`. We just make it more convenient to use, with a focus on storage and query.

In addition, when `fuzzy_node` is `True`, properties are not updated, because a fuzzy match may return a node other than the one the supplied properties describe.
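
The interplay between fuzzy recall and property updates can be sketched as follows. This is an illustration under assumptions, not the repo's code: `recall` is a hypothetical function, the store is a plain `{name: props}` dict, and `difflib` from the standard library stands in for whatever fuzzy matching the module really uses.

```python
import difflib
from typing import Dict, List, Tuple


def recall(nodes: Dict[str, dict], name: str, props: dict,
           fuzzy_node: bool = False, update_props: bool = False,
           topn: int = 1) -> List[Tuple[str, dict]]:
    """Recall nodes by name from a {name: props} dict (toy store)."""
    if name in nodes:
        if update_props and not fuzzy_node:
            # Exact hit and no fuzzy mode: safe to merge the new properties.
            nodes[name].update(props)
        return [(name, nodes[name])]
    if not fuzzy_node:
        return []
    # Fuzzy mode: closest names by difflib ratio. `props` is never written
    # here, since it describes `name`, not the neighbours that were matched.
    close = difflib.get_close_matches(name, list(nodes), n=topn, cutoff=0.5)
    return [(n, nodes[n]) for n in close]


people = {"AliceTwo": {"age": 21}, "AliceThree": {"age": 22}}
hits = recall(people, "AliceT", {"age": 30},
              fuzzy_node=True, update_props=True, topn=2)
print([n for n, _ in hits])
# ['AliceTwo', 'AliceThree']
print(people)
# {'AliceTwo': {'age': 21}, 'AliceThree': {'age': 22}}
```

Even though `update_props=True` was passed, neither stored node changed: the query `AliceT` matched two different people, and writing `age=30` to either would have been a guess.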

### RPC Service

For the gRPC service, the parameters must be set when you start the server.

```bash
$ python server.py [OPTIONS]

Options:
-fn fuzzy_node
-ai add_inexistence
-up update_props
```
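
For readers wondering how three such switches could be wired up, here is a hedged sketch with `argparse`. It is not taken from `server.py`: treating `-fn`, `-ai`, and `-up` as boolean switches is an assumption, and `parse_options` is a hypothetical name. Check `server.py` itself for the actual behaviour.

```python
import argparse
from typing import List, Optional


def parse_options(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse the three switches; treating them as booleans is an assumption."""
    parser = argparse.ArgumentParser(description="NLM server options (sketch)")
    parser.add_argument("-fn", dest="fuzzy_node", action="store_true",
                        help="enable fuzzy node matching")
    parser.add_argument("-ai", dest="add_inexistence", action="store_true",
                        help="add nodes/relationships that cannot be recalled")
    parser.add_argument("-up", dest="update_props", action="store_true",
                        help="update properties of recalled items")
    return parser.parse_args(argv)


opts = parse_options(["-fn", "-up"])
print(vars(opts))
# {'fuzzy_node': True, 'add_inexistence': False, 'update_props': True}
```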

You can use any programming language on the client side; for more details, see [gRPC](https://grpc.io/).

There are 4 interfaces in total:

- NodeRecall
- RelationRecall
- StrRecall
- NLURecall

The last two are still in development. There is a Python client example (`client.py`) in the repo.

## Why

The original intention was to build a memory component for a [chatbot](https://yam.gift/2019/07/20/2019-07-20-ChatBot-Design/). We want the chatbot to automatically memorize the nodes and relationships discovered in dialogue, so the input was defined as the output of the NLU (understanding) layer. We also want to use that information when the chatbot responds, so the output was defined as the input of the NLG (generation) layer or the NLI (inference) layer. That's it.

## Batch

We have also written an example (under `./batch_example`) that adds many nodes and relationships at once. The data comes from [QASystemOnMedicalKG](https://github.com/liuhuanyong/QASystemOnMedicalKG); feel free to modify the code to fit your needs.

## Changelog

- 191201 create