https://github.com/dobraczka/eche
🕸️ Little helper for handling entity clusters
- Host: GitHub
- URL: https://github.com/dobraczka/eche
- Owner: dobraczka
- Created: 2024-02-14T16:34:22.000Z (10 months ago)
- Default Branch: main
- Last Pushed: 2024-03-22T14:22:52.000Z (9 months ago)
- Last Synced: 2024-10-01T09:11:59.017Z (3 months ago)
- Topics: clustering, connected-components, deduplication, entity-resolution, record-linkage, transitive-closure
- Language: Python
- Homepage: https://eche.readthedocs.io
- Size: 95.7 KB
- Stars: 2
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
README
Usage
=====
Eche provides a `ClusterHelper` class to conveniently handle entity clusters.

```python
from eche import ClusterHelper
ch = ClusterHelper([{"a1", "b1"}, {"a2", "b2"}])
print(ch.clusters)
# {0: {'a1', 'b1'}, 1: {'a2', 'b2'}}
```

Add an element to a cluster:

```python
ch.add_to_cluster(0, "c1")
print(ch.clusters)
# {0: {'a1', 'b1', 'c1'}, 1: {'a2', 'b2'}}
```

Add a new cluster:

```python
ch.add({"e2", "f1", "c3"})
print(ch.clusters)
# {0: {'a1', 'b1', 'c1'}, 1: {'a2', 'b2'}, 2: {'f1', 'e2', 'c3'}}
```

Remove an element from a cluster:

```python
ch.remove("b1")
print(ch.clusters)
# {0: {'a1', 'c1'}, 1: {'a2', 'b2'}, 2: {'f1', 'e2', 'c3'}}
```

The `__contains__` method is overloaded to accept several kinds of argument. You can check whether an entity is in the `ClusterHelper`:

```python
"a1" in ch
# True
```

Whether a cluster is present:

```python
{"c1","a1"} in ch
# True
```

And even whether a link between two entities exists:

```python
("f1","e2") in ch
# True
("a1","e2") in ch
# False
```

To find the cluster id of an entity, look it up in `ch.elements`:

```python
print(ch.elements["a1"])
# 0
```

To get the members of a cluster, either use

```python
print(ch.members(0))
# {'a1', 'c1'}
```

or simply

```python
print(ch[0])
# {'a1', 'c1'}
```

More functions can be found in the [documentation](https://eche.readthedocs.io).
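
The membership checks shown earlier suggest a `__contains__` that dispatches on the type of its argument. How eche implements this is not shown here. As a rough sketch under assumptions, the hypothetical `TinyClusters` class below (not part of eche) keeps a `clusters` dict plus an `elements` reverse index, and treats a plain value as an entity, a set as a whole cluster, and a 2-tuple as a link. It assumes a set only matches when it equals a complete cluster, which agrees with the example in this README but may not be how eche handles partial clusters.

```python
from typing import Dict, Hashable, Iterable, Set


class TinyClusters:
    """Hypothetical stand-in for illustration only; not eche's ClusterHelper."""

    def __init__(self, clusters: Iterable[Set[Hashable]]) -> None:
        # cluster id -> members; ids are assigned in input order
        self.clusters: Dict[int, Set[Hashable]] = {
            cid: set(members) for cid, members in enumerate(clusters)
        }
        # reverse index: entity -> id of the cluster that holds it
        self.elements: Dict[Hashable, int] = {
            entity: cid
            for cid, members in self.clusters.items()
            for entity in members
        }

    def __getitem__(self, cluster_id: int) -> Set[Hashable]:
        # tc[0] returns the members of cluster 0
        return self.clusters[cluster_id]

    def __contains__(self, key: object) -> bool:
        if isinstance(key, (set, frozenset)):
            # a set asks: is exactly this cluster present? (assumption)
            return any(members == key for members in self.clusters.values())
        if isinstance(key, tuple) and len(key) == 2:
            # a 2-tuple asks: do both entities share a cluster?
            left, right = key
            return (
                left in self.elements
                and right in self.elements
                and self.elements[left] == self.elements[right]
            )
        # anything else is treated as a single entity
        return key in self.elements


tc = TinyClusters([{"a1", "c1"}, {"a2", "b2"}, {"e2", "f1", "c3"}])
print("a1" in tc)           # True
print({"c1", "a1"} in tc)   # True
print(("f1", "e2") in tc)   # True
print(("a1", "e2") in tc)   # False
print(tc.elements["a1"])    # 0
```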
Installation
============
Simply use `pip` for installation:
```
pip install eche
```