https://github.com/dobraczka/eche

πŸ•ΈοΈ Little helper for handling entity clusters

clustering connected-components deduplication entity-resolution record-linkage transitive-closure


Usage
=====
Eche provides a `ClusterHelper` class to conveniently handle entity clusters.

```python
from eche import ClusterHelper
ch = ClusterHelper([{"a1", "b1"}, {"a2", "b2"}])
print(ch.clusters)
# {0: {'a1', 'b1'}, 1: {'a2', 'b2'}}
```
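Clusters like the ones above usually come from pairwise match links whose transitive closure groups entities together. The following is a minimal sketch of that idea in plain Python using union-find. It does not use or reproduce eche's API or internals; `clusters_from_links` is a hypothetical helper introduced only for illustration.

```python
def clusters_from_links(links):
    """Group entities connected by pairwise links (hypothetical helper, not part of eche)."""
    parent = {}

    def find(x):
        # Register unseen entities as their own root.
        parent.setdefault(x, x)
        # Walk up to the root, shortening the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Union the two sides of every link.
    for left, right in links:
        root_left, root_right = find(left), find(right)
        if root_left != root_right:
            parent[root_right] = root_left

    # Collect entities by root, then number clusters in order of first appearance.
    groups = {}
    for entity in parent:
        groups.setdefault(find(entity), set()).add(entity)
    return dict(enumerate(groups.values()))


links = [("a1", "b1"), ("b1", "c1"), ("a2", "b2")]
clusters = clusters_from_links(links)
print(clusters)
```

A dictionary of this shape (integer ids mapped to sets of entities) mirrors what `ch.clusters` prints in the example above.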

Add an element to a cluster:

```python
ch.add_to_cluster(0, "c1")
print(ch.clusters)
# {0: {'a1', 'b1', 'c1'}, 1: {'a2', 'b2'}}
```

Add a new cluster:

```python
ch.add({"e2", "f1", "c3"})
print(ch.clusters)
# {0: {'a1', 'b1', 'c1'}, 1: {'a2', 'b2'}, 2: {'f1', 'e2', 'c3'}}
```

Remove an element from a cluster:

```python
ch.remove("b1")
print(ch.clusters)
# {0: {'a1', 'c1'}, 1: {'a2', 'b2'}, 2: {'f1', 'e2', 'c3'}}
```

The `__contains__` method is overloaded to accept several kinds of input. You can check whether an entity is in the `ClusterHelper`:

```python
"a1" in ch
# True
```

You can also check whether a cluster is present:

```python
{"c1","a1"} in ch
# True
```

And even whether a link between two entities exists:

```python
("f1","e2") in ch
# True
("a1","e2") in ch
# False
```
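The three kinds of membership check shown above can be modelled in plain Python. This is only a sketch of the observable behaviour, not eche's implementation: the `contains` function is hypothetical, and it assumes that a set must equal a whole cluster and that a link is a 2-tuple of entities in the same cluster.

```python
def contains(clusters, item):
    """Hypothetical stand-in for the overloaded membership test (not eche code)."""
    if isinstance(item, tuple):
        # A link: both entities must sit in the same cluster.
        left, right = item
        return any(left in members and right in members for members in clusters.values())
    if isinstance(item, (set, frozenset)):
        # A cluster: assumed to require an exact match with an existing cluster.
        return any(item == members for members in clusters.values())
    # A single entity: present in any cluster.
    return any(item in members for members in clusters.values())


# Cluster state after the removal of "b1" shown earlier.
clusters = {0: {"a1", "c1"}, 1: {"a2", "b2"}, 2: {"f1", "e2", "c3"}}

entity_found = contains(clusters, "a1")
cluster_found = contains(clusters, {"c1", "a1"})
link_found = contains(clusters, ("f1", "e2"))
cross_link_found = contains(clusters, ("a1", "e2"))
print(entity_found, cluster_found, link_found, cross_link_found)
```

Dispatching on the argument type keeps a single `in` operator usable for entities, clusters, and links.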

To find the cluster id of an entity, look it up in `elements`:

```python
print(ch.elements["a1"])
# 0
```
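Conceptually, this lookup behaves like an inverse index over the cluster dictionary. A minimal plain-Python sketch of such an index follows; it assumes each entity belongs to exactly one cluster and says nothing about how eche stores its data internally.

```python
# Cluster state as printed after the removal step above.
clusters = {0: {"a1", "c1"}, 1: {"a2", "b2"}, 2: {"f1", "e2", "c3"}}

# Invert the mapping: entity -> id of the cluster that contains it.
elements = {
    entity: cluster_id
    for cluster_id, members in clusters.items()
    for entity in members
}

print(elements["a1"])
```

With both mappings available, going from an entity to its cluster and from a cluster id to its members are each a single dictionary lookup.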

To get the members of a cluster, either use

```python
print(ch.members(0))
# {'a1', 'c1'}
```

or simply

```python
print(ch[0])
# {'a1', 'c1'}
```

More functions can be found in the [Documentation](https://eche.readthedocs.io).

Installation
============
Simply use `pip` for installation:
```
pip install eche
```