https://github.com/dobraczka/embarrassment
🐼🐼🐼 Convenience functions to work with pandas triple dataframes
knowledge-graph pandas rdf triples
- Host: GitHub
- URL: https://github.com/dobraczka/embarrassment
- Owner: dobraczka
- Created: 2024-02-09T09:26:40.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2024-02-12T15:56:51.000Z (about 1 year ago)
- Last Synced: 2025-02-01T23:35:11.006Z (3 months ago)
- Topics: knowledge-graph, pandas, rdf, triples
- Language: Python
- Homepage: https://embarrassment.readthedocs.io/
- Size: 267 KB
- Stars: 1
- Watchers: 3
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
README
embarrassment
Convenience functions for pandas dataframes containing triples. Fun fact: a group of pandas (e.g. three) is commonly referred to as an [embarrassment](https://www.zmescience.com/feature-post/what-is-a-group-of-pandas-called-its-surprisingly-complicated/).
This library's main focus is to make commonly used functions easily available when exploring [triples](https://en.wikipedia.org/wiki/Semantic_triple) stored in pandas dataframes. It is not meant to be an efficient graph analysis library.
Usage
=====
You can use a variety of convenience functions. Let's create some simple example triples:
```python
>>> import pandas as pd
>>> rel = pd.DataFrame([("e1","rel1","e2"), ("e3", "rel2", "e1")], columns=["head","relation","tail"])
>>> attr = pd.DataFrame([("e1","attr1","lorem ipsum"), ("e2","attr2","dolor")], columns=["head","relation","tail"])
```
Search in attribute triples:
```python
>>> from embarrassment import search
>>> search(attr, "lorem ipsum")
head relation tail
0 e1 attr1 lorem ipsum
>>> search(attr, "lorem", method="substring")
head relation tail
0 e1 attr1 lorem ipsum
```
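For orientation, the two lookups above behave like the plain pandas filters below on the toy `attr` frame. This is only a sketch of the observable behavior, assuming the query is matched against the `tail` column. It is not taken from the library's implementation, and the variable names `exact` and `substring` are illustrative.

```python
import pandas as pd

attr = pd.DataFrame(
    [("e1", "attr1", "lorem ipsum"), ("e2", "attr2", "dolor")],
    columns=["head", "relation", "tail"],
)

# Exact match: keep rows whose tail equals the query string.
exact = attr[attr["tail"] == "lorem ipsum"]

# Substring match: keep rows whose tail contains the query.
# regex=False treats the query as a literal string, not a pattern.
substring = attr[attr["tail"].str.contains("lorem", regex=False)]

print(exact)
print(substring)
```

Both filters keep the original row labels, which is why the results above still show index `0`.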
Select triples with a specific relation:
```python
>>> from embarrassment import select_rel
>>> select_rel(rel, "rel1")
head relation tail
0 e1 rel1 e2
```
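In plain pandas terms, this selection looks like a boolean mask on the `relation` column. The snippet below is a sketch under that assumption, not the library's code, and `rel1_triples` is an illustrative name.

```python
import pandas as pd

rel = pd.DataFrame(
    [("e1", "rel1", "e2"), ("e3", "rel2", "e1")],
    columns=["head", "relation", "tail"],
)

# Keep only the rows whose relation column equals the wanted relation.
# Boolean indexing preserves the original index labels.
rel1_triples = rel[rel["relation"] == "rel1"]

print(rel1_triples)
```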
Perform operations on the immediate neighbor(s) of an entity, e.g. get the attribute triples:
```python
>>> from embarrassment import neighbor_attr_triples
>>> neighbor_attr_triples(rel, attr, "e1")
head relation tail
1 e2 attr2 dolor
```
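The result above can be read as a two-step lookup: collect the entities directly linked to `e1`, then keep the attribute triples whose head is one of them. The sketch below spells this out in plain pandas. The helper `neighbor_ids` is hypothetical and only mirrors the behavior shown in the example; the library may implement it differently.

```python
import pandas as pd

rel = pd.DataFrame(
    [("e1", "rel1", "e2"), ("e3", "rel2", "e1")],
    columns=["head", "relation", "tail"],
)
attr = pd.DataFrame(
    [("e1", "attr1", "lorem ipsum"), ("e2", "attr2", "dolor")],
    columns=["head", "relation", "tail"],
)


def neighbor_ids(rel_df, entity_id):
    """Hypothetical helper: ids linked to entity_id in either direction."""
    outgoing = rel_df.loc[rel_df["head"] == entity_id, "tail"]
    incoming = rel_df.loc[rel_df["tail"] == entity_id, "head"]
    return set(outgoing) | set(incoming)


# Step 1: direct neighbors of e1 (here e2 via rel1 and e3 via rel2).
neighbors = neighbor_ids(rel, "e1")

# Step 2: attribute triples whose head is one of those neighbors.
neighbor_attrs = attr[attr["head"].isin(sorted(neighbors))]

print(neighbor_attrs)
```

In the toy data `e3` has no attribute triples, so only the `e2` row remains.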
Or just get the triples:
```python
>>> from embarrassment import neighbor_rel_triples
>>> neighbor_rel_triples(rel, "e1")
head relation tail
1 e3 rel2 e1
0 e1 rel1 e2
```
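The neighbor lookup can be pictured as a boolean mask over the `head` and `tail` columns. The sketch below uses a hypothetical helper `neighbor_triples` with an illustrative `direction` argument. It is an assumption about the behavior, not the library's implementation, and the row order may differ from the library's output.

```python
import pandas as pd

rel = pd.DataFrame(
    [("e1", "rel1", "e2"), ("e3", "rel2", "e1")],
    columns=["head", "relation", "tail"],
)


def neighbor_triples(rel_df, entity_id, direction="both"):
    """Hypothetical helper: triples that touch entity_id.

    direction: "in" (entity is the tail), "out" (entity is the head),
    or "both".
    """
    incoming = rel_df["tail"] == entity_id
    outgoing = rel_df["head"] == entity_id
    if direction == "in":
        mask = incoming
    elif direction == "out":
        mask = outgoing
    elif direction == "both":
        mask = incoming | outgoing
    else:
        raise ValueError(f"unknown direction: {direction!r}")
    return rel_df[mask]


print(neighbor_triples(rel, "e1"))
print(neighbor_triples(rel, "e1", direction="in"))
```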
By default you get in- and out-links, but you can specify a direction:
```python
>>> neighbor_rel_triples(rel, "e1", in_out_both="in")
head relation tail
1 e3 rel2 e1
>>> neighbor_rel_triples(rel, "e1", in_out_both="out")
head relation tail
0 e1 rel1 e2
```
Using pandas' [pipe](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.pipe.html) method, you can chain operations.
Let's look at a more elaborate example that loads a dataset from [sylloge](https://github.com/dobraczka/sylloge):
```python
>>> from sylloge import MovieGraphBenchmark
>>> from embarrassment import clean, neighbor_attr_triples, search, select_rel
>>> ds = MovieGraphBenchmark()
>>> # clean attribute triples
>>> cleaned_attr = clean(ds.attr_triples_left)
>>> # find uri of James Tolkan
>>> jt = search(cleaned_attr, query="James Tolkan")["head"].iloc[0]
>>> # get neighbor triples
>>> # and select triples with title and show values
>>> title_rel = "https://www.scads.de/movieBenchmark/ontology/title"
>>> ds.rel_triples_left.pipe(
...     neighbor_attr_triples, attr_df=cleaned_attr, wanted_eid=jt
... ).pipe(select_rel, rel=title_rel)["tail"]
12234 A Nero Wolfe Mystery
12282 Door to Death
12440 Die Like a Dog
12461 The Next Witness
Name: tail, dtype: object
```
Installation
============
You can install `embarrassment` via pip:
```
pip install embarrassment
```