https://github.com/arcangelo7/knowledge-graphs-inversion

A tool for RML inversion: converting RDF knowledge graphs back to their original data formats (CSV, SQL) by reversing the RML mapping process

Host: GitHub
URL: https://github.com/arcangelo7/knowledge-graphs-inversion
Owner: arcangelo7
Created: 2025-06-10T15:28:46.000Z (about 1 year ago)
Default Branch: master
Last Pushed: 2025-08-27T15:56:40.000Z (10 months ago)
Last Synced: 2025-08-27T21:16:29.248Z (10 months ago)
Language: Python
Size: 54.5 MB
Stars: 0
Watchers: 0
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

# RML Inversion

A tool for **RML inversion**: converting RDF knowledge graphs back to their original data formats (CSV, SQL) by reversing the RML mapping process.

## Overview

This project implements the inverse process of RML (RDF Mapping Language):
- **Forward RML**: CSV/SQL → RDF using morph-kgc
- **Inverse RML**: RDF → CSV/SQL (this project)

Currently supports:
- **CSV files**
- **SQL databases**

## Requirements

- Python 3.12+
- [uv](https://docs.astral.sh/uv/) package manager

## Quick Start

```bash
# Install uv if needed
curl -LsSf https://astral.sh/uv/install.sh | sh

# Setup project
git clone https://github.com/arcangelo7/knowledge-graphs-inversion.git
cd knowledge-graphs-inversion
uv sync

# Run the main application
uv run python app.py
```

## Managing Dependencies with uv

```bash
# Add new dependency
uv add package-name

# Remove dependency
uv remove package-name

# Update all
uv sync --upgrade

# Run without activating venv
uv run python script.py
```

## Benchmarking

This project integrates the [KROWN benchmark framework](https://github.com/kg-construct/KROWN) to evaluate the performance of the knowledge graph inversion system, with a focus on PostgreSQL.

### Setup Benchmark Environment

1. **Initialize KROWN submodule:**
```bash
git submodule update --init --recursive
```

2. **Install dependencies:**
```bash
uv sync
```

### Running KROWN Benchmark

**Run PostgreSQL benchmark:**
```bash
# Run with in-memory RDF processing (default)
uv run python benchmarks/run_krown_benchmark.py

# Run with Virtuoso triplestore for better performance on large datasets
uv run python benchmarks/run_krown_benchmark.py --use-virtuoso
```

**Prerequisites for Virtuoso benchmarks:**
If using the `--use-virtuoso` option, you must start Virtuoso before running the benchmark:
```bash
# Start Virtuoso container (required for --use-virtuoso option)
uv run python -m virtuoso_utilities.launch_virtuoso --name virtuoso-kgi --http-port 8890 --detach --wait-ready
```

Running the benchmark will:
- Generate test data using KROWN's data generator (PostgreSQL format)
- Create 3 benchmark scenarios: Small (1K), Medium (10K), Large (50K rows)
- Run the inversion system on each scenario using either in-memory RDF processing or Virtuoso triplestore
- Generate performance metrics and results

**SPARQL Backend Options:**
- **In-memory processing** (default): Uses rdflib for SPARQL queries, suitable for small to medium datasets
- **Virtuoso triplestore** (`--use-virtuoso`): Uses OpenLink Virtuoso for SPARQL queries, recommended for large datasets. Requires a Virtuoso instance already running on localhost:8890
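
Because the Virtuoso backend needs an instance that is already running, it can help to probe the endpoint before passing `--use-virtuoso`. The sketch below uses only the Python standard library and sends a trivial `ASK` query over HTTP. It is not part of the benchmark script; the function names are hypothetical, and the `/sparql` path is an assumption based on Virtuoso's default SPARQL endpoint on its HTTP port.

```python
import urllib.error
import urllib.parse
import urllib.request

# Assumption: Virtuoso's default SPARQL endpoint path on the HTTP port used above.
DEFAULT_ENDPOINT = "http://localhost:8890/sparql"


def build_probe_url(endpoint: str = DEFAULT_ENDPOINT) -> str:
    """Return a GET URL that sends a trivial ASK query to the endpoint."""
    params = urllib.parse.urlencode({"query": "ASK { ?s ?p ?o }"})
    return f"{endpoint}?{params}"


def virtuoso_is_ready(endpoint: str = DEFAULT_ENDPOINT, timeout: float = 3.0) -> bool:
    """Return True if the endpoint answers the probe query with HTTP 200."""
    request = urllib.request.Request(
        build_probe_url(endpoint),
        headers={"Accept": "application/sparql-results+json"},
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or HTTP error: treat as not ready.
        return False
```

Calling `virtuoso_is_ready()` before the benchmark gives a quick yes/no answer; when it returns `False`, the in-memory default remains available.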

### Benchmark Results

Results are stored in `benchmarks/krown/results/` with:
- Execution times for each scenario
- Data and mapping file sizes
- Triple Maps and Predicate Object Maps counts
- JSON format for analysis
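
Since the results are written as JSON, they can be loaded for analysis with a few lines of Python. The sketch below makes no assumption about the fields inside each file, because the result schema is not documented here. It does assume that the result files sit directly in the results directory with a `.json` extension, and the function name `load_results` is hypothetical.

```python
import json
from pathlib import Path


def load_results(results_dir: str) -> dict[str, object]:
    """Load every *.json file in results_dir, keyed by file name."""
    results: dict[str, object] = {}
    # Assumption: result files are stored directly in this directory as *.json.
    for path in sorted(Path(results_dir).glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            results[path.name] = json.load(handle)
    return results
```

For example, `load_results("benchmarks/krown/results")` returns a dictionary that can be inspected interactively or passed to a plotting or tabulation tool.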

## License

ISC License

## Author

**arcangelo7** - arcangelo.massari@unibo.it
