https://github.com/odysa/rdf4j-python
Python client for Eclipse RDF4J — interact with RDF4J repositories, execute SPARQL queries, and manage RDF data seamlessly in Python.
- Host: GitHub
- URL: https://github.com/odysa/rdf4j-python
- Owner: odysa
- License: bsd-3-clause
- Created: 2025-04-27T15:52:22.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2026-02-28T17:50:12.000Z (4 months ago)
- Last Synced: 2026-02-28T20:09:33.597Z (4 months ago)
- Topics: database, rag, rdf, sementic, sparql
- Language: Python
- Homepage: https://rdf4j-python.readthedocs.io/en/stable/
- Size: 226 KB
- Stars: 4
- Watchers: 1
- Forks: 0
- Open Issues: 2
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# rdf4j-python
[PyPI version](https://badge.fury.io/py/rdf4j-python)
[PyPI project](https://pypi.org/project/rdf4j-python/)
[CI](https://github.com/odysa/rdf4j-python/actions/workflows/ci.yaml)
[License: BSD 3-Clause](https://opensource.org/licenses/BSD-3-Clause)
[Documentation](https://github.com/odysa/rdf4j-python/tree/main/docs)
**A modern Python client for the Eclipse RDF4J framework, enabling seamless RDF data management and SPARQL operations from Python applications.**
rdf4j-python bridges the gap between Python and the robust [Eclipse RDF4J](https://rdf4j.org/) ecosystem, providing a clean, async-first API for managing RDF repositories, executing SPARQL queries, and handling semantic data with ease.
## Features
- **Async-First Design**: Native support for async/await with synchronous fallback
- **Repository Management**: Create, access, and manage RDF4J repositories programmatically
- **SPARQL Support**: Execute SELECT, ASK, CONSTRUCT, and UPDATE queries effortlessly
- **SPARQL Query Builder**: Fluent, programmatic query construction with method chaining
- **Transaction Support**: Atomic operations with commit/rollback and isolation levels
- **Flexible Data Handling**: Add, retrieve, and manipulate RDF triples and quads
- **File Upload**: Upload RDF files (Turtle, N-Triples, N-Quads, RDF/XML, JSON-LD, TriG, N3) directly to repositories
- **Multiple Formats**: Support for various RDF serialization formats
- **Repository Types**: Memory stores, native stores, HTTP repositories, and more
- **Named Graph Support**: Work with multiple graphs within repositories
- **Inferencing**: Built-in support for RDFS and custom inferencing rules
## Installation
### Prerequisites
- Python 3.11 or higher
- RDF4J Server (for remote repositories) or embedded usage
### Install from PyPI
```bash
pip install rdf4j-python
```
### Install with Optional Dependencies
```bash
# Include SPARQLWrapper integration
# (quoted so that shells such as zsh do not expand the brackets)
pip install "rdf4j-python[sparqlwrapper]"
```
### Development Installation
```bash
git clone https://github.com/odysa/rdf4j-python.git
cd rdf4j-python
uv sync --group dev
```
## Usage
### Quick Start
```python
import asyncio

from rdf4j_python import AsyncRdf4j
from rdf4j_python.model.repository_config import RepositoryConfig, MemoryStoreConfig, SailRepositoryConfig
from rdf4j_python.model.term import IRI, Literal


async def main():
    # Connect to RDF4J server
    async with AsyncRdf4j("http://localhost:19780/rdf4j-server") as db:
        # Create an in-memory repository
        config = RepositoryConfig(
            repo_id="my-repo",
            title="My Repository",
            impl=SailRepositoryConfig(sail_impl=MemoryStoreConfig(persist=False))
        )
        repo = await db.create_repository(config=config)

        # Add some data
        await repo.add_statement(
            IRI("http://example.com/person/alice"),
            IRI("http://xmlns.com/foaf/0.1/name"),
            Literal("Alice")
        )

        # Query the data
        results = await repo.query("SELECT * WHERE { ?s ?p ?o }")
        for result in results:
            print(f"Subject: {result['s']}, Predicate: {result['p']}, Object: {result['o']}")


if __name__ == "__main__":
    asyncio.run(main())
```
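For orientation, RDF4J Server can return SELECT results in the W3C SPARQL 1.1 Query Results JSON format. The sketch below uses only the standard library to flatten such a document into one dict per row, similar in shape to the `result['s']` access above. The sample payload and the `rows_from_sparql_json` helper are illustrative assumptions, not part of rdf4j-python, and nothing here implies the client uses this format internally.

```python
import json
from typing import Optional

# Hand-written sample in the SPARQL 1.1 Query Results JSON format (not real server output)
SAMPLE = """
{
  "head": {"vars": ["s", "p", "o"]},
  "results": {"bindings": [
    {
      "s": {"type": "uri", "value": "http://example.com/person/alice"},
      "p": {"type": "uri", "value": "http://xmlns.com/foaf/0.1/name"},
      "o": {"type": "literal", "value": "Alice"}
    }
  ]}
}
"""


def rows_from_sparql_json(payload: str) -> list[dict[str, Optional[str]]]:
    """Flatten a SPARQL JSON results document into one dict per solution.

    Hypothetical helper: unbound variables map to None, and term type,
    datatype, and language information is dropped for brevity.
    """
    doc = json.loads(payload)
    variables = doc["head"]["vars"]
    rows = []
    for binding in doc["results"]["bindings"]:
        rows.append({var: binding.get(var, {}).get("value") for var in variables})
    return rows


rows = rows_from_sparql_json(SAMPLE)
for row in rows:
    print(f"Subject: {row['s']}, Predicate: {row['p']}, Object: {row['o']}")
```

Dropping the term type keeps the sketch short; a real client would normally keep it so that IRIs and literals stay distinguishable.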
### SPARQL Query Builder
Build queries programmatically with method chaining instead of writing raw SPARQL strings:
```python
from rdf4j_python import select, ask, construct, describe, GraphPattern, Namespace

ex = Namespace("ex", "http://example.org/")
foaf = Namespace("foaf", "http://xmlns.com/foaf/0.1/")

# SELECT with typed terms; IRIs serialize automatically
query = (
    select("?person", "?name")
    .where("?person", "a", ex.Person)  # "a" is SPARQL shorthand for rdf:type
    .where("?person", foaf.name, "?name")
    .optional("?person", foaf.email, "?email")
    .filter("?name != 'Bob'")
    .order_by("?name")
    .limit(10)
    .build()
)

# Or use string-based prefixed names
query = (
    select("?name")
    .prefix("foaf", "http://xmlns.com/foaf/0.1/")
    .where("?person", "a", "foaf:Person")
    .where("?person", "foaf:name", "?name")
    .build()
)

# GROUP BY with aggregation
query = (
    select("?city", "(COUNT(?person) AS ?count)")
    .where("?person", ex.city, "?city")
    .group_by("?city")
    .having("COUNT(?person) > 1")
    .order_by("DESC(?count)")
    .build()
)

# ASK, CONSTRUCT, and DESCRIBE
ask_query = ask().where("?s", ex.name, "?name").build()

construct_query = (
    construct(("?s", ex.fullName, "?name"))
    .where("?s", ex.firstName, "?fname")
    .where("?s", ex.lastName, "?lname")  # bind ?lname so CONCAT below has both parts
    .bind("CONCAT(?fname, ' ', ?lname)", "?name")
    .build()
)

describe_query = describe(ex.alice).build()
```
The query builder supports FILTER, OPTIONAL, UNION, BIND, VALUES, sub-queries, DISTINCT, ORDER BY, GROUP BY, HAVING, LIMIT, and OFFSET. Both raw strings and typed objects (`IRI`, `Variable`, `Literal`, `Namespace`) work as terms.
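Method chaining of this kind usually works by having every builder method return the builder itself. The following is a toy sketch of that pattern using only the standard library; `TinySelect` and `tiny_select` are hypothetical names and this is not how rdf4j-python's builder is implemented.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TinySelect:
    """Toy fluent SELECT builder (illustration only, not rdf4j-python's classes)."""

    variables: tuple
    patterns: list = field(default_factory=list)
    max_rows: Optional[int] = None

    def where(self, s: str, p: str, o: str) -> "TinySelect":
        # Store the triple pattern as already-serialized SPARQL text
        self.patterns.append(f"{s} {p} {o} .")
        return self  # returning self is what enables chaining

    def limit(self, n: int) -> "TinySelect":
        self.max_rows = n
        return self

    def build(self) -> str:
        body = " ".join(self.patterns)
        query = f"SELECT {' '.join(self.variables)} WHERE {{ {body} }}"
        if self.max_rows is not None:
            query += f" LIMIT {self.max_rows}"
        return query


def tiny_select(*variables: str) -> TinySelect:
    return TinySelect(variables)


query = (
    tiny_select("?person", "?name")
    .where("?person", "a", "<http://example.org/Person>")
    .where("?person", "<http://xmlns.com/foaf/0.1/name>", "?name")
    .limit(10)
    .build()
)
print(query)
```

Deferring serialization to `build()` lets clauses be added in any order, since the final string is assembled only once.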
### Working with Multiple Graphs
```python
from rdf4j_python import AsyncRdf4j
from rdf4j_python.model.term import IRI, Literal, Quad


async def multi_graph_example():
    async with AsyncRdf4j("http://localhost:19780/rdf4j-server") as db:
        repo = await db.get_repository("my-repo")

        # Add data to specific graphs
        statements = [
            Quad(
                IRI("http://example.com/person/bob"),
                IRI("http://xmlns.com/foaf/0.1/name"),
                Literal("Bob"),
                IRI("http://example.com/graph/people")
            ),
            Quad(
                IRI("http://example.com/person/bob"),
                IRI("http://xmlns.com/foaf/0.1/age"),
                Literal("30", datatype=IRI("http://www.w3.org/2001/XMLSchema#integer")),
                IRI("http://example.com/graph/demographics")
            )
        ]
        await repo.add_statements(statements)

        # Query a specific graph (here: the "people" graph populated above)
        graph_query = """
        SELECT * WHERE {
            GRAPH <http://example.com/graph/people> {
                ?person ?property ?value
            }
        }
        """
        results = await repo.query(graph_query)
```
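In SPARQL, a `GRAPH` clause names the graph to match against by placing its IRI (or a variable) before the braces. As a small sketch, the helper below builds such a query for a given graph IRI and rejects characters that SPARQL does not allow inside an IRI reference. `select_from_graph` is a hypothetical helper written for illustration, not an rdf4j-python function.

```python
def select_from_graph(graph_iri: str) -> str:
    """Return a SELECT query restricted to one named graph (hypothetical helper)."""
    # Characters excluded by the SPARQL IRIREF grammar; rejecting them keeps
    # the IRI from breaking out of the angle brackets.
    forbidden = set('<>"{}|^`\\')
    if any(ch in forbidden or ord(ch) <= 0x20 for ch in graph_iri):
        raise ValueError(f"not a valid IRI: {graph_iri!r}")
    return (
        "SELECT ?person ?property ?value WHERE {\n"
        f"  GRAPH <{graph_iri}> {{\n"
        "    ?person ?property ?value\n"
        "  }\n"
        "}"
    )


people_query = select_from_graph("http://example.com/graph/people")
print(people_query)
```

The resulting string can be passed to `repo.query(...)` in the same way as the hand-written query above.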
### Advanced Repository Configuration
The example below creates a memory store with persistence enabled, bulk-loads statements into a named graph, and queries them with the query builder:
```python
from rdf4j_python import AsyncRdf4j, Namespace, select
from rdf4j_python.model.repository_config import RepositoryConfig, MemoryStoreConfig, SailRepositoryConfig
from rdf4j_python.model.term import IRI, Literal, Quad


async def advanced_example():
    async with AsyncRdf4j("http://localhost:19780/rdf4j-server") as db:
        # Memory store with persistence
        persistent_config = RepositoryConfig(
            repo_id="persistent-repo",
            title="Persistent Memory Store",
            impl=SailRepositoryConfig(sail_impl=MemoryStoreConfig(persist=True))
        )

        # Create and populate repository
        repo = await db.create_repository(config=persistent_config)

        # Bulk data operations
        data = [
            (IRI("http://example.com/alice"), IRI("http://xmlns.com/foaf/0.1/name"), Literal("Alice")),
            (IRI("http://example.com/alice"), IRI("http://xmlns.com/foaf/0.1/email"), Literal("alice@example.com")),
            (IRI("http://example.com/bob"), IRI("http://xmlns.com/foaf/0.1/name"), Literal("Bob")),
        ]
        statements = [
            Quad(subj, pred, obj, IRI("http://example.com/default"))
            for subj, pred, obj in data
        ]
        await repo.add_statements(statements)

        # Query with the fluent query builder
        foaf = Namespace("foaf", "http://xmlns.com/foaf/0.1/")
        query = (
            select("?name", "?email")
            .where("?person", foaf.name, "?name")
            .optional("?person", foaf.email, "?email")
            .order_by("?name")
            .build()
        )
        results = await repo.query(query)
```
### Uploading RDF Files
```python
import pyoxigraph as og

from rdf4j_python import AsyncRdf4j
from rdf4j_python.model.term import IRI


async def upload_example():
    async with AsyncRdf4j("http://localhost:19780/rdf4j-server") as db:
        repo = await db.get_repository("my-repo")

        # Upload a Turtle file (format auto-detected from extension)
        await repo.upload_file("data.ttl")

        # Upload to a specific named graph
        await repo.upload_file("data.ttl", context=IRI("http://example.com/graph"))

        # Upload with explicit format
        await repo.upload_file("data.txt", rdf_format=og.RdfFormat.N_TRIPLES)

        # Upload with base URI for relative URIs
        await repo.upload_file("data.ttl", base_uri="http://example.com/")
```
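Detecting a format from the file extension generally comes down to a lookup table. The sketch below maps the conventional extensions of the formats listed under Features to their registered media types. The table and the `guess_rdf_media_type` helper are assumptions made for illustration; rdf4j-python's own detection logic may differ.

```python
from pathlib import Path

# Conventional file extensions and registered media types for common RDF formats
RDF_MEDIA_TYPES = {
    ".ttl": "text/turtle",
    ".nt": "application/n-triples",
    ".nq": "application/n-quads",
    ".rdf": "application/rdf+xml",
    ".jsonld": "application/ld+json",
    ".trig": "application/trig",
    ".n3": "text/n3",
}


def guess_rdf_media_type(path: str) -> str:
    """Infer the media type from the extension (hypothetical helper)."""
    suffix = Path(path).suffix.lower()  # case-insensitive match on the last suffix
    try:
        return RDF_MEDIA_TYPES[suffix]
    except KeyError:
        raise ValueError(
            f"cannot infer RDF format from {path!r}; pass the format explicitly"
        ) from None


print(guess_rdf_media_type("data.ttl"))
```

A file such as `data.txt` has no recognizable RDF extension, which is why the upload example above passes `rdf_format` explicitly for it.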
### Using Transactions
```python
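from rdf4j_python import AsyncRdf4j, IsolationLevel
from rdf4j_python.model.term import IRI, Literal, Quad


async def transaction_example():
    async with AsyncRdf4j("http://localhost:19780/rdf4j-server") as db:
        repo = await db.get_repository("my-repo")

        # Illustrative statement to remove; use a quad that exists in your repository
        old_quad = Quad(IRI("http://example.com/alice"), IRI("http://xmlns.com/foaf/0.1/name"), Literal("Alicia"))

        # Atomic operations with auto-commit/rollback
        async with repo.transaction() as txn:
            await txn.add_statements([
                Quad(IRI("http://example.com/alice"), IRI("http://xmlns.com/foaf/0.1/name"), Literal("Alice")),
                Quad(IRI("http://example.com/bob"), IRI("http://xmlns.com/foaf/0.1/name"), Literal("Bob")),
            ])
            await txn.delete_statements([old_quad])
            # Commits automatically on success, rolls back on exception

        # With specific isolation level
        # (<http://example.com/status> is an illustrative placeholder predicate)
        async with repo.transaction(IsolationLevel.SERIALIZABLE) as txn:
            await txn.update("""
                DELETE { ?s <http://example.com/status> "draft" }
                INSERT { ?s <http://example.com/status> "published" }
                WHERE  { ?s <http://example.com/status> "draft" }
            """)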
```
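The commit-on-success, rollback-on-exception behavior described in the comments above is the usual contract of an async context manager. The sketch below reproduces it with an in-memory stand-in so it runs without a server. `ToyStore` is a hypothetical class for illustration and says nothing about how rdf4j-python implements transactions internally.

```python
import asyncio
from contextlib import asynccontextmanager


class ToyStore:
    """In-memory stand-in for a repository (illustration only)."""

    def __init__(self):
        self.statements = set()

    @asynccontextmanager
    async def transaction(self):
        staged = set(self.statements)  # work on a private copy
        try:
            yield staged
        except Exception:
            # Roll back: the staged copy is simply discarded, then re-raise
            raise
        else:
            # Commit: publish the staged changes in one step
            self.statements = staged


async def demo():
    store = ToyStore()

    # Succeeds, so the change is committed
    async with store.transaction() as txn:
        txn.add(("alice", "name", "Alice"))

    # Fails midway, so nothing from this block is kept
    try:
        async with store.transaction() as txn:
            txn.add(("bob", "name", "Bob"))
            raise RuntimeError("simulated failure")
    except RuntimeError:
        pass

    return store.statements


final_statements = asyncio.run(demo())
print(final_statements)
```

Only the statement from the first block survives, because the second block raised before reaching its commit step.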
For more detailed examples, see the [examples](examples/) directory.
## Development
### Setting up Development Environment
1. **Clone the repository**:
```bash
git clone https://github.com/odysa/rdf4j-python.git
cd rdf4j-python
```
2. **Install development dependencies**:
```bash
uv sync --group dev
```
3. **Start RDF4J Server** (for integration tests):
```bash
# Using Docker
docker run -p 19780:8080 eclipse/rdf4j:latest
```
4. **Run tests**:
```bash
pytest tests/
```
5. **Run linting**:
```bash
ruff check .
ruff format .
```
### Project Structure
```
rdf4j_python/
├── _driver/ # Core async driver implementation
├── model/ # Data models and configurations
├── query/ # SPARQL query builder
├── exception/ # Custom exceptions
└── utils/ # Utility functions
examples/ # Usage examples
tests/ # Test suite
docs/ # Documentation
```
## Contributing
We welcome contributions! Here's how to get involved:
1. Fork the repository on GitHub
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Make your changes and add tests
4. Run the test suite to ensure everything works
5. Commit your changes (`git commit -m 'Add amazing feature'`)
6. Push to your branch (`git push origin feature/amazing-feature`)
7. Open a Pull Request
### Running Examples
```bash
# Make sure RDF4J server is running on localhost:19780
python examples/complete_workflow.py
python examples/query.py
```
## License
This project is licensed under the BSD 3-Clause License. See the [LICENSE](LICENSE) file for details.
Copyright (c) 2025, Chengxu Bian
## Support
- **Issues & Bug Reports**: [GitHub Issues](https://github.com/odysa/rdf4j-python/issues)
- **Documentation**: [docs/](https://github.com/odysa/rdf4j-python/tree/main/docs)
- **Questions**: Feel free to open a discussion or issue
If you find this project useful, please consider starring the repository!