https://github.com/ch-sander/zotero_rdf_server
This server loads multiple Zotero libraries into an RDF graph, exposes a local SPARQL endpoint, and allows exporting the graph.
- Host: GitHub
- URL: https://github.com/ch-sander/zotero_rdf_server
- Owner: ch-sander
- License: mit
- Created: 2025-04-28T07:00:32.000Z (18 days ago)
- Default Branch: main
- Last Pushed: 2025-05-08T17:56:48.000Z (7 days ago)
- Last Synced: 2025-05-08T21:59:45.555Z (7 days ago)
- Topics: oxigraph, rdf, sparql, zotero, zotero-addon
- Language: Python
- Homepage: https://ch-sander.github.io/zotero_rdf_server/
- Size: 3.28 MB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 2
Metadata Files:
- Readme: README.md
- License: LICENSE
# Zotero RDF Server
This server loads multiple Zotero libraries into an RDF graph,
exposes a local SPARQL endpoint, and allows exporting the graph.
A **visual query builder** for exploring the graph is available at `/explorer`, or on [GitHub Pages](https://ch-sander.github.io/zotero_rdf_server/).

### Why this Tool?
While Zotero offers robust functionality for storing and collaboratively managing cloud-hosted libraries, it lacks support for federated access and cross-library exploration or search.
This **Zotero RDF Server** is an initial attempt to fill that gap. It implements basic entity mapping (e.g., tags, creators), but remains tightly constrained by Zotero’s inherently textual data model and API structure.
A logical next step would be to implement a **knowledge base mapping** layer to enable richer semantic interoperability.

## 📘 How to Create a Zotero Cloud Library
To use this tool, you need at least one Zotero cloud library (either **user** or **group**). Here’s how to set it up:
1. **Create a Zotero Account**
   Sign up at [https://www.zotero.org/user/register](https://www.zotero.org/user/register)
2. **Install Zotero** *(optional but recommended)*
   Download from [https://www.zotero.org/download](https://www.zotero.org/download)
3. **Create a Library**
   - **User Library**: Log in and add items directly to your personal Zotero library.
   - **Group Library**:
     - Go to [https://www.zotero.org/groups](https://www.zotero.org/groups)
     - Click **Create a New Group**
     - Choose visibility and permissions
     - Add items via the Zotero client or web interface
4. **Find your Library ID**
   - Visit your group library online (e.g. `https://www.zotero.org/groups/2536132/your-group-name`)
   - The number in the URL is your `library_id`.
5. **Create an API Key**
   - Go to [https://www.zotero.org/settings/keys](https://www.zotero.org/settings/keys)
   - Click **Create new private key**
   - Select the appropriate access level (e.g., read-only)

👉 More help in the official docs:
- [Zotero Web Library](https://www.zotero.org/support/web_library)
- [Groups](https://www.zotero.org/support/groups)
- [API Guide](https://www.zotero.org/support/dev/web_api/v3/start)

---
## Features
- Load modes: JSON (via Zotero API), RDF (via API), or manual RDF import
- Efficient graph loading using Pyoxigraph wherever possible
- Configurable API query parameters (e.g., `itemType`, `tag`, `collection`)
- Correct Zotero RDF namespace handling
- Export as TriG or N-Quads
- Docker and Compose support
- Includes Oxigraph SPARQL server at port `7878`
- *(FastAPI endpoint for `/sparql` not yet implemented)*

## Configuration
Place both YAML filenames in your `.env`, not in the code or Dockerfile. Only these two environment variables need updating when you rename or move configuration files:
```bash
CONFIG_FILE=custom-config.yaml
ZOTERO_CONFIG_FILE=custom-zotero.yaml
```

Docker Compose will mount these files into `/app`, and the Python code loads them via `os.getenv(...)` with sensible defaults (`config.yaml` and `zotero.yaml`).
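The lookup described above can be sketched roughly as follows. This is a minimal illustration, not the server's actual code: the function name `resolve_config_paths` and the `APP_DIR` constant are hypothetical, and the sketch reads from a mapping (defaulting to `os.environ`) so it can be exercised without touching the real environment.

```python
import os
from pathlib import PurePosixPath

# Hypothetical base directory; the README states the files are mounted into /app.
APP_DIR = PurePosixPath("/app")


def resolve_config_paths(env=os.environ, base_dir=APP_DIR):
    """Return (config_path, zotero_config_path) resolved from the environment.

    Falls back to the defaults named in the README when a variable is unset.
    """
    config_name = env.get("CONFIG_FILE", "config.yaml")
    zotero_name = env.get("ZOTERO_CONFIG_FILE", "zotero.yaml")
    return base_dir / config_name, base_dir / zotero_name


if __name__ == "__main__":
    config_path, zotero_path = resolve_config_paths()
    print(config_path, zotero_path)
```

Renaming a configuration file then only requires changing the corresponding variable in `.env`.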
### `config.yaml`
Defines server and storage settings:
```yaml
server:
  port: 8000                      # HTTP port for Uvicorn
  refresh_interval: 3600          # polling interval in seconds; 0 disables refreshing and only loads the local store
  store_mode: "directory"         # "memory" or "directory"
  store_directory: "./data"       # only for directory mode
  export_directory: "./exports"   # SPARQL result exports
  import_directory: "./import"    # for RDF manual imports
  backup_directory: "./backup"    # for RDF manual backups
  log_level: "info"               # logging level (debug, info, warn, error)
```

### `zotero.yaml`
Contains the Zotero-specific settings:
```yaml
# Global RDF context (used for default vocabulary namespace)
context:
  vocab: "http://www.zotero.org/namespaces/export#"
  api_url: "https://api.zotero.org/"
  base: "https://www.zotero.org/"
  schema: "https://api.zotero.org/schema" # If specified, will generate a basic OWL ontology as a named graph using the IRI from vocab

libraries:
  - name: My Library # Only required for "manual_import" as a subdirectory containing RDF files
    api_key: "xxxx"
    library_type: "groups" # "user" or "groups"
    library_id: "123"
    load_mode: "json" # Options: "json", "rdf", or "manual_import"
    rdf_export_format: "rdf_zotero" # Options: "rdf_zotero", "rdf_bibliontology"; only needed if load_mode = "rdf"
    # base_uri: "https://www.example.com#" Used as the URI for the library's named graph and as the base URI for all named nodes created for Zotero items and collections. Defaults to "{context.base}{libraries.library_type}/{libraries.library_id}" as defined in this YAML
    # uuid_namespace: "https://www.example.com#" Used to generate consistent UUIDs for named nodes across multiple libraries in the union graph. Defaults to base_uri if not specified
    map: # Skip this block if no specifications are needed. Empty lists will be ignored
      # white: [title, date] # Whitelist: only include these fields and those in 'named'
      black: [title, date] # Blacklist: exclude these fields
      rdf_mapping: [creators, tags, collections]
      item_type: ["_Item", "itemType"] # Determines RDF type; leading underscore indicates a constant predicate. If not specified, defaults to "Item". If not starting with "http", the default vocab from context will be used
      collection_type: ["_Collection"] # If not specified, defaults to "Collection"
      named_library: "inLibrary" # If specified, adds an object property with this name linking to the library's named graph URI to support querying across named graphs
      item_additional:
        - property: "http://www.w3.org/2000/01/rdf-schema#label"
          value: "title"
          named_node: false
        - property: "http://www.w3.org/2002/07/owl#sameAs"
          value: "url"
          named_node: true
    api_query_params:
      itemType: "book" # Optional, freely configurable
      # tag: "important" # Optional, freely configurable
      # collection: "XYZ123" # Optional

  - name: "Library 2"
    api_key: "YOUR_OTHER_API_KEY"
    library_type: "user"
    library_id: "ANOTHER_ID"
    load_mode: "json"
    # no rdf_export_format needed
    api_query_params:
      itemType: "article"
```
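To make the role of these settings more concrete, here is a rough sketch of how a library entry might translate into a default `base_uri` and a Zotero Web API request for `load_mode: "json"`. The helper names `default_base_uri` and `build_items_request` are hypothetical and the server's real implementation may differ. The mapping from `library_type: "user"` to the `users` path segment is an assumption based on the Zotero Web API v3 URL scheme, as are the `Zotero-API-Key` and `Zotero-API-Version` headers.

```python
from urllib.parse import urlencode

# Values copied from the example zotero.yaml above.
CONTEXT = {"api_url": "https://api.zotero.org/", "base": "https://www.zotero.org/"}
LIBRARY = {
    "api_key": "xxxx",
    "library_type": "groups",
    "library_id": "123",
    "api_query_params": {"itemType": "book"},
}


def default_base_uri(context, library):
    """Use an explicit base_uri if given, else {base}{library_type}/{library_id}."""
    explicit = library.get("base_uri")
    if explicit:
        return explicit
    return f"{context['base']}{library['library_type']}/{library['library_id']}"


def build_items_request(context, library):
    """Return (url, headers) for listing the items of one library."""
    # Assumption: the Zotero Web API uses "users"/"groups" as path segments.
    segment = "users" if library["library_type"] == "user" else "groups"
    url = f"{context['api_url'].rstrip('/')}/{segment}/{library['library_id']}/items"
    params = library.get("api_query_params") or {}
    if params:
        url = f"{url}?{urlencode(params)}"
    headers = {"Zotero-API-Key": library["api_key"], "Zotero-API-Version": "3"}
    return url, headers


if __name__ == "__main__":
    print(default_base_uri(CONTEXT, LIBRARY))
    print(build_items_request(CONTEXT, LIBRARY)[0])
```

Each key under `api_query_params` simply becomes a query-string parameter in this sketch, which is why filters such as `tag` or `collection` can be added without code changes.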
## Running
### Locally
```bash
pip install -r requirements.txt
python zotero_rdf_server.py
```

### Docker
```bash
docker-compose up --build
```

## API Endpoints
| Endpoint | Description |
|:---------|:-------------|
| `/sparql` | Run SPARQL queries (GET/POST) *(not yet implemented)* |
| `/export?format=trig` | Export full RDF dataset in TriG format |
| `/export?format=nquads` | Export full RDF dataset in N-Quads format |
| `/export?format=ttl&graph=http%3A%2F%2Fwww.zotero.org%2Fnamespaces%2Fexport%23` | Export a named graph in Turtle format (only the content of the given graph) |
| `/backup` | Create a backup in the configured backup directory (**deletes the previous backup!**) |
| `/optimize` | Optimize the current store |

### Export Parameters
- `format`: One of `trig`, `nquads`, `ttl`, `nt`, `n3`, `xml` (default: `trig`)
- `graph` *(optional)*: IRI of the named graph to export. Required for formats that do not support named graphs (e.g., `ttl`, `nt`, etc.) if you don’t want to export the default graph.

### Interactive Documentation
Visit `/docs` for the Swagger UI or `/redoc` for alternative OpenAPI documentation.
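For scripted access outside the interactive docs, an export URL can be assembled by percent-encoding the `graph` IRI. The sketch below is illustrative only: `export_url` is a hypothetical helper, and `http://localhost:8000` assumes the server runs locally on the port from the example `config.yaml`.

```python
from urllib.parse import urlencode

# Assumption: local server on the port shown in the example config.yaml.
SERVER = "http://localhost:8000"

# Formats listed under "Export Parameters".
EXPORT_FORMATS = {"trig", "nquads", "ttl", "nt", "n3", "xml"}


def export_url(fmt="trig", graph=None, server=SERVER):
    """Build an /export URL; `graph` is the IRI of a named graph, if any."""
    if fmt not in EXPORT_FORMATS:
        raise ValueError(f"unsupported export format: {fmt}")
    params = {"format": fmt}
    if graph is not None:
        # urlencode percent-encodes ':', '/' and '#' in the IRI.
        params["graph"] = graph
    return f"{server}/export?{urlencode(params)}"


if __name__ == "__main__":
    print(export_url())
    print(export_url("ttl", "http://www.zotero.org/namespaces/export#"))
```

Encoding the IRI matters because an unescaped `#` would be treated as a URL fragment and never reach the server.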
## Notes
- RDF export from the Zotero API is written to a temporary file and loaded with `bulk_load()`
- Manual import mode reads local `.rdf`, `.trig`, `.ttl`, `.nt`, `.nq` files
- All Zotero entries are typed as `z:Item`
- Query parameters are configurable for fine-grained API filtering

## License
MIT License