{"id":31837818,"url":"https://github.com/scdh/xtriples-micro","last_synced_at":"2026-05-11T09:54:50.012Z","repository":{"id":315428611,"uuid":"1059432242","full_name":"SCDH/xtriples-micro","owner":"SCDH","description":"XTriples implementation in XSLT for local usage or deployment on a micro service","archived":false,"fork":false,"pushed_at":"2025-09-18T15:09:01.000Z","size":2750,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-09-18T15:55:10.538Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"XSLT","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/SCDH.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGES.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-09-18T12:43:01.000Z","updated_at":"2025-09-18T15:07:22.000Z","dependencies_parsed_at":"2025-09-18T16:05:44.391Z","dependency_job_id":null,"html_url":"https://github.com/SCDH/xtriples-micro","commit_stats":null,"previous_names":["scdh/xtriples-micro"],"tags_count":12,"template":false,"template_full_name":null,"purl":"pkg:github/SCDH/xtriples-micro","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SCDH%2Fxtriples-micro","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SCDH%2Fxtriples-micro/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SCDH%2Fxtriples-micro/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositori
es/SCDH%2Fxtriples-micro/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/SCDH","download_url":"https://codeload.github.com/SCDH/xtriples-micro/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SCDH%2Fxtriples-micro/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":279009990,"owners_count":26084674,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-12T02:00:06.719Z","response_time":53,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-10-12T02:59:57.186Z","updated_at":"2026-05-11T09:54:49.993Z","avatar_url":"https://github.com/SCDH.png","language":"XSLT","funding_links":[],"categories":[],"sub_categories":[],"readme":"# `xtriples-micro` – An XTriples Processor for Micro Services and Local Usage\n\n[![Tests](https://github.com/SCDH/xtriples-micro/actions/workflows/test.yaml/badge.svg)](https://github.com/SCDH/xtriples-micro/actions/workflows/test.yaml)\n[![Create release](https://github.com/SCDH/xtriples-micro/actions/workflows/deploy.yaml/badge.svg)](https://github.com/SCDH/xtriples-micro/actions/workflows/deploy.yaml)\n\n\n`xtriples-micro` is an implementation of an\n[XTriples](https://xtriples.lod.academy/) processor that works without\nan eXist database.\n\nXTriples? 
In XTriples, instead of writing specialized programs in\nXSLT, XQuery, Python, etc. for extracting RDF triples from XML\ndocuments, we write configuration files containing selectors. These\nconfig files are evaluated by an XTriples processor, which returns RDF\ntriples. Here's an example of such a configuration file:\n\n```\n\u003c?xml-model uri=\"https://xtriples.lod.academy/xtriples.rng\" type=\"application/xml\" schematypens=\"http://relaxng.org/ns/structure/1.0\"?\u003e\n\u003cxtriples\u003e\n    \u003cconfiguration\u003e\n        \u003cvocabularies\u003e\n            \u003cvocabulary prefix=\"gods\" uri=\"https://xtriples.lod.academy/examples/gods/\"/\u003e\n            \u003cvocabulary prefix=\"rdf\" uri=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\"/\u003e\n            \u003cvocabulary prefix=\"rdfs\" uri=\"http://www.w3.org/2000/01/rdf-schema#\"/\u003e\n            \u003cvocabulary prefix=\"foaf\" uri=\"http://xmlns.com/foaf/0.1/\"/\u003e\n        \u003c/vocabularies\u003e\n        \u003ctriples\u003e\n            \u003cstatement\u003e\n                \u003csubject prefix=\"gods\"\u003e/@id\u003c/subject\u003e\n                \u003cpredicate prefix=\"rdf\"\u003etype\u003c/predicate\u003e\n                \u003cobject prefix=\"foaf\" type=\"uri\"\u003ePerson\u003c/object\u003e\n            \u003c/statement\u003e\n            \u003cstatement\u003e\n                \u003csubject prefix=\"gods\"\u003e/@id\u003c/subject\u003e\n                \u003cpredicate prefix=\"rdfs\"\u003elabel\u003c/predicate\u003e\n                \u003cobject type=\"literal\" lang=\"en\"\u003e/name/english\u003c/object\u003e\n            \u003c/statement\u003e\n            \u003cstatement\u003e\n                \u003csubject prefix=\"gods\"\u003e/@id\u003c/subject\u003e\n                \u003cpredicate prefix=\"rdfs\"\u003elabel\u003c/predicate\u003e\n                \u003cobject type=\"literal\" lang=\"gr\"\u003e/name/greek\u003c/object\u003e\n            \u003c/statement\u003e\n   
         \u003cstatement\u003e\n                \u003csubject prefix=\"gods\"\u003e/@id\u003c/subject\u003e\n                \u003cpredicate prefix=\"rdfs\"\u003eseeAlso\u003c/predicate\u003e\n                \u003cobject type=\"uri\"\u003e/concat(\"http://en.wikipedia.org/wiki/\", $currentResource/name/english)\u003c/object\u003e\n            \u003c/statement\u003e\n        \u003c/triples\u003e\n    \u003c/configuration\u003e\n    \u003ccollection uri=\"?select=[0-9]+.xml\"\u003e\n\t   \u003cresource uri=\"{//god}\"/\u003e\n    \u003c/collection\u003e\n\u003c/xtriples\u003e\n```\n\nTo get more impressions, have a look at the extraction\n[recipes](recipes).\n\nWhile the original XTriples processor requires an eXist database and\napplies a configuration only on the fixed set of XML files contained\nin it, the implementation at hand runs outside of a database, e.g., on\na local set of documents. It can also be deployed on the famous [SEED\nXML Transformer](https://github.com/scdh/seed-xc). This deployment gives you a lightweight microservice,\nwhere you can send a single XML document and a config file to and get\nRDF triples in return.\n\n## Getting started\n\n### Microservice\n\nTODO\n\n### Oxygen\n\nThis project offers an Oxygen framework, that assists writing XTriples\nconfiguration files and also provides transformation scenarios for\napplying a configuration to a single or a collection of\ndocuments. Installation is as simple as using the following\ninstallation link in the installation dialog found in **Help** -\u003e\n**Install New Addons**:\n\n```\nhttps://scdh.github.io/xtriples-micro/descriptor.xml\n```\n\nSee detailed description in the [Wiki](https://github.com/SCDH/xtriples-micro/wiki/Oxygen-Framework)!\n\n\n### XSLT Package\n\nFor using the XTriples engine in CI/CD pipelines or in downstream\nprojects, installation of a released package is the way to go. 
The\n[Wiki](https://github.com/SCDH/xtriples-micro/wiki/Installation-of-a-Release)\ngives detailed instructions!\n\n### Playing around and Testing\n\nFor playing around with XTriples and validating that it is a suitable\ntechnology, you can also clone this repository. It comes with a fully\nreproducible [tooling](https://github.com/scdh/tooling) environment\nthat installs all tools needed for running and testing in a\nsandbox. You only need a Java development kit (JDK) installed. On\nDebian-based systems, you can install it with `sudo apt install\ndefault-jdk`.\n\nTo set up the tooling environment, clone this repository, `cd` into\nyour working copy and run:\n\n```\n./mvnw package  # Linux\n```\n\nor\n\n```\nmvnw.cmd package   # Windows\n```\n\nThis will download Saxon-HE etc. and generate wrapper files that set\nup the classpath for using them.\n\nAfter running the command above, the wrapper scripts are in\n`target/bin/`. E.g., there are wrappers around\n[Saxon-HE](https://www.saxonica.com/documentation12/index.html#!using-xsl/commandline)\nand [Jena RIOT](https://jena.apache.org/documentation/io/):\n\n```\ntarget/bin/xslt.sh -?\n```\n\n```\ntarget/bin/riot.sh -h\n```\n\n\n## Extracting RDF Triples\n\nThere are XSLT stylesheets that do the work of evaluating an XTriples\nconfiguration file and applying it to XML documents.\n\n### `extract.xsl`\n\n[`xsl/extract.xsl`](xsl/extract.xsl) extracts triples\nfrom an XML document given as source by applying a configuration\npassed in via the stylesheet parameter `config-uri`.\n\n\n```shell\ntarget/bin/xslt.sh -xsl:xsl/extract.xsl -s:test/gods/1.xml config-uri=$(realpath test/gods/configuration.xml)\n```\n\nThe output should look like this:\n\n```ntriples\n\u003chttps://xtriples.lod.academy/examples/gods/1\u003e \u003chttp://www.w3.org/1999/02/22-rdf-syntax-ns#type\u003e \u003chttp://xmlns.com/foaf/0.1/Person\u003e  .\n\u003chttps://xtriples.lod.academy/examples/gods/1\u003e \u003chttp://www.w3.org/2000/01/rdf-schema#label\u003e 
\"Aphrodite\"@en  .\n\u003chttps://xtriples.lod.academy/examples/gods/1\u003e \u003chttp://www.w3.org/2000/01/rdf-schema#label\u003e \"Ἀφροδίτη\"@gr  .\n\u003chttps://xtriples.lod.academy/examples/gods/1\u003e \u003chttp://www.w3.org/2000/01/rdf-schema#seeAlso\u003e \u003chttp://en.wikipedia.org/wiki/Aphrodite\u003e  .\n```\n\nDebug messages are printed to stderr. If they pollute your result, you\ncan append `2\u003e /dev/null` to silence them or use Saxon's `-o:` option\nto send the output to a file.\n\nIf you want another format, pipe the result to Jena RIOT like so:\n\n```\ntarget/bin/xslt.sh -xsl:xsl/extract.xsl -s:test/gods/1.xml config-uri=$(realpath test/gods/configuration.xml) | target/bin/riot.sh --out rdf/xml\n```\n\nHere's the result:\n\n```xml\n\u003crdf:RDF\n    xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\"\n    xmlns:rdfs=\"http://www.w3.org/2000/01/rdf-schema#\"\n    xmlns:j.0=\"http://xmlns.com/foaf/0.1/\" \u003e\n  \u003crdf:Description rdf:about=\"https://xtriples.lod.academy/examples/gods/1\"\u003e\n    \u003crdfs:seeAlso rdf:resource=\"http://en.wikipedia.org/wiki/Aphrodite\"/\u003e\n    \u003crdfs:label xml:lang=\"gr\"\u003eἈφροδίτη\u003c/rdfs:label\u003e\n    \u003crdfs:label xml:lang=\"en\"\u003eAphrodite\u003c/rdfs:label\u003e\n    \u003crdf:type rdf:resource=\"http://xmlns.com/foaf/0.1/Person\"/\u003e\n  \u003c/rdf:Description\u003e\n\u003c/rdf:RDF\u003e\n```\n\n\n\nThis is the only transformation that makes sense to deploy as a\nmicroservice. See [seed](seed.md).\n\n### `extract-collection.xsl`\n\n[`xsl/extract-collection.xsl`](xsl/extract-collection.xsl) takes a\nconfiguration as source document and applies it to the collection of\nXML documents given in `/xtriples/collection/@uri`, which is\ninterpreted as a Saxon collection URI. See section [Implementation of\nthe Specs](#implementation-of-the-specs) for details. 
This is\ncompatible with the reference implementation.\n\nExample:\n\n```\ntarget/bin/xslt.sh -xsl:xsl/extract-collection.xsl -s:test/gods/configuration.xml\n```\n\nThis will extract triples from all the God files in\n[`test/gods`](test/gods) due to the collection URI `\u003ccollection\nuri=\"?select=[0-9]+.xml\"\u003e`. It is a relative URI (current directory\n`.`), and the [`select` query\nstring](https://www.saxonica.com/documentation12/index.html#!sourcedocs/collections/collection-directories)\nis interpreted by the Saxon processor.\n\n\n### `extract-doc-param.xsl`\n\n[`xsl/extract-doc-param.xsl`](xsl/extract-doc-param.xsl) takes a\nconfiguration as source document and applies it to an XML document\nreferenced by the `source-uri` stylesheet parameter.\n\n```shell\ntarget/bin/xslt.sh -xsl:xsl/extract-param-doc.xsl -s:test/gods/configuration.xml source-uri=$(realpath test/gods/1.xml)\n```\n\nThe `is-collection-uri` stylesheet parameter can be used to indicate\nthat the URI is a collection:\n\n```shell\ntarget/bin/xslt.sh -xsl:xsl/extract-param-doc.xsl -s:test/gods/configuration.xml is-collection-uri=true source-uri='/path/to/edition?select=*.tei.xml;recurse=true'\n```\n\nThis works with any [type of collection](#collections).\n\nSo this stylesheet can be used for passing user-defined collections into\nextraction recipes that were written once.\n\n## Extraction Recipes\n\nThe [`recipes`](recipes) folder has XTriples configurations for\nextraction tasks that occur in many projects.\n\n## Writing configurations\n\n1. The content of `\u003csubject\u003e`, `\u003cpredicate\u003e`, `\u003cobject\u003e` and\n   `\u003ccondition\u003e` is evaluated as an **XPath** expression if and only\n   if the content starts with a **slash** `/`. Before the expression\n   is evaluated, `$currentResource` (or `$externalResource`\n   respectively) is prepended to it. E.g., `/@id` is evaluated as\n   `$currentResource/@id`. 
In `\u003ccondition\u003e` the XPath is constructed\n   like this: `xs:boolean($currentResource CONDITION )`.\n1. Keep the difference between **document** and **resource** in mind: Each\n   document may contain multiple resources if\n   `/xtriples/collection/resource/@uri` is used to unnest resources\n   from a document. The variables `$currentResource` and\n   `$resourceIndex` provide access to the resource and its index.\n1. This resource context is transparent to the underlying\n   document. Thus, accessing parts of the document outside of the\n   context subtree is possible:\n   `$currentResource/ancestor::TEI/teiHeader`.\n1. The XPath evaluation uses **namespaces** built from the\n   prefix-to-URI mapping in the `\u003cvocabularies\u003e` section of the\n   configuration file. Thus:\n   - If you want to extract RDF from non-namespace XML sources, do not\n     use the empty string prefix in the vocabularies, since that would\n     bind the default namespace for XPath evaluation to this\n     vocabulary URI.\n   - Be careful about using the default namespace, since it is not\n     compatible with the reference implementation. See\n     [below](#implementation-of-the-specs)!\n1. Using **BNodes** may be a bit tricky. See [these hints](bnodes.md).\n\n\n## Implementation of the Specs\n\nThis is a full implementation of the [XTriples\nspec](https://xtriples.lod.academy/documentation.html).\n\n### Additional Features\n\nIn addition to the specs, this implementation adds the following\nfeatures:\n\n1. In addition to static ISO 639 language identifiers, `object/@lang`\n   can also be an XPath expression that returns such a language\n   identifier. This feature is handy for projects that declare the\n   language in their XML documents.\n1. If you omit `@prefix` for a `\u003cvocabulary\u003e` or set it to the\n   empty string, the default namespace used when evaluating XPath\n   expressions is bound to this vocabulary URI. 
Thus, when setting\n   `\u003cvocabulary uri=\"http://www.tei-c.org/ns/1.0\"/\u003e`, you can write\n   XPaths like this: `\u003cobject\n   type=\"literal\"\u003e//(teiHeader/fileDesc/titleStmt/title)[1]\u003c/object\u003e`\n   without prefixing the element names. See\n   [`test/config-02.xml`](test/config-02.xml) for a self-contained\n   test case. Evaluating it on the [reference\n   implementation](https://xtriples.lod.academy/index.html) fails,\n   while the implementation at hand processes it correctly.\n1. It is possible to use your own functions in the XPath expressions\n   in the `\u003cconfiguration\u003e` section: You can load an additional XSLT\n   stylesheet by using the `libraries` (sequence of xs:anyURI) or\n   `libraries-csv` (a string of comma-separated URIs) stylesheet\n   parameters. Please note that you have to declare your function's\n   visibility non-private and non-hidden, e.g., `@visibility=public`,\n   cf. [XSLT 3.0 TR](https://www.w3.org/TR/xslt-30/#evaluate-static-context).\n   ```shell\n   target/bin/xslt.sh -xsl:xsl/extract-collection.xsl -s:...  libraries-csv=$(realpath my-utils.xsl)\n   ```\n\n\n### Collections\n\nBecause this implementation does not run inside an eXist database, the\nevaluation of the `\u003ccollection\u003e` section of the configuration differs\nfrom the reference implementation. However, you can get full\ncompatibility (see end of this section).\n\nIn contrast to the specs, **`/xtriples/collection/@uri`** is ignored\nwhen a single XML source document is passed to the processor, i.e.,\nwhen using `xsl/extract.xsl` or `xsl/extract-param-doc.xsl`.\n\nWhen using `xsl/extract-collection.xsl`, it is evaluated as a [Saxon\ncollection\nURI](https://www.saxonica.com/documentation12/index.html#!sourcedocs/collections/collection-uris). 
It\ncan thus be a\n\n- [directory\n  URI](https://www.saxonica.com/documentation12/index.html#!sourcedocs/collections/collection-directories)\n  with a select pattern for finding files (relative URIs are resolved\n  against the evaluated configuration file), or\n- [zip-collection](https://www.saxonica.com/documentation12/index.html#!sourcedocs/collections/ZIP-collections)\n  (zip, jar, docx) which will automatically be unpacked and\n  crawled, or a\n- [collection\n  catalog](https://www.saxonica.com/documentation12/index.html#!sourcedocs/collections/collection-catalogs)\n  listing files to crawl or\n- your own collection type provided you have written your own\n  [collection\n  finder](https://www.saxonica.com/documentation12/index.html#!sourcedocs/collections/user-collections).\n\n*Link based resource crawling* and *literal resource crawling* are\nsupported exactly as in the reference implementation. In both modes,\nthere is no `@uri` attribute present for the collection.\n\nYou can get full compatibility by setting the `is-collection-uri`\nstylesheet parameter to `false`. This way, the `@uri` attribute of\neach `\u003ccollection\u003e` is not read as a Saxon collection URI, but as a\nsingle document URI. Using this attribute, *XPath based resource\ncrawling with resources spread over multiple files* is also supported.\n\nYou can evaluate the examples in `test/gods` with\n`is-collection-uri=false` and by using the XML catalog in\n`test/catalog.xml`, which maps lod academy URIs to local files:\n\n```\ntarget/bin/xslt.sh -xsl:xsl/extract-collection.xsl -s:test/gods/conf-NN.xml -catalog:test/catalog.xml is-collection-uri=false\n```\n\n\n## Output: NTriples\n\nThere's only one output format: NTriples. In a microservice\narchitecture, converting to other formats is done in a converter\nservice. 
NTriples is the RDF serialization of choice, because the\nresponse bodies of multiple requests can simply be concatenated into\none graph.\n\n## Development\n\nRun tests with\n\n```\ntarget/bin/test.sh\n```\n\nor\n\n```\nsource target/bin/classpath.sh # only once needed per shell session\nant -Dcatalog=test/catalog.xml test\n```\n\n## License\n\nThis is distributed under the MIT license.\n\nThe test cases directly in `test/gods/` were taken from the\n[original eXist-db\nimplementation](https://github.com/digicademy/xtriples/tree/master),\nwhich is licensed under the terms of the MIT license.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fscdh%2Fxtriples-micro","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fscdh%2Fxtriples-micro","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fscdh%2Fxtriples-micro/lists"}