#+OPTIONS: ':nil *:t -:t ::t <:t H:5 \n:nil ^:{} arch:headline author:t broken-links:nil
#+OPTIONS: c:nil creator:nil d:(not "LOGBOOK") date:t e:t email:nil f:t inline:t num:nil
#+OPTIONS: p:nil pri:nil prop:nil stat:t tags:t tasks:t tex:t timestamp:nil title:t toc:5
#+OPTIONS: todo:t |:t
#+OPTIONS: html-link-use-abs-url:nil html-postamble:auto html-preamble:t html-scripts:t
#+OPTIONS: html-style:t html5-fancy:nil tex:nil
#+STARTUP: nonum
#+TITLE: RDF by Example: rdfpuml for True RDF Diagrams, rdf2rml for R2RML Generation
#+DATE: <2022-09-20>
#+AUTHOR: Vladimir Alexiev
#+EMAIL: [email protected]
#+LANGUAGE: en
#+CREATOR: Emacs 25.3.1 (Org mode 9.1.13)
#+TODO: TODO INPROGRESS | DONE CANCELED
#+HTML_DOCTYPE: xhtml-strict
#+HTML_CONTAINER: div
#+DESCRIPTION:
#+KEYWORDS: RDF, visualization, PlantUML, R2RML, generation, model-driven, RDF by Example, rdfpuml, rdf2rml, rdf2sparql, rdf2tarql, rdf2ontorefine

* Table of Contents :TOC:noexport:
:PROPERTIES:
:TOC: :include all
:END:

:CONTENTS:
- [[#introduction][Introduction]]
- [[#citation][Citation]]
- [[#documentation][Documentation]]
- [[#related-work][Related Work]]
- [[#installation][Installation]]
- [[#docker-image][Docker Image]]
- [[#debian-repo][Debian Repo]]
- [[#change-log][Change Log]]
- [[#2023-06-07-rdf2sparqlpl-minimize-binds-in-delete-clause][2023-06-07 rdf2sparql.pl: minimize binds in delete clause]]
- [[#2023-06-06-rdf2sparqlpl-global---filter-options][2023-06-06 rdf2sparql.pl: global --filter options]]
- [[#2023-06-01-rdfpumlpl-remove-carpalways][2023-06-01 rdfpuml.pl: remove Carp::Always]]
- [[#2023-05-17-rdf2sparqlpl-conditional-nodes][2023-05-17 rdf2sparql.pl: Conditional Nodes]]
- [[#2023-05-05-rdfpumlpl-dont-mangle-round-brackets][2023-05-05 rdfpuml.pl: don't mangle round brackets]]
- [[#2023-04-29-rdfpumlpl-pumloption][2023-04-29 rdfpuml.pl: puml:option]]
- [[#2023-04-19-rdf2sparqlpl-per-model-filter-dynamic-graph][2023-04-19 rdf2sparql.pl: per-model filter, dynamic graph]]
- [[#2022-08-23-rdf2sparqlpl-add-datatype-to-var-name-instead-of-uppercasing][2022-08-23 rdf2sparql.pl: add datatype to var name instead of UPPERCASING]]
- [[#2022-08-23-rdfpumlpl-handle-blank-node-types-add-shell-scripts][2022-08-23 rdfpuml.pl: handle blank-node types; add shell scripts]]
- [[#2022-08-15-rdf2sparqlpl-merge-to-one-tool][2022-08-15 rdf2sparql.pl: merge to one tool]]
- [[#2022-04-08-rdf2ontorefinepl-generate-ontorefine-update-queries][2022-04-08 rdf2ontorefine.pl: generate OntoRefine Update queries]]
- [[#2021-09-02-rdfpumlpl-unicode-processing][2021-09-02 rdfpuml.pl: Unicode Processing]]
- [[#2020-09-17-rdf2rml-logicaltable][2020-09-17 rdf2rml: logicalTable]]
- [[#2020-06-01-rdf2tarqlpl-generate-tarql-scripts][2020-06-01 rdf2tarql.pl: generate TARQL scripts]]
- [[#2020-06-01-rdf2rml-improve-scripts-sql-querytable-propagation][2020-06-01 rdf2rml: improve scripts, SQL query/table propagation]]
- [[#2020-05-30-rdf2rml-handle-inverse-edge][2020-05-30 rdf2rml: handle inverse edge]]
- [[#2018-11-14-rdfpumlpl-avoid-pumlstereotype-class-node][2018-11-14 rdfpuml.pl: avoid puml:stereotype class node]]
- [[#2018-06-29-rdfpumlpl-bug-class-and-pumlinlineproperty][2018-06-29 rdfpuml.pl bug: class and puml:InlineProperty]]
- [[#2018-04-05-rdfpumlpl-arrow-attributes][2018-04-05 rdfpuml.pl: Arrow Attributes]]
- [[#2018-02-25-rdfpumlpl-arrow-color][2018-02-25 rdfpuml.pl: Arrow Color]]
- [[#2017-08-25-rdfpumlpl-decorative-arrows][2017-08-25 rdfpuml.pl: decorative arrows]]
- [[#2016-02-10-rdfpumlpl-blank-nodes-hidden-links][2016-02-10 rdfpuml.pl: blank nodes, hidden links]]
- [[#to-do-tasks][To Do Tasks]]
- [[#near-term][Near-term]]
- [[#modularize-and-package-better][Modularize and Package Better]]
- [[#regression-tests][Regression Tests]]
- [[#rdf2rml-disentangle-inverse-edge][rdf2rml: disentangle inverse edge]]
- [[#release-on-cpan][Release on CPAN]]
- [[#add-unicode-tests][Add Unicode tests]]
- [[#prefixes][Prefixes]]
- [[#allow-specifying-the-prefixes-file][Allow specifying the prefixes file]]
- [[#eliminate-curiepm][Eliminate Curie.pm]]
- [[#remember-prefixes-from-input-file][Remember prefixes from input file]]
- [[#support-more-rdf-formats][Support more RDF Formats]]
- [[#batch-processing][Batch Processing]]
- [[#manual-batching]["Manual" Batching]]
- [[#mid-term][Mid-Term]]
- [[#upgrade-to-use-attean][Upgrade to use Attean]]
- [[#integrate-in-emacs-org-mode][Integrate in Emacs org-mode]]
- [[#node-colors-icons-tooltips][Node colors, icons, tooltips]]
- [[#more-arrow-types-and-styles][More arrow types and styles]]
- [[#extra-layout-options][Extra Layout Options]]
- [[#custom-reification][Custom Reification]]
- [[#use-mindmapwbs-for-hierarchies][Use MindMap/WBS for Hierarchies]]
- [[#long-term][Long-Term]]
- [[#rdf2soml-to-generate-semantic-object-models][rdf2soml to Generate Semantic Object Models]]
- [[#cardinality-with-rdf][Cardinality With RDF*]]
- [[#cardinality-with-blank-node][Cardinality With Blank Node]]
- [[#rdf2shape-to-describe--generate-rdf-shapes][rdf2shape to Describe & Generate RDF Shapes]]
- [[#visualize-rdf-shapes-shacl-and-shex][Visualize RDF Shapes (SHACL and ShEx)]]
- [[#generate-transformations-for-other-than-relational-sources][Generate transformations for other than relational sources]]
:END:

* Introduction
See these publications:
- RDF by Example: rdfpuml for True RDF Diagrams, rdf2rml for R2RML Generation.
Vladimir Alexiev. In Semantic Web in Libraries 2016 (SWIB 16), Bonn, Germany, November 2016:
[[http://rawgit2.com/VladimirAlexiev/my/master/pres/20161128-rdfpuml-rdf2rml/index.html][Presentation]], [[http://rawgit2.com/VladimirAlexiev/my/master/pres/20161128-rdfpuml-rdf2rml/index-full.html][HTML]], [[http://rawgit2.com/VladimirAlexiev/my/master/pres/20161128-rdfpuml-rdf2rml/RDF_by_Example.pdf][PDF]], [[https://youtu.be/4WoYlaGF6DE][Video]]
- Generation of Declarative Transformations from Semantic Models.
Vladimir Alexiev. In European Data Conference on Reference Data and Semantics (ENDORSE 2023), Mar 2023:
[[https://drive.google.com/open?id=1Cq5o9th_P812paqGkDsaEomJyAmnypkD][paper]], [[https://docs.google.com/presentation/d/1JCMQEH8Tw_F-ta6haIToXMLYJxQ9LRv6/edit][presentation]], [[https://youtu.be/yL5nI_3ccxs][video]]

RDF is a graph data model, so the best way to understand RDF data schemas (ontologies, application profiles, RDF shapes) is with a diagram.
Many RDF visualization tools exist,
but they either focus on large graphs (where the details are not easily visible),
or the visualization results are not satisfactory,
or manual tweaking of the diagrams is required.

- *rdfpuml* makes true diagrams directly from Turtle examples using PlantUML and GraphViz.
Diagram readability is of prime concern, and rdfpuml introduces various diagram control mechanisms using triples in the ~puml:~ namespace.
Special attention is paid to inlining and visualizing various Reification mechanisms (described with PRV).
We give examples from Getty CONA, Getty Museum, AAC (mappings of museum data to CIDOC CRM),
Multisensor (NIF and FrameNet), EHRI (Holocaust Research into Jewish social networks), Duraspace (Portland Common Data Model for holding metadata in institutional repositories), Video annotation.

If the example instances include embedded source field names, they can describe a mapping precisely.
I've implemented a few more tools to generate transformations:
- *rdf2rml* generates [[https://www.w3.org/TR/r2rml/][R2RML]] transformations for RDBMS tables or SQL queries. Compared to writing R2RML by hand, this reduces complexity about 15x and is competitive with the dedicated DSL [[https://rml.io/yarrrml/][YARRRML]].
- *rdf2sparql* generates [[https://platform.ontotext.com/ontorefine/][OntoRefine]] or [[https://tarql.github.io/][TARQL]] transformations from CSV/TSV
that take the form of SPARQL UPDATE (for direct GraphDB loading)
or CONSTRUCT (for conversion to RDF).
(Subsumes two deprecated tools ~rdf2tarql~ and ~rdf2ontorefine~)

See http://twitter.com/hashtag/rdfpuml for news, diagrams and announcements.

** Citation
If you use this software, please cite it as shown above.
- GitHub shows a link "About > Cite this repository" (see [[https://docs.github.com/en/github/creating-cloning-and-archiving-repositories/creating-a-repository-on-github/about-citation-files][about-citation-files]])
- [[CITATION.cff]] describes both the software and the above publications.
It's a YAML CFF file, see https://citation-file-format.github.io/
- [[CITATION.bib]] describes only the above publications. It's a BibTeX file.

** Documentation
- [[https://github.com/VladimirAlexiev/rdf2rml/blob/master/doc/rdfpuml.md][rdfpuml.md]] or [[http://rawgit2.com/VladimirAlexiev/rdf2rml/master/doc/rdfpuml.html][rdfpuml.html]]
- [[https://github.com/VladimirAlexiev/rdf2rml/blob/master/doc/rdf2rml.md][rdf2rml.md]] or [[http://rawgit2.com/VladimirAlexiev/rdf2rml/master/doc/rdf2rml.html][rdf2rml.html]]
- [[https://github.com/VladimirAlexiev/rdf2rml/blob/master/doc/rdf2sparql.md][rdf2sparql.md]] or [[http://rawgit2.com/VladimirAlexiev/rdf2rml/master/doc/rdf2sparql.html][rdf2sparql.html]] (subsumes ~rdf2tarql~ and ~rdf2ontorefine~)

** Related Work

The following works use or mention this software:

- V. Alexiev, A. Kiryakov, P. Tarkalanov (2017)
[[https://www.researchgate.net/profile/Plamen-Tarkalanov/publication/342956150_euBusinessGraph_Company_and_Economic_Data_for_Innovative_Products_and_Services/links/5f0efda445851512999b206b/euBusinessGraph-Company-and-Economic-Data-for-Innovative-Products-and-Services.pdf][euBusinessGraph: Company and economic data for innovative products and services]].
13th International Conference on Semantic Systems (Semantics 2017)
- L. Zhuhadar, M. Ciampa (2017). [[https://www.sciencedirect.com/science/article/abs/pii/S0747563217306933?via%3Dihub][Leveraging learning innovations in cognitive computing with massive data sets: Using the offshore Panama papers leak to discover patterns]]. Computers in Human Behavior. doi:10.1016/j.chb.2017.12.013
- C. Debruyne, D. Lewis, D. O’Sullivan (October 2018).
[[https://link.springer.com/chapter/10.1007/978-3-030-02671-4_21][Generating Executable Mappings from RDF Data Cube Data Structure Definitions]].
In Confederated International Conferences "On the Move to Meaningful Internet Systems" (OTM 2018),
pages 333-350. doi:10.1007/978-3-030-02671-4_21
- V. Alexiev (2018).
[[http://dipp.math.bas.bg/images/2018/019-050_32_11-iDiPP2018-34.pdf][Museum Linked Open Data: Ontologies, Datasets, Projects (invited report)]].
In Digital Presentation and Preservation of Cultural and Scientific Heritage (DIPP 2018).
Volume 8, pages 19-50. Burgas, Bulgaria, September 2018
- A.D. Junior (2019).
[[http://www.tara.tcd.ie/bitstream/handle/2262/86157/AdemarCrotti-thesis_final.pdf][A Jigsaw Puzzle Metaphor for Representing Linked Data Mappings]].
PhD Thesis, Knowledge and Data Engineering Group (KDEG), Trinity College, Dublin, Ireland
- V. Alexiev, P. Tarkalanov, N. Georgiev, L. Pavlova (2020).
[[https://dipp.math.bas.bg/images/2020/045-064_1.2_iDiPP2020-24_v.1c.pdf][Bulgarian Icons in Wikidata and EDM]].
Digital Presentation and Preservation of Cultural and Scientific Heritage (DIPP 2020).
- Matjaz Rihtar. https://github.com/mrihtar/rdfgraph:
inspired by ~rdfpuml~, written in Python 2.7, uses Redland's ~librdf~ library.
I worked with Matjaz in the euBusinessGraph project.

* Installation
Check out this repo and add ~rdf2rml/bin~ to your path.
Install the following prerequisites:
- both tools: Perl. Tested with version 5.22 on Windows (Cygwin and Strawberry).
- rdfpuml:
  - [[http://www.graphviz.org/][GraphViz]]
  - [[http://plantuml.com/download][PlantUML]].
    You need a recent version for new features like arrow length and color. I'm currently running 1.2018.10beta7.
    See in particular [[http://plantuml.com/class-diagram][plantuml class diagrams]].
  - Perl modules: use ~cpan~ or ~cpanm~ to install them:
    ~RDF::Trine RDF::Query Encode FindBin Carp::Always Slurp~
  - ~RDF::Prefixes::Curie~. This is my own module located in [[./lib]], and *rdfpuml* needs ~FindBin~ to locate it.
- rdf2rml:
  - [[https://jena.apache.org/download/][Apache Jena]]: ~riot~, ~update~. Tested with version 3.1.0 of 2016-05-10.
  - cat, grep, rm
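
A quick way to confirm that the command-line prerequisites are reachable is a small shell check.
This is only a sketch: ~missing_tools~ is a hypothetical helper (not part of this repo), ~dot~ is assumed to be the GraphViz command on your path,
and PlantUML and the Perl modules are left out because how they are installed and launched depends on your setup.

#+begin_src sh
# Hypothetical helper: print each named command that is not found on the PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Commands used by the tools; prints nothing when all of them are found.
missing_tools perl dot riot update cat grep rm
#+end_src

On Cygwin the Jena tools are invoked as ~riot.bat~ and ~update.bat~, so adjust the names accordingly.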

** Docker Image
If you prefer to work with Docker so you don't need to install software manually,
you can use this [[https://docker-registry.ontotext.com/#browse/search=keyword%3Drdf2rml][rdf2rml image]] from the public Nexus (Docker Registry) of Ontotext.
To run it, use:

: docker run -v <local-dir>:/files --rm docker-registry.ontotext.com/rdf2rml:latest

Where ~<local-dir>~ is the local directory holding your ~.ttl~ files.
It was made on 31 May 2023 and uses the following versions:
- [[https://github.com/VladimirAlexiev/rdf2rml][rdf2rml]]: 31 May 2023, with fixed [[https://github.com/VladimirAlexiev/rdf2rml/issues/22][issue 22]]
- [[https://plantuml.com/download][PlantUML]]: 1.2023.7
- [[https://jena.apache.org/download/][Jena]]: 4.8.0

Note: [[https://github.com/VladimirAlexiev/rdf2rml/pull/7][pull request 7]] of 17 Sep 2019 by Jem Rayfield (~@jazzyray~)
dockerizes the installation, and makes extra changes related to input/output and configuration.
However, it has not been merged yet

* Debian Repo
Jonas Smedegaard (~@jonassmedegaard~, dr at jones fullstop dk) has volunteered for some of the tasks below.
His development is at https://salsa.debian.org/debian/rdf2rml/branches.
To adopt changes, do something like this.

- To merge *all* commits in the ~salsa/develop~ branch:
#+begin_src sh
cd rdf2rml # i.e. your local clone of your Github project
git remote add salsa https://salsa.debian.org/debian/rdf2rml.git
git fetch salsa
git merge salsa/develop
#+end_src

- To adopt only single commits from the ~salsa/develop~ branch, issue ~remote~ and ~fetch~ as above, then issue:
#+begin_src sh
git cherry-pick $commit1 $commit2 $commit3
#+end_src

* Change Log
** 2023-06-07 rdf2sparql.pl: minimize binds in ~delete~ clause
[[https://github.com/VladimirAlexiev/rdf2rml/issues/27][Issue 27]]: minimize the ~delete~ clause to include only necessary binds:
- ~--filterColumn~ variable prebind
- templated GRAPH URL and its constituent variables
** 2023-06-06 rdf2sparql.pl: global ~--filter~ options
[[https://github.com/VladimirAlexiev/rdf2rml/issues/26][Issue 26]]: add command-line options ~--filterColumn, --filter~ that are useful for handling both initial loading and data updates.
See [[https://github.com/VladimirAlexiev/rdf2rml/blob/master/doc/rdf2sparql.md#global-filtering][global filtering]] and ~test/graphs-crunchbase~
** 2023-06-01 rdfpuml.pl: remove Carp::Always
[[https://github.com/VladimirAlexiev/rdf2rml/issues/2][Issue 2]] remove ~Carp::Always~ since it produces a stack trace that's too verbose
** 2023-05-17 rdf2sparql.pl: Conditional Nodes
- Support "Conditional Nodes", i.e. URLs that are conditional on the existence of some fields.
- [[https://github.com/VladimirAlexiev/rdf2rml/issues/22][issue 22]] fixed (2023-05-31)
** 2023-05-05 rdfpuml.pl: don't mangle round brackets
- [[https://github.com/VladimirAlexiev/rdf2rml/issues/21][issue 21]]: Round brackets in fields (eg "(name)") and URLs (eg ) are not mangled to square brackets anymore
** 2023-04-29 rdfpuml.pl: puml:option
- [[https://github.com/VladimirAlexiev/rdf2rml/issues/18][issue 18]] Add ~puml:option~ for ~left to right direction~ etc
** 2023-04-19 rdf2sparql.pl: per-model filter, dynamic graph
- [[https://github.com/VladimirAlexiev/rdf2rml/issues/19][issue 19]] Implement filter function, see ~test/filter-content~
- [[https://github.com/VladimirAlexiev/rdf2rml/issues/20][issue 20]] Allow dynamic graph (computed from a data column), see ~test/graphs-crunchbase~
** 2022-08-23 rdf2sparql.pl: add datatype to var name instead of UPPERCASING
Datatype attachment eg ~strdt(?var,xsd:date)~ now outputs to ~?var_xsd_date~ to avoid conflict with input field names in ALL_UPPERCASE
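
The renaming can be illustrated with a small shell function.
It is only a sketch of the naming rule as inferred from the single example above (append the datatype's prefixed name, with the colon replaced by an underscore);
~typed_var_name~ is a hypothetical helper, not part of ~rdf2sparql.pl~, and the real implementation may handle more cases.

#+begin_src sh
# Hypothetical helper: build the output variable name for a typed literal.
# $1 = source variable (without "?"), $2 = datatype as a prefixed name
typed_var_name() {
  printf '?%s_%s\n' "$1" "$(printf '%s' "$2" | tr ':' '_')"
}

typed_var_name var xsd:date   # prints ?var_xsd_date
#+end_src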
** 2022-08-23 rdfpuml.pl: handle blank-node types; add shell scripts
- [[https://github.com/VladimirAlexiev/rdf2rml/issues/10][issue 10]] Handle blank-node types that occur on owl:Restriction (see ~test/blank-node~)
- Duplicate ~rdfpuml.bat, puml.bat~ as shell scripts ~rdfpuml, puml~ for use in Makefiles across Linux and Windows
** 2022-08-15 rdf2sparql.pl: merge to one tool
Merge ~rdf2tarql~ and ~rdf2ontorefine~ to one tool ~rdf2sparql~
** 2022-04-08 rdf2ontorefine.pl: generate OntoRefine Update queries
Add script to generate OntoRefine SPARQL Update queries from model.
** 2021-09-02 rdfpuml.pl: Unicode Processing
Use Perl option ~-C~ when invoking for proper Unicode processing.
See doc section ~rdfpuml.html#Unicode~
** 2020-09-17 rdf2rml: logicalTable
Use URL for logicalTable instead of blank node, so that R2RML generated from different models for different tables can be merged more easily.
Warning: this assumes that all instances of one subjectMap use the same query.
** 2020-06-01 rdf2tarql.pl: generate TARQL scripts
Add rdf2tarql.pl script to generate TARQL script (CSV-RDF conversion) from model.
** 2020-06-01 rdf2rml: improve scripts, SQL query/table propagation
- Improve script to abort if the first pipeline step ("update") fails
- Improve script to work on Cygwin (invokes the Jena tools as ~riot.bat~ and ~update.bat~)
- Filter out harmless warnings from Jena update's error log
for datatypes like ~xsd:integer, xsd:date~ etc since the mention of a source field doesn't match the syntax of such literals.
- If a node has single outgoing link and no SQL query/table (~puml:label~),
propagate that property backward across the link into the node
(previously that was done only for incoming links)
** 2020-05-30 rdf2rml: handle inverse edge
When an edge ~Y-P-X~ is recorded in the RDB table of ~X~ (as foreign key) or in an association table,
it is awkward to specify that table in the node ~Y~.
So I added this SPARQL UPDATE clause:
- If a node ?y has no SQL, is not Inlined, has a single outgoing edge, then add the SQL of its counterparty ?x as default
** 2018-11-14 rdfpuml.pl: avoid puml:stereotype class node
I often define ~puml:stereotype~ for some classes in prefixes.ttl.
If such a class is not used in a particular Turtle file, ~rdfpuml~ should avoid emitting a disconnected puml class for it.
- ~stereotypes()~: Avoid emitting such disconnected classes
- ~has_statements_different_from()~: Check that a node has statements other than puml:stereotype
** 2018-06-29 rdfpuml.pl bug: class and puml:InlineProperty
When a type is also used with ~puml:InlineProperty~, it caused this error:
: Can't locate object method "uri_value" via package "RDF::Trine::Node::Literal" at rdfpuml.pl line 261.
: main::puml_qname(RDF::Trine::Node::Literal=ARRAY(0x4fd0920)) called at rdfpuml.pl line 279
: main::puml_node2(RDF::Trine::Node::Literal=ARRAY(0x4fd0920)) called at rdfpuml.pl line 128
An inline is converted to a literal, but rdf:type is always assumed to be a URL.
Test: [[./test/regression/type-inlineProperty.ttl]]
** 2018-04-05 rdfpuml.pl: Arrow Attributes
Add arrow attributes (dotted, dashed, bold) and length
Test: [[./test/regression/arrowLen.ttl]]
** 2018-02-25 rdfpuml.pl: Arrow Color
Support arrow color (named or hex)
** 2017-08-25 rdfpuml.pl: decorative arrows
Fix unicode of "decorative arrows" on links going to a Reified Relation:
: left => "←", right => "→", up => "↑", down => "↓"
** 2016-02-10 rdfpuml.pl: blank nodes, hidden links
- support blank nodes
- support new puml "hidden" links that can sometimes help the layout: http://plantuml.com/class-diagram#layout
* To Do Tasks
Help needed for the following tasks.
Post bugs and enhancement requests to this repo!

** Near-term

*** Modularize and Package Better

*** Regression Tests
- ~sort~ is added at various places to make the tool more deterministic, i.e. independent of order of RDF statements in the input file.
However, this will interfere with the ability to control the layout, especially of disconnected components (see [[https://forum.plantuml.net/2538][layout_new_line]])
- Some regression tests are added.

*** rdf2rml: disentangle inverse edge
In the case ~Y-P-X~ described above:
- Also need to record ~?y puml:property ?p~ so this prop name can be added to ?y's subject map
- When making ?map, take ~puml:property~ into account
- But ?map is made many times, and copy-paste is no good...
- Also, this should be done in some cases but not others...
- So it's better to record ~?y puml:map ?map~ ...

*** Release on CPAN

*** Add Unicode tests
Add ttl with non-ASCII chars: Accented, Cyrillic, French, etc.
- Accented: ~"Rudolf Mössbauer"~ in [[./test/TRR/societyMember.ttl]]

*** Prefixes
**** Allow specifying the prefixes file
See https://github.com/VladimirAlexiev/rdf2rml/pull/7
**** Eliminate Curie.pm
[[./lib/RDF/Prefixes/Curie.pm]] remembers ~@base~ and uses that for URL shortening.
Once [[https://github.com/kasei/perlrdf/issues/131][perlrdf#131]] is fixed, eliminate this dependency (local module)
**** Remember prefixes from input file
~rdfpuml~ shortens URLs using prefixes only from ~prefixes.ttl~, but should also use prefixes defined in the individual input file.
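
One way to collect the declarations is a line-based scan of the input files.
The following is a rough sketch, not a Turtle parser: ~collect_prefixes~ is a hypothetical helper,
it only sees ~@prefix~ declarations that start a line, and it ignores the SPARQL-style ~PREFIX~ form.

#+begin_src sh
# Hypothetical helper: print the distinct @prefix declarations
# found in the given Turtle files.
collect_prefixes() {
  # 1. keep only lines that start with @prefix (after optional whitespace)
  # 2. normalize whitespace so that duplicates collapse
  # 3. sort and deduplicate
  grep -hE '^[[:space:]]*@prefix[[:space:]]' "$@" |
    sed -E 's/^[[:space:]]+//; s/[[:space:]]+/ /g' |
    sort -u
}
#+end_src

For example, ~collect_prefixes prefixes.ttl model.ttl~ would list the shared prefixes together with those of one input file (~model.ttl~ is a placeholder name).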
*** Support more RDF Formats
Now it only supports Turtle, because it concatenates ~prefixes.ttl~ to the main file.
If it can collect all prefixes from RDF files, such concatenation won't be needed

*** Batch Processing
Issue [[https://github.com/VladimirAlexiev/rdf2rml/issues/1][#1]]: plantuml is slow to start up, so we'd like to process a bunch of ~puml~ files at once.
The best way is to have a smarter script or ~Makefile~ that uses the following http://plantuml.com/command-line features:
- Keep the intermediate ~puml~ files (the current ~Makefile~ doesn't preserve them)
- Run ~plantuml~ on a whole folder (with ~-r[ecurse]~ it can even recurse through subfolders)
- Use ~-checkmetadata~ to skip ~png~ files that don't need to be regenerated.
(The whole ~puml~ text is stored in the ~png~,
so ~plantuml~ can quickly check that there are no changes)
- The ~Makefile~ should start ~plantuml~ only once, and only if at least one ~puml~ file is newer than its respective ~png~ file
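
The points above can be sketched as a small shell script instead of a ~Makefile~.
This is an illustration under assumptions, not the repo's build script: ~stale_puml~ and ~render_batch~ are hypothetical helpers,
the ~PLANTUML~ variable assumes a ~plantuml~ wrapper on the path (it could equally be ~java -jar plantuml.jar~),
and the ~puml~ and ~png~ files are assumed to sit side by side in one folder.

#+begin_src sh
# Assumption: a "plantuml" wrapper is on the PATH; override PLANTUML otherwise.
PLANTUML=${PLANTUML:-plantuml}

# Print every *.puml in the folder whose *.png is missing or older than the puml.
stale_puml() {
  local dir=$1 puml png
  for puml in "$dir"/*.puml; do
    [ -e "$puml" ] || continue          # folder has no puml files
    png=${puml%.puml}.png
    if [ ! -e "$png" ] || [ "$puml" -nt "$png" ]; then
      printf '%s\n' "$puml"
    fi
  done
}

# Start plantuml at most once per folder, and only when something is stale.
# -checkmetadata lets plantuml itself skip png files that are up to date.
render_batch() {
  local dir=$1
  if [ -n "$(stale_puml "$dir")" ]; then
    $PLANTUML -checkmetadata "$dir"
  fi
}
#+end_src

Calling ~render_batch test/regression~ (any folder of ~puml~ files) would then pay the PlantUML startup cost once per folder rather than once per diagram.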

**** "Manual" Batching
Before I discovered the ~-checkmetadata~ option,
I had the idea that ~rdfpuml~ could put several diagrams in one ~puml~ file:
#+BEGIN_EXAMPLE
@startuml file1.png
' made from file1.ttl
@enduml
@startuml file2.png
' made from file2.ttl
@enduml
#+END_EXAMPLE
However, this interferes with ~make~ processing that regenerates only ~png~ for changed ~ttl~ files,
and makes things less modular overall.

** Mid-Term

*** Upgrade to use Attean
[[https://github.com/kasei/perlrdf][Trine (Perl RDF)]] is end of life. [[https://github.com/kasei/attean][Attean]] is the new generation

*** Integrate in Emacs ~org-mode~
Write Turtle, see diagram (easy to do)

*** Node colors, icons, tooltips
See [[./ideas]]

*** More arrow types and styles
- See ~arrows arrows-2~ from https://github.com/anoff/blog/tree/master/static/assets/plantuml/diagrams:

[[./ideas/arrows.png]] [[./ideas/arrows-2.png]]

- Arrow styles and colors (bold, dashed etc): https://mrhaki.blogspot.com/2016/12/plantuml-pleasantness-get-plantuml.html

- ~plantuml -pattern~ regexes:
: dotted|dashed|plain|bold|hidden|norank|single|thickness

*** Extra Layout Options
Local layout options are described in [[http://wiki.plantuml.net/site/class-diagram#help_on_layout][Help on Layout]]:
- "hidden" makes a constraint between two nodes, but does not draw the link (~rdfpuml~ already implements this)
- [[https://forum.plantuml.net/3188/add-norank-option-on-links][norank]] ignores a link for layout purposes (same as graphviz ~constraint=false~)
- "together" groups classes as if they were in the same package (i.e. puts them in a graphviz cluster)

Global options include (eg see [[http://www.plantuml.com/plantuml/uml/bP8nQmCn38Lt_mfnoq7XGZgrGoYXMJeqIpfqTkwKdeXi7xRI4kYFBvSORCSGg8OGdlJfFPbR1z5UJePLsuuq8FJaUqPr-OzcaZCOD7lq8PUqYAVzIJ2eS2GxQQyDC5cKyuJWl8mkQuHH3-w7x1SSD0TKRMfjoMvOX_19WupmjCnxrWqOS8BdGlNQ7gEg55b1Vz0zmlOIyfs2e4LVDNQECHFVDFC7-c_giHfLgct18siXPmEqhL8R9hG2LNNTIodaUyj4QMRrs-N8TNTbqJmsLuleq2mNYuS6ydDKvXQfsY66kacJzdM5NnoUVnAVtzj16MVdd56pK3350IMmSLQyOyOXldQTB8AhsIsl2arl0RVtH_G-MK2HlC_DvwPsdXN-mQMw-NxYzBruXT6hauYiqGudmty0][this diagram]]):
#+begin_plantuml
skinparam Linetype ortho
skinparam NodeSep 80
skinparam RankSep 80
skinparam Padding 5
skinparam MinClassWidth 40
skinparam SameClassWidth true
#+end_plantuml

And there are a lot more undocumented features: https://forum.plantuml.net/7095

*** Custom Reification
Ability to describe custom reification situations using the Property Reification Vocabulary (PRV)

*** Use MindMap/WBS for Hierarchies
Plantuml now has [[http://plantuml.com/mindmap-diagram][MindMap]] and [[http://plantuml.com/wbs-diagram][WBS (or OBS)]] diagrams that use a simple bulleted syntax to draw hierarchies.

It would be nice to use this to draw hierarchies of individuals, in particular taxonomies.

Here are examples of the two styles:
- [[http://www.plantuml.com/plantuml/uml/SoWkIImgoStCIybDBE3IKd1szUVIqbBmLGi6Ka0wiIWxjIGpBntC2qxCIIq6IJk7W5Mv-0Q0nTsB4WioN9p0x82Sn9Aq_A9SBeVKl1IekG00][Mindmap]]
- [[http://www.plantuml.com/plantuml/uml/SoWkIImgAKygvj9IS7RrvzBIKl1L2mPIG3gnA3kr93Cl7SmBJin9BGP9EuU0LRdu1e35tOiI2p9SdC3iW9p4ahJyebmkXzIy5A2P0000][WBS]]

** Long-Term
*** rdf2soml to Generate Semantic Object Models
A new tool ~rdf2soml~ to generate Ontotext Platform SOML from RDF examples.

What's missing? Most importantly: property cardinality and virtual inverses.

PlantUML can show arrow cardinalities, and this simple and natural [[http://www.plantuml.com/plantuml/uml/SoWkIImgAStDuSh8J4bLICuiIiv9XR1JSmjAAXLoKqioybEAaOKIIqgACfDAIrABkI8Kb0oi39KKT7DIqqfqxHIK3ArobHGY5QmK2eho2_HZyZBpoWA0B2w7rBmKe2q0][PlantUML code]]:
#+BEGIN_SRC plantuml
X "0:1" -left-> "1:m" Y : prop/\ninvProp
#+END_SRC
is depicted as follows:

[[./ideas/cardinality-and-inverse.png]]

There are two options for expressing this in triples:

**** Cardinality With RDF*
#+BEGIN_SRC turtle
##### model triples
:X :prop :Y.
##### puml triples
<< :X :prop :Y >>
puml:arrow puml:left; # direction
puml:min 1; puml:max puml:inf; # cardinality
puml:inverseAlias [puml:min 0; puml:max 1; puml:name "invProp"]. # virtual inverse
#+END_SRC
- Pros: very natural
- Cons:
  - Perl RDF doesn't support RDF*, and few editors support it either.
  - Annotating a triple does not assert it, so we need to assert it as well

**** Cardinality With Blank Node

#+BEGIN_SRC turtle
##### model triples
:X :prop :Y.
##### puml triples
:X puml:left :Y. # direction
:X :prop [ # a puml:Cardinality; # may need this marker class to skip the node from the diagram
puml:min 1; puml:max puml:inf; # cardinality
puml:object :Y; # only needed if X has several relations "prop" and they need different annotations
puml:inverseAlias [puml:min 0; puml:max 1; puml:name "invProp"] # virtual inverse
].
#+END_SRC
*** rdf2shape to Describe & Generate RDF Shapes
*** Visualize RDF Shapes (SHACL and ShEx)
Issue [[https://github.com/VladimirAlexiev/rdf2rml/issues/8][#8]]: discussion with Thomas Francart of Sparna, who writes:

I developed this SHACL to PlantUML converter, in Java, based on TopQuadrant SHACL lib, and the result is at https://shacl-play.sparna.fr/play/draw and code at https://github.com/sparna-git/shacl-play/tree/master/shacl-diagram

I don't have a strong opinion on the example you provide, an alternative idea that comes to my mind is
#+begin_src turtle
:node1 :link [
rdf:value :node2;
puml:min 1 ;
puml:max 2 ;
]
#+end_src
But this changes the structure of the example graph itself, which might not be convenient

*** Generate transformations for other than relational sources
R2RML works great for RDBMS, but how about other sources?
Extend rdf2rml to generate:
- [[http://rml.io][RML:]] extends R2RML to handle RDB, XML, JSON, CSV
- [[http://github.com/semantalytics/xsparql][XSPARQL:]] extends XQuery with SPARQL construct and JSON input
- DONE [[https://tarql.github.io/][tarql]]: handles TSV/CSV with SPARQL construct
- DONE OntoRefine: transformation of TSV/CSV and direct loading to GraphDB with SPARQL Update