{"id":39497971,"url":"https://github.com/clarin-eric/vlo-mapping-creator","last_synced_at":"2026-01-18T05:44:02.401Z","repository":{"id":28904744,"uuid":"119487116","full_name":"clarin-eric/VLO-mapping-creator","owner":"clarin-eric","description":"Tool to create a VLO mapping file based on a CSV and optionally a CLAVAS vocabulary","archived":false,"fork":false,"pushed_at":"2022-04-28T07:18:18.000Z","size":40,"stargazers_count":0,"open_issues_count":3,"forks_count":0,"subscribers_count":6,"default_branch":"master","last_synced_at":"2025-09-10T03:14:31.296Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"XSLT","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/clarin-eric.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2018-01-30T05:28:35.000Z","updated_at":"2019-09-10T09:40:31.000Z","dependencies_parsed_at":"2022-07-27T17:18:43.537Z","dependency_job_id":null,"html_url":"https://github.com/clarin-eric/VLO-mapping-creator","commit_stats":null,"previous_names":[],"tags_count":4,"template":false,"template_full_name":null,"purl":"pkg:github/clarin-eric/VLO-mapping-creator","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/clarin-eric%2FVLO-mapping-creator","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/clarin-eric%2FVLO-mapping-creator/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/clarin-eric%2FVLO-mapping-creator/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/clarin-eric%2FVLO-mapping-creator/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/clarin
-eric","download_url":"https://codeload.github.com/clarin-eric/VLO-mapping-creator/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/clarin-eric%2FVLO-mapping-creator/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28531370,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-18T00:39:45.795Z","status":"online","status_checked_at":"2026-01-18T02:00:07.578Z","response_time":98,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-01-18T05:44:01.573Z","updated_at":"2026-01-18T05:44:02.389Z","avatar_url":"https://github.com/clarin-eric.png","language":"XSLT","funding_links":[],"categories":[],"sub_categories":[],"readme":"# VLO-mapping-creator\nTool to create a VLO mapping file based on a CSV and optionally a CLAVAS vocabulary.\n\n## CSV\n\nSee [CSV file](src/test/resources/resourceclass-full.csv)\n\n|   | A                                        | B                  | C     | D                |\n| - | ---------------------------------------- | ------------------ | ----- |------------------|\n| 1 | resourceclass                            | resourceclass      | genre | TF-Notes         |\n| 2 | AnnotatedTextCorpus                      | annotatedText;text | some  |                  |\n| 3 | SongsAnthologiesLinguistic corporaCorpus | audioRecording     | other |                  |\n| 4 | ~Speech.*                         
       | audioRecording     | foo   |                  |\n| 5 | Spoken Corpus                            | audioRecording     | bar   |                  |\n| 6 | OralCorpus                               | corpus             |       |                  |\n| 7 | OralCorpus                               | audioRecording     |       |                  |\n| 8 | AnthologiesDevotional, \"literature\"      | !                  |       | skip             |\n| 9 | foo                                      |                    |       | to be discussed  |\n\n- Row 1: column headers referring to facets\n- Column A: source\n- Column B and higher: targets, each facet should appear only once, unless a column header starts with `TF-` (case insensitive)\n- Column B and higher: a column whose header starts with `TF-` (case insensitive) is reserved for the task force (e.g., for notes) and will not be interpreted by the VLO Mapping Creator (see column D)\n- Source values (row 2 and higher, column A): if starting with a tilde (`~`) the value is assumed to be a regular expression (see line 4)\n- Target values (row 2 and higher, column B and higher): multiple values for one target facet are to be separated by a semicolon (`;`) (see line 2)\n- Target values (row 2 and higher, column B and higher): if the value is an exclamation mark (`!`) the source value is deleted and not replaced (see line 8)\n- Make sure all rows have an equal number of columns!\n- Source values are grouped into the mapping XML (see lines 6 and 7)\n- If no target values/actions are known the row will be skipped (see line 9)\n\n```\n\"resourceclass\",\"resourceclass\",\"genre\",\"TF-notes\"\n\"AnnotatedTextCorpus\",\"annotatedText;text\",\"some\",\n\"SongsAnthologiesLinguistic corporaCorpus\",\"audioRecording\",\"other\",\n\"~Speech.*\",\"audioRecording\",\"foo\",\n\"Spoken Corpus\",\"audioRecording\",\"bar\",\n\"OralCorpus\",\"corpus\",,\nOralCorpus,\"audioRecording\",,\n\"AnthologiesDevotional, 
\"\"literature\"\"\",!,,skip\nfoo,,,to be discussed\n```\n\n- A double quote (`\"`) in a value can be escaped by doubling it (`foo\"\"bar`) (see line 8)\n- Double quotes (`\"`) are only mandatory if the value contains a comma (`,`) (see lines 7 and 8)\n\n## SKOS\n\nThe VLO Mapping Creator can merge a SKOS file with a CSV file. The process adds mappings from `altLabel`s and `hiddenLabel`s to the `prefLabel`.\n\n### Caveats\n\n- This only works for simple cases at the moment, i.e., the curation of a single facet where the CSV file has as source (column A) the same facet as the target (column B).\n- Regular expressions are not yet supported.\n\n## Template\n\nIn a mapping file the behaviour of integrating the target value into the target facet can be tweaked, e.g., to overwrite all existing values. The default behaviour for a facet can be taken from a template file.\n\n```XML\n\u003cdef\u003e\n    \u003ctarget-facet name=\"resourceclass\" removeSourceValue=\"true\" overrideExistingValues=\"false\"/\u003e\n\u003c/def\u003e\n```\n\n## Command Line\n\n```sh\n$ java -jar vlo-mapping-creator.jar -?\nINF: java -jar vlo-mapping-creator.jar \u003cOPTION\u003e* \u003cCSV\u003e, where \u003cOPTION\u003e is one of those:\nINF: -s=\u003cSKOS\u003e SKOS file to merge with the CSV\nINF: -t=\u003cTMPL\u003e Template file to merge with the Mapping XML\nINF: -d        Enable debug info\n```\n\n### Examples\n\n```sh\n$ java -jar vlo-mapping-creator.jar -s src/test/resources/resourceclass.skos -t src/test/resources/default.xml src/test/resources/resourceclass.csv\n```\n```XML\n\u003c?xml version=\"1.0\" encoding=\"UTF-8\"?\u003e\n\u003cvalue-mappings\u003e\n    \u003corigin-facet name=\"resourceclass\"\u003e\n        \u003cvalue-map\u003e\n            \u003ctarget-facet name=\"resourceclass\" removeSourceValue=\"true\" overrideExistingValues=\"false\"/\u003e\n            \u003ctarget-value-set\u003e\n                \u003ctarget-value facet=\"resourceclass\"\u003eannotatedText\u003c/target-value\u003e\n 
               \u003ctarget-value facet=\"resourceclass\"\u003etext\u003c/target-value\u003e\n                \u003csource-value\u003eAnnotatedTextCorpus\u003c/source-value\u003e\n            \u003c/target-value-set\u003e\n            \u003ctarget-value-set\u003e\n                \u003ctarget-value facet=\"resourceclass\"\u003eaudioRecording\u003c/target-value\u003e\n                \u003csource-value\u003eSongsAnthologiesLinguistic corporaCorpus\u003c/source-value\u003e\n            \u003c/target-value-set\u003e\n            \u003ctarget-value-set\u003e\n                \u003ctarget-value facet=\"resourceclass\"\u003eaudioRecording\u003c/target-value\u003e\n                \u003csource-value\u003eSpeechCorpus\u003c/source-value\u003e\n            \u003c/target-value-set\u003e\n            \u003ctarget-value-set\u003e\n                \u003ctarget-value facet=\"resourceclass\"\u003eaudioRecording\u003c/target-value\u003e\n                \u003csource-value\u003eSpoken Corpus\u003c/source-value\u003e\n            \u003c/target-value-set\u003e\n            \u003ctarget-value-set\u003e\n                \u003ctarget-value facet=\"resourceclass\"\u003ecorpus\u003c/target-value\u003e\n                \u003ctarget-value facet=\"resourceclass\"\u003eaudioRecording\u003c/target-value\u003e\n                \u003csource-value\u003eOralCorpus\u003c/source-value\u003e\n            \u003c/target-value-set\u003e\n            \u003ctarget-value-set\u003e\n                \u003ctarget-value facet=\"resourceclass\"\u003eplainText\u003c/target-value\u003e\n                \u003csource-value\u003eAnthologiesDevotional literature\u003c/source-value\u003e\n            \u003c/target-value-set\u003e\n            \u003ctarget-value-set\u003e\n                \u003ctarget-value facet=\"resourceclass\"\u003etool\u003c/target-value\u003e\n                \u003csource-value\u003etol\u003c/source-value\u003e\n            \u003c/target-value-set\u003e\n        \u003c/value-map\u003e\n    
\u003c/origin-facet\u003e\n\u003c/value-mappings\u003e\n```\n\n```sh\n$ java -jar target/vlo-mapping-creator.jar -t src/test/resources/default.xml src/test/resources/resourceclass-full.csv\n```\n```XML\n\u003c?xml version=\"1.0\" encoding=\"UTF-8\"?\u003e\n\u003cvalue-mappings\u003e\n    \u003corigin-facet name=\"resourceclass\"\u003e\n        \u003cvalue-map\u003e\n            \u003ctarget-facet name=\"resourceclass\" removeSourceValue=\"true\" overrideExistingValues=\"false\"/\u003e\n            \u003ctarget-value-set\u003e\n                \u003ctarget-value facet=\"resourceclass\"\u003eannotatedText\u003c/target-value\u003e\n                \u003ctarget-value facet=\"resourceclass\"\u003etext\u003c/target-value\u003e\n                \u003ctarget-value facet=\"genre\"\u003esome\u003c/target-value\u003e\n                \u003csource-value\u003eAnnotatedTextCorpus\u003c/source-value\u003e\n            \u003c/target-value-set\u003e\n            \u003ctarget-value-set\u003e\n                \u003ctarget-value facet=\"resourceclass\"\u003eaudioRecording\u003c/target-value\u003e\n                \u003ctarget-value facet=\"genre\"\u003eother\u003c/target-value\u003e\n                \u003csource-value\u003eSongsAnthologiesLinguistic corporaCorpus\u003c/source-value\u003e\n            \u003c/target-value-set\u003e\n            \u003ctarget-value-set\u003e\n                \u003ctarget-value facet=\"resourceclass\"\u003eaudioRecording\u003c/target-value\u003e\n                \u003ctarget-value facet=\"genre\"\u003efoo\u003c/target-value\u003e\n                \u003csource-value isRegex=\"true\"\u003eSpeech*\u003c/source-value\u003e\n            \u003c/target-value-set\u003e\n            \u003ctarget-value-set\u003e\n                \u003ctarget-value facet=\"resourceclass\"\u003eaudioRecording\u003c/target-value\u003e\n                \u003ctarget-value facet=\"genre\"\u003ebar\u003c/target-value\u003e\n                \u003csource-value\u003eSpoken 
Corpus\u003c/source-value\u003e\n            \u003c/target-value-set\u003e\n            \u003ctarget-value-set\u003e\n                \u003ctarget-value facet=\"resourceclass\"\u003ecorpus\u003c/target-value\u003e\n                \u003ctarget-value facet=\"resourceclass\"\u003eaudioRecording\u003c/target-value\u003e\n                \u003csource-value\u003eOralCorpus\u003c/source-value\u003e\n            \u003c/target-value-set\u003e\n            \u003ctarget-value-set\u003e\n                \u003ctarget-value facet=\"resourceclass\" removeSourceValue=\"true\"/\u003e\n                \u003csource-value\u003eAnthologiesDevotional, \"literature\"\u003c/source-value\u003e\n            \u003c/target-value-set\u003e\n        \u003c/value-map\u003e\n    \u003c/origin-facet\u003e\n\u003c/value-mappings\u003e\n```\n\n## TODO\n\n- [X] XSL log messages are not handled correctly yet\n- [ ] add tests\n- [ ] it needs to be possible to provide a vocabulary specific XSLT to process the SKOS mapping in more advanced cases","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fclarin-eric%2Fvlo-mapping-creator","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fclarin-eric%2Fvlo-mapping-creator","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fclarin-eric%2Fvlo-mapping-creator/lists"}