{"id":44554083,"url":"https://github.com/digicademy/cmiferator","last_synced_at":"2026-02-13T21:06:01.893Z","repository":{"id":245851443,"uuid":"810274995","full_name":"digicademy/cmiferator","owner":"digicademy","description":"CMIFerator – generate Correspondence Metadata Interchange File from eXist-db based editions of letters","archived":false,"fork":false,"pushed_at":"2025-02-11T16:58:50.000Z","size":38,"stargazers_count":2,"open_issues_count":12,"forks_count":0,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-09-04T23:50:48.444Z","etag":null,"topics":["correspsearch","exist-db","xquery"],"latest_commit_sha":null,"homepage":"","language":"XSLT","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/digicademy.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":"CITATION.cff","codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2024-06-04T11:35:57.000Z","updated_at":"2025-06-18T18:46:52.000Z","dependencies_parsed_at":"2024-06-24T13:19:45.613Z","dependency_job_id":"f60398d1-ba72-4d07-a86d-7ee30d5a91c0","html_url":"https://github.com/digicademy/cmiferator","commit_stats":null,"previous_names":["digicademy/cmiferator"],"tags_count":6,"template":false,"template_full_name":null,"purl":"pkg:github/digicademy/cmiferator","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/digicademy%2Fcmiferator","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/digicademy%2Fcmiferator/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/digicademy%2Fcmiferator/releases","manifests_url":"https://repos.ecosyste.ms/
api/v1/hosts/GitHub/repositories/digicademy%2Fcmiferator/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/digicademy","download_url":"https://codeload.github.com/digicademy/cmiferator/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/digicademy%2Fcmiferator/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29417709,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-13T06:24:03.484Z","status":"ssl_error","status_checked_at":"2026-02-13T06:23:12.830Z","response_time":78,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["correspsearch","exist-db","xquery"],"created_at":"2026-02-13T21:06:01.054Z","updated_at":"2026-02-13T21:06:01.888Z","avatar_url":"https://github.com/digicademy.png","language":"XSLT","funding_links":[],"categories":[],"sub_categories":[],"readme":"# CMIFerator\n\nGenerate [CMIF (Correspondence Metadata Interchange File)](https://correspsearch.net/en/documentation.html) from [eXist-db](http://exist-db.org/) based editions of letters (e.g. using [ediarum](https://www.ediarum.org/)) ready for ingest in [correspSearch](https://correspsearch.net/).\n\nThe CMIFerator is a library of XQuery functions you can use to build your own CMIF API endpoint. 
A minimal example of an endpoint is given below.\n\nThe CMIFerator is released as an eXist-db library package that can be installed using the eXist-db package manager. Its tested and intended environment is eXist-db.\n\nCurrently, the CMIFerator supports CMIF version 1.\n\nThe XSLT stylesheet which subsets TEI `\u003ccorrespDesc\u003e` elements for (strict) conformance to CMIF (version 1) may be of interest outside of the CMIFerator, as a starting point for more individualised purposes. (However, it currently depends on the configuration file specific to the CMIFerator.)\n\n## Documentation\n\nThe CMIFerator covers the following processing steps from normalised letter data to CMIF file:\n\n1. Update `\u003ccorrespAction\u003e` elements in individual letter files with the most up-to-date information from index files (regularised person names, person identifiers, regularised place names …).\n2. Subset `\u003ccorrespDesc\u003e` elements in individual letter files for (strict) conformance to the CMIF standard.\n3. Wrap `\u003ccorrespDesc\u003e` elements in a CMIF template and fill in metadata.\n\nAny other steps you require can be added to the endpoint individually – for example, the selection of which files to include in the CMIF.\n\n### Functions\n\nThe CMIFerator is developed as a function library to make it modular and adaptable to diverse requirements. Only parts of its functionality may be relevant to you. For this use case, all component functions for the smaller processing steps are made available in the library. In individual cases, the processing steps proposed by this library may be interspersed with other steps.\n\nAt the same time, convenience wrapper functions are provided that wrap several or all processing steps in a single function. 
An all-in-one function may well be all you require.\n\n#### update-correspAction()\n\nUpdate `\u003cpersName\u003e`, `\u003corgName\u003e` and `\u003cplaceName\u003e` elements within `\u003ccorrespAction\u003e` with normalised information from indices, e.g. regularised name forms or authority-controlled identifiers.\n\n##### Configuration parameters used by this function\n\nThis function uses the `\u003cindices\u003e` block in the configuration file. For each type of named entity that can appear in `\u003ccorrespAction\u003e` (persons, organizations, places), an index file path may be provided – either for a single file (`\u003cresource\u003e`) or a folder of files (`\u003ccollection\u003e`).\n\nProviding indices is optional – e.g. if no organizations figure in your edition, you may omit them in the configuration.\n\nThe elements retrieved from the indices must be inserted as `\u003cpersName\u003e`, `\u003corgName\u003e` and `\u003cplaceName\u003e` into `\u003ccorrespAction\u003e`. Typically, however, indices consist of `\u003cperson\u003e` elements etc. (or follow some completely different, project-specific schema). In this case, configure an XSLT stylesheet path in `\u003cstylesheet\u003e` to transform an individual entry from your project-specific index schema to TEI name elements. Stylesheets which transform ediarum indices are provided in the config-examples.\n\n(If your indices *do* consist of TEI name elements such as `\u003cpersName\u003e`, you may omit the stylesheet configuration parameter.)\n\n#### subset-correspDesc()\n\nSubset `\u003ccorrespDesc\u003e` elements in individual letter files for (strict) conformance to the CMIF standard. 
This subsetting (implemented in [correspDesc-transform.xsl](library-package/content/correspDesc-transform.xsl)) makes a number of assumptions and choices to ensure CMIF conformance:\n\n* Only one `\u003ccorrespDesc\u003e` element per file is retained – the first one.\n* Only one `\u003cdate\u003e` element per `\u003ccorrespAction\u003e` is retained – the first one.\n* Only date attributes conforming to the CMIF requirements are retained – all others are discarded.\n\nIf this behaviour is too restrictive for your use case, a possible solution might be to first do a project-specific transformation to re-order/select your desired elements.\n\nFor CMIF version 2 compatibility, the `\u003cnote\u003e` and the `\u003cref\u003e` elements in it are passed through by this function. (Currently, the CMIFerator contains no mechanism to create these elements.)\n\n##### Configuration parameter used by this function\n\nCurrently, the CMIFerator makes the hard assumption that the permalinks for your letters will be concatenated from a base URL and the `@xml:id` attribute of the root `\u003cTEI\u003e` element. This may be subject to change in future versions.\n\n#### wrap-CMIF()\n\nThe date of the CMIF file is generated at runtime.\n\n##### Configuration parameters used by this function\n\nThis function uses the `\u003cheader\u003e` block in the configuration file to fill in the `/TEI/teiHeader` template of the CMIF file.\n\n#### Convenience wrappers\n\nThe wrapper `update-subset-wrap()` combines the three processing steps documented above into one convenient function. 
Similarly, if only parts of the processing flow proposed above apply to your use case, the wrappers `update-subset()` and `subset-wrap()` might cover what you need.\n\n### Configuration (example)\n\nThe configuration file needs to be structured like this example:\n```XML\n\u003cconfiguration xmlns=\"http://www.digitale-akademie.de/cmiferator\"\u003e\n    \n    \u003c!-- metadata of the CMIF file --\u003e\n    \u003cheader\u003e\n    \n        \u003c!-- plain text: title of the CMIF file – may be different from the project name --\u003e\n        \u003ctitle\u003eDie sozinianischen Briefwechsel:\n            Zwischen Theologie, frühmoderner Naturwissenschaft und politischer Korrespondenz\u003c/title\u003e\n        \n        \u003c!-- XML fragment: name as plain text, can include the \u003cemail\u003e element in TEI namespace --\u003e\n        \u003ceditor\u003eJulian Jarosch \u003cemail xmlns=\"http://www.tei-c.org/ns/1.0\"\u003esbw@adwmainz.de\u003c/email\u003e\u003c/editor\u003e\n        \n        \u003c!-- XML fragment: one \u003cref\u003e element in TEI namespace --\u003e\n        \u003cpublisher\u003e\n            \u003cref xmlns=\"http://www.tei-c.org/ns/1.0\" target=\"http://www.adwmainz.de/\"\u003eAkademie der Wissenschaften und der Literatur | Mainz\u003c/ref\u003e\n        \u003c/publisher\u003e\n        \n        \u003c!-- plain text: URL where the CMIF file is available online --\u003e\n        \u003curl\u003ehttps://gitlab.rlp.net/adwmainz/digicademy/sbw/csv-data-dump/-/raw/main/data/cmif/corresp.xml\u003c/url\u003e\n        \n        \u003c!-- plain text: UUID for the source --\u003e\n        \u003cuuid\u003eb3b22a15-9906-406b-aae1-7d7fa2292e71\u003c/uuid\u003e\n        \n        \u003c!-- XML fragment: content of the \u003cbibl\u003e element (in TEI namespace where necessary) --\u003e\n        \u003csource\u003eDie sozinianischen Briefwechsel:\n                Zwischen Theologie, frühmoderner Naturwissenschaft und politischer Korrespondenz,\n           
     erarbeitet und herausgegeben von Kęstutis Daugirdas und Andreas Kuczera.\n                Johannes a Lasco Bibliothek Emden, 2020.\n                \u003cref xmlns=\"http://www.tei-c.org/ns/1.0\" target=\"https://sozinianer.de\"\u003ehttps://sozinianer.de\u003c/ref\u003e\u003c/source\u003e\n    \u003c/header\u003e\n    \n    \u003c!-- prefix / base URL to construct the permalink --\u003e\n    \u003cnamespace\u003ehttps://sozinianer.de/id/MAIN_\u003c/namespace\u003e\n    \n    \u003c!-- index files --\u003e\n    \u003c!-- all parameters in this block are optional (though some are likely necessary) --\u003e\n    \u003cindices\u003e\n        \u003cpersons\u003e\n            \u003cresource\u003e/db/apps/tei2json/xml/Register/Personen.xml\u003c/resource\u003e\n            \u003ccollection/\u003e\n            \u003cstylesheet\u003e/db/apps/tei2json/CMIFerate-config/persons-ediarum-transform.xsl\u003c/stylesheet\u003e\n        \u003c/persons\u003e\n        \u003corganizations\u003e\n            \u003cresource/\u003e\n            \u003ccollection/\u003e\n            \u003cstylesheet/\u003e\n        \u003c/organizations\u003e\n        \u003cplaces\u003e\n            \u003cresource\u003e/db/apps/tei2json/xml/Register/Orte.xml\u003c/resource\u003e\n            \u003ccollection/\u003e\n            \u003cstylesheet\u003e/db/apps/tei2json/CMIFerate-config/places-ediarum-transform.xsl\u003c/stylesheet\u003e\n        \u003c/places\u003e\n    \u003c/indices\u003e\n    \n\u003c/configuration\u003e\n```\n\n### Using the functions in an API endpoint\n\nAn example API endpoint using the CMIFerator:\n\n```XQuery\nxquery version \"3.1\";\n\nimport module namespace cmiferator = \"http://www.digitale-akademie.de/cmiferator\";\n\ndeclare default element namespace \"http://www.tei-c.org/ns/1.0\";\n\nlet $config-filepath := '/db/apps/tei2json/CMIFerate-config/config.xml'\n\n(: this assumes that all resources will be included in the CMIF – no exclusion criteria :)\nlet $letters := 
collection('/db/projects/sbw/data/Briefe')/TEI\n\nreturn cmiferator:update-subset-wrap($letters, $config-filepath)\n```\n\nPerhaps some additional output options might prove useful to the API:\n```XQuery\ndeclare namespace output=\"http://www.w3.org/2010/xslt-xquery-serialization\";\ndeclare option output:method \"xml\";\ndeclare option output:media-type \"text/xml\";\ndeclare option output:omit-xml-declaration \"no\";\n```","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdigicademy%2Fcmiferator","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdigicademy%2Fcmiferator","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdigicademy%2Fcmiferator/lists"}