{"id":41293083,"url":"https://github.com/collectionspace/cspace-config-untangler","last_synced_at":"2026-01-23T03:43:16.732Z","repository":{"id":40006155,"uuid":"238968422","full_name":"collectionspace/cspace-config-untangler","owner":"collectionspace","description":"Generate CollectionSpace data overviews from profile/tenant configs","archived":false,"fork":false,"pushed_at":"2025-12-03T01:17:41.000Z","size":20808,"stargazers_count":0,"open_issues_count":12,"forks_count":4,"subscribers_count":5,"default_branch":"main","last_synced_at":"2025-12-06T01:54:14.116Z","etag":null,"topics":["collectionspace","data-migration","metadata"],"latest_commit_sha":null,"homepage":null,"language":"Ruby","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/collectionspace.png","metadata":{"files":{"readme":"README.adoc","changelog":"CHANGELOG.adoc","contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2020-02-07T16:10:55.000Z","updated_at":"2025-09-18T20:16:01.000Z","dependencies_parsed_at":"2023-01-25T16:30:56.289Z","dependency_job_id":"a87d660f-2a1b-4304-9414-eb094da600b0","html_url":"https://github.com/collectionspace/cspace-config-untangler","commit_stats":null,"previous_names":[],"tags_count":47,"template":false,"template_full_name":null,"purl":"pkg:github/collectionspace/cspace-config-untangler","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/collectionspace%2Fcspace-config-untangler","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/collectionspace%2Fcspace-config
-untangler/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/collectionspace%2Fcspace-config-untangler/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/collectionspace%2Fcspace-config-untangler/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/collectionspace","download_url":"https://codeload.github.com/collectionspace/cspace-config-untangler/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/collectionspace%2Fcspace-config-untangler/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28679263,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-23T01:00:35.747Z","status":"online","status_checked_at":"2026-01-23T02:00:08.296Z","response_time":59,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["collectionspace","data-migration","metadata"],"created_at":"2026-01-23T03:43:14.362Z","updated_at":"2026-01-23T03:43:16.723Z","avatar_url":"https://github.com/collectionspace.png","language":"Ruby","funding_links":[],"categories":[],"sub_categories":[],"readme":":toc:\n:toc-placement!:\n:toclevels: 4\n\nifdef::env-github[]\n:tip-caption: :bulb:\n:note-caption: :information_source:\n:important-caption: :heavy_exclamation_mark:\n:caution-caption: :fire:\n:warning-caption: :warning:\nendif::[]\n\n= CspaceConfigUntangler\n\ntoc::[]\n\n== Conceptual 
overview\nReads JSON config file output by CollectionSpace application.\n\nGets field definitions (`field_defs`), including repeatability, data type, value source, XML field name and parents, etc.\n\nGets fields as defined for use in forms (`form_fields`), including the panel in which the field is included, and UI hierarchy.\n\nGets messages assigned to fields, panels, and input tables from field_defs and the messages hash under the profile and record types. *It is assumed messages set at profile level will override those at lower levels*\n\nFor a given profile, matches each form_field to its corresponding field_def and creates a `field` object that combines all info for the field. If a form_field represents a field group populated with structured date fields, the individual structured date fields are provided from the extension, and the original form_field is treated as the parent UI grouping.\n\n_Note: there may be field_defs in a profile which do not match any form_fields. Field objects are not created/reported for these, because if a field has not been made available for viewing/editing in a form, it is not considered included in the profile._\n\n== Setup\n\n* Tested with Ruby 3.2 and 3.3. Use of earlier Ruby versions is not recommended\n* REQUIRED IF USING AS INTERNAL LYRASIS STAFF: GitHub CLI must be installed and your user authorized in that tool\n* Do `bundle --version`\n** If the version of Bundler is lower than 2.2.29, do `gem update bundler`\n** Bundler should come standard with Ruby 2.7.0, but may be an older version. 
If you get an error that you don't have Bundler installed when you try to check the version, do `gem install bundler`\n* Clone this repo\n* `cd` into cloned directory\n* `bundle install`\n* Download your configs into the appropriate `data/configs` directory or directories\n* Configure your settings https://github.com/collectionspace/cspace-config-untangler/blob/master/lib/cspace_config_untangler.rb[in `lib/cspace_config_untangler.rb`].\n\n=== Optional: add the `exe` directory to your PATH\n\nThe benefit of this is that you can run `ccu` from the command line anywhere to interact with the application. If you don't do this, you can still use the tool, but must `cd` into the cloned repository directory and use `exe/ccu` when entering a command in your terminal.\n\nHow you do this depends on your operating system, terminal configuration, and whether you want the change to be permanent, so look up the instructions for your setup.\n\n=== Optional: create a client connection config file\n\n==== Why/when is this necessary?\nUnfortunately the UI config JSON file does not contain any information about which vocabularies (dynamic term lists) are configured for an instance. The only way to get this data programmatically is via API calls.\n\nFor the purposes of this code base, that means using https://github.com/collectionspace/collectionspace-client[collectionspace-client] to interact with the API, and this requires, at minimum, management of authentication credentials for a user with at least the TENANT_READER role/permission. 
The purpose of cspace-config-untangler is solely to read available CollectionSpace configuration info, not to make any changes to any CollectionSpace instance, so we recommend against configuring connections with more than TENANT_READER permissions.\n\n*Currently, you only need to create this config file if (a) you are working with non-community-supported profiles; (b) you wish to run commands found beneath `ccu vocabs`; AND (c) you are not Lyrasis staff with access to AWS SSM parameter store.* If you run one of those commands on a profile without a client connection config, it will just fail gracefully, telling you that you need to configure a connection.\n\n*You do not need to create a client connection config file if you are working only with the community supported profiles.* The `Reader` user in those instances is set up following a pattern which works without needing configuration.\n\n==== Client connection config file location\n\nBy default, the application will look for the config file in `{your home directory}/.config/cspace-config-untangler/client_connection_config.yml`. You can override that by changing the default value of the `:client_connection_config_path` setting in `lib/cspace_config_untangler.rb` from `nil` to your path (a String).\n\n==== Client connection config file format\n\nThe config file must be a valid YAML file.\n\nHere is a sample, which will be explained below:\n\n[source,yaml]\n----\ninstanced:\n  base_uri: https://collectionspace.institution_d_name.org/cspace-services\n  username: readonlyuser@institution_d_name.org\n  password: randomstring\n  profile: anthro\n----\n\nThe top level key is the profile name the connection will be associated with, minus any version suffix associated with the profile/UI config file. For example, you may have a UI config file for instancea_8-0-1. 
The key here would be just `instancea`.\n\nThe profile key is not required, but can be used to indicate that the instance is using a UI config you have accessible in the Untangler.\n\nWARNING: While we can keep previous versions of UI config files to compare or go back in time, this is not true for client connections. Those are always made to the current, live instance.\n\nNOTE: There is an `--env` option on relevant commands that allows you to specify that the dev or qa instances of community supported profiles should be used.\n\n== Usage\n\nOnce the setup is done, you should be able to `cd` into the cloned directory and type `exe/ccu` (or just `ccu` if you have installed as a gem) at the command prompt to get the list of available functions with their brief descriptions.\n\n[TIP]\n====\nThe best source of info on what each function does and how to use it is the documentation available from the command line interface (CLI).\n\nFor the top-level command groups:\n\n`exe/ccu`\n\nFor an overview of the specific commands inside a group (using the profiles group as an example):\n\n`exe/ccu profiles`\n\nFor details on usage of a specific command (using the profiles compare command as an example):\n\n`exe/ccu profiles help compare`\n====\n\nThere are detailed instructions for some common tasks in the `doc` directory.\n\n\n== Known limitations/issues\n\n=== General\n\nIMPORTANT: This tool can only be used confidently with configs from CollectionSpace 6.1 and newer\n\n* For 5.2 configs, data source values are not consistently supplied for structured date fields. 
This is because configuration of the structured date fields was not written out to the JSON config in a standard way until 6.0.\n* The 6.1 release further refined the JSON config output allowing the full functionality of this tool\n* Does not currently report on fields in the `ns2:collectionspace_core` namespace\n* Does not currently report on fields in the `rel:relations-common-list` namespace because the way this data is defined in the config is very different from the rest\n* `contact` and `blob` get reported/treated as extensions within the tool, rather than sub-records\n* Does not support fields in custom namespaces added to `contact` or `blob`\n\n=== Working with non-community profiles\n\n* Do `exe/ccu fields csv -p all` and check whether the `data_type` column has any blank values. If so, probably your profile has configured some fields from extensions in an unexpected manner. This can cause `forms/default/props/subpath` values (used to create form_field ids) to not match the `fields/document/.../{fieldname}/[config]/messages/name/id` values (used to create field_def ids) for some fields. The Untangler is then unable to match up form_field info with field_def info to generate the necessary combined field info required for fully-populated fields CSV, CSV template, and RecordMapper output. You'll need to do some hard-coding somewhere in the code to get a match\n* Do you have fields with the same name in different namespaces in the same record type? Use `exe/ccu fields nonunique` to generate a listing of any such fields.\n** The code tries to automatically fix this https://github.com/collectionspace/cspace-config-untangler/blob/16a3da1dec21a80e7658d065d85a3cc548c72292/lib/cspace_config_untangler/record_types.rb#L77-L81[here] but if any non-unique field names are sneaking through, you may need to hard-code something to fix this. 
Otherwise, you will get two columns in your CSV template with the same header and it won't be clear which field that data should be imported into.\n* If you have record types with (a) *no* required field; or (b) multiple required fields, you will need to hard-code `identifier_field` values in `record_mapper.rb`'s `get_id_field` method.\n* The `mini` template for a record type is ignored as a source for field information. If you have a field that is used only in a `mini` template, it will not be included in the field data, mappers, or CSV templates this tool produces.\n* RECOMMENDED: add your profile name and the last version of that profile that should be handled with fancy column/fieldname style. If you do not configure this for your profile, you will get warnings on the screen and in your log file, and data exported from CollectionSpace for round-tripping with the CSV importer may not be importable without fixing some column headers. See Other topics \u003e Column styles for more explanation.\n\n== Other topics\n\n=== JSON config source files\n\nSince there is no way to programmatically grab the JSON config, this currently requires you to manually download the JSON config files from the following links. 
The JSON files should be saved as `{profilename}.json` in the `data/configs` directory.\n\nIMPORTANT: You must follow the config naming conventions specified below in order for the Untangler to properly identify profile name and version!\n\n-  https://core.collectionspace.org/cspace/core/config\n-  https://anthro.collectionspace.org/cspace/anthro/config\n-  https://bonsai.collectionspace.org/cspace/bonsai/config\n-  https://botgarden.collectionspace.org/cspace/botgarden/config\n-  https://fcart.collectionspace.org/cspace/fcart/config\n-  https://herbarium.collectionspace.org/cspace/herbarium/config\n-  https://lhmc.collectionspace.org/cspace/lhmc/config\n-  https://materials.collectionspace.org/cspace/materials/config\n-  https://publicart.collectionspace.org/cspace/publicart/config\n\nAnd for the latest dev versions of profiles:\n\n-  https://core.dev.collectionspace.org/cspace/core/config\n-  https://anthro.dev.collectionspace.org/cspace/anthro/config\n-  https://fcart.dev.collectionspace.org/cspace/fcart/config\n-  https://lhmc.dev.collectionspace.org/cspace/lhmc/config\n-  https://publicart.dev.collectionspace.org/cspace/publicart/config\n-  https://materials.dev.collectionspace.org/cspace/materials/config\n-  https://herbarium.dev.collectionspace.org/cspace/herbarium/config\n-  https://botgarden.dev.collectionspace.org/cspace/botgarden/config\n-  https://bonsai.dev.collectionspace.org/cspace/bonsai/config\n\n\nSet `CCU.const_set('MAINPROFILE')` value in `lib/cspace_config_untangler.rb`.\n\n==== Config (and resulting mapper/template) naming conventions\n\nConfig file name must contain the profile name and profile version.\n\nUse `_` (underscore) to separate the profile name and profile version sections of the name.\n\nUse `-` (hyphen) to separate words/numbers within a section.\n\nExamples:\n\n`anthro_4-1-2.json`\n\n`my-custom-config_2-0.json`\n\nThis allows the Untangler to split the config file name on `_` and unambiguously determine profile name vs. 
profile version.\n\nOutput files follow the same convention, adding the recordtype section:\n\n`anthro_4-1-2_concept-associated.json`\n\n\n=== Column styles (\"last fancy column version\")\n\nThis is related to:\n\n* the field names/column headers in CSVs exported from CollectionSpace\n* the field names/column headers in the CSV templates generated by this tool, and for which mapping instructions are generated for CSV import\n\n[TIP]\n====\nYou can pretty much ignore this if:\n\n* you are using a pre-6.1 release of CollectionSpace, since you are unable to export data in CSV from search results.\n* you are not roundtripping exported data from CollectionSpace back in via the CSV Import Tool\n\nIf you are annoyed by warnings about it on the screen and in your logs, you can configure it, but it won't really matter what you enter as the last fancy column version\n====\n\nThis mainly affects fields which may be populated with terms from multiple authorities, where several columns of CSV data map into one CollectionSpace data field.\n\nPrior to CollectionSpace 7.0, CollectionSpace export and this tool both tried to create shorter, less redundant column names using a more \"fancy\" algorithm, but the two tools ended up creating columns with slightly different names. 
We realized this, and the fact that it would require more data prep for roundtripping, while building 7.0.\n\nIn CollectionSpace 7.0 and beyond, the column names are longer and sometimes a bit internally redundant, but they are consistent with each other for both export and import.\n\nFor the community profiles, we increment the profile version with each CollectionSpace release, so the version used with 6.1 is entered in the settings as the last fancy version for each profile.\n\nIf this affects you, add a line for your profile to the `default_last_fancy_column_versions` hash, and include the version of your profile that was used with CollectionSpace 6.1.\n\n[IMPORTANT]\n====\nIf you do not configure this for your profile, the consistent column naming style will be used.\n\nIf you are on 6.1 and configure this correctly, you will get fancy column headers. You may still have to fix some column names for import (the pre-processing step of the import will warn you about them). You would have to fix a lot more column names if you are exporting from 6.1 (fancy export column names), but using the consistent headers in your CSV import data.\n====\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcollectionspace%2Fcspace-config-untangler","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcollectionspace%2Fcspace-config-untangler","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcollectionspace%2Fcspace-config-untangler/lists"}