Reads JSON config file output by CollectionSpace application.
Gets field definitions (field_defs
), including repeatability, data type, value source, XML field name and parents, etc.
Gets fields as defined for use in forms (form_fields
), including the panel in which the field is included, and UI hierarchy.
Gets messages assigned to fields, panels, and input tables from field_defs and the messages hash under the profile and record types. It is assumed messages set at profile level will override those at lower levels
For a given profile, matches each form_field to its corresponding field_def and creates a field
object that combines all info for the field. If a form_field represents a field group populated with structured date fields, the individual structured date fields are provided from the extension, and the original form_field is treated as the parent UI grouping.
Note: there may be field_defs in a profile which do not match any form_fields. Field objects are not created/reported for these, because if a field has not been made available for viewing/editing in a form, it is not considered included in the profile.
-
Tested with Ruby 3.1.0
-
Do
bundle --version
-
If the version of Bundler is lower than 2.2.29, do
gem update bundler
-
Bundler should come standard with Ruby 2.7.0, but may be an older version. If you get an error that you don’t have Bundler installed when you try to check the version, do
gem install bundler
-
-
Clone this repo
-
cd
into cloned directory -
bundle/install
-
Download your configs into the appropriate
data/configs
directory or directories -
Configure your settings in
lib/cspace_config_untangler.rb
.
The benefit of this is that you can run ccu
from the command line anywhere to interact with the application. If you don’t do this, you can still use the tool, but must cd
into the cloned repository directory and use exe/ccu
when entering a command in your terminal.
The way you do this is different depending on your operating system, terminal configuration, and whether you want it to be permanent or not, so google it.
Unfortunately the UI config JSON file does not contain any information about which vocabularies (dynamic term lists) are configured for an instance. The only way to get this data programmatically is via API calls.
For the purposes of this code base, that means using collectionspace-client to interact with the API, and this requires, at minimum, management of authentication credentials for a user with at least the TENANT_READER role/permission. The purpose of cspace-config-untangler is solely to read available CollectionSpace configuration info, not to make any changes to any CollectionSpace instance, so we recommend against configuring connections with more than TENANT_READER permissions.
Currently, you only need to create this config file if you wish to run commands found beneath ccu vocabs
. If you run one of those commands on a profile without a client connection config, it will just fail gracefully, telling you that you need to configure a connection.
You do not need to create a client connection config file if you are working only with the community supported profiles. The Reader
user in those instances is set up following a pattern which works without needing configuration.
By default, the application will look for the config file in (your home directory}/.config/cspace-config-untangler/client_connection_config.yml
. You can override that that by changing the default value of the :client_connection_config_path
setting in lib/cspace_config_untangler.rb
from nil
to your path (a String).
The config file must be a valid YAML file.
Here is a sample, which will be explained below:
instanced: base_uri: https://collectionspace.institution_d_name.org/cspace-services username: readonlyuser@institution_d_name.org password: randomstring profile: anthro
The top level key is the profile name the connection will be associated, minus any version suffix associated with the profile/UI config file. For example, you may have a UI config file for instancea_8-0-1. The key here would be just instancea
.
The profile key is not required, but can be used to indicate that the instance is using a UI config you have accessible in the Untangler.
|
While we can keep previous versions of UI config files to compare or go back in time, this is not true for client connections. Those are always made to the current, live instance. |
ℹ️
|
There is an --env option on relevant commands that allow you to specify that the dev or qa instances of community supported profiles should be used.
|
Once the setup is done, you should be able to cd
into the cloned directory and type exe/ccu
(or just ccu
if you have installed as a gem) at the command prompt to get the list of available functions with their brief descriptions.
💡
|
The best source of info on what each function does and how to use it is the documentation available from the command line interface (CLI). For the top-level command groups:
For an overview of the specific commands inside a group (using the profiles group as an example):
For details on usage of a specific command (using the profiles compare command as an example):
|
There are detailed instructions for some common tasks in the doc
directory.
❗
|
This tool can only be used confidently with configs from CollectionSpace 6.1 and newer |
-
For 5.2 configs, data source values are not consistently supplied for structured date fields. This is because configuration of the structured date fields was not written out to the JSON config in a standard way until 6.0.
-
The 6.1 release further refined the JSON config output allowing the full functionality of this tool
-
Does not currently report on fields in the
ns2:collectionspace_core
namespace -
Does not currently report on fields in the
rel:relations-common-list
namespace because the way this data is defined in the config is very different from the rest -
contact
andblob
get reported/treated as extensions within the tool, rather than sub-records -
Does not support fields in custom namespaces added to
contact
orblob
-
Do
exe/ccu fields csv -p all
and check whether thedata_type
column has any blank values. If so, probably your profile has configured some fields from extensions in an unexpected manner. This can causeforms/default/props/subpath
values (used to create form_field ids) to not match thefields/document/…/{fieldname}/[config]/messages/name/id
values (used to create field_def ids) for some fields. The Untangler is then unable to match up form_field info with field_def info to generate the necessary combined field info required for fully-populated fields CSV, CSV template, and RecordMapper output. You’ll need to do some hard-coding somewhere in the code to get a match -
Do you have fields with the same name in different namespaces in the same record type? Use
exe/ccu fields nonunique
to generate a listing of any such fields.-
The code tries to automatically fix this here but if any non-unique field names are sneaking through, you may need to hard-code something to fix this. Otherwise, you will get two columns in your CSV template with the same header and it won’t be clear which field that data should be imported into.
-
-
If you have record types with (a) no required field; or (b) multiple required fields, you will need to hard-code
identifier_field
values inrecord_mapper.rb’s `get_id_field
method. -
The
mini
template for a record type is ignored as a source for field information. If you have a field that is used only in amini
template, it will not be included in the field data, mappers, or CSV templates this tool produces. -
RECOMMENDED: add your profile name and the last version of that profile that should be handled with fancy column/fieldname style. If you do not configure this for your profile, you will get warnings on the screen and in your log file, and data exported from CollectionSpace for round-tripping with the CSV importer may not be importable without fixing some column headers. See Other topics > Column styles for more explanation.
Since there is no way to programmatically grab the JSON config, this currently requires you to manually download the JSON config files from the following links. The JSON files should be saved as {profilename}.json
in the data/configs
directory.
❗
|
You must follow the config naming conventions specified below in order for the Untangler to properly identify profile name and version! |
And for the latest dev versions of profiles:
Set CCU.const_set('MAINPROFILE')
value in lib/cspace_config_untangler.rb
.
Config file name must contain the profile name and profile version.
Use _
(underscore) to separate the profile name and profile version sections of the name.
Use -
(hyphen) to separate words/numbers within a section.
Examples:
anthro_4-1-2.json
my-custom-config_2-0.json
This allows the Untangler to split the config file name on _
and unambiguously determine profile name vs. profile version.
Output files follow the same convention, adding the recordtype section:
anthro_4-1-2_concept-associated.json
This is related to:
-
the field names/column headers in CSVs exported from CollectionSpace
-
the field names/column headers in the CSV templates generated by this tool, and for which mapping instructions are generated for CSV import
💡
|
You can pretty much ignore this if:
If you are annoyed by warnings about it on the screen and in your logs, you can configure it, but it won’t really matter what you enter as the last fancy column version |
This mainly affects fields which may be populated with terms from multiple authorities, where several columns of CSV data map into one CollectionSpace data field.
Prior to CollectionSpace 7.0, CollectionSpace export and this tool both tried to create shorter, less redundant column names using a more "fancy" algorithm, but the two tools ended up creating columns with slightly different names. We realized this, and the fact that it would require more data prep for roundtripping, while building 7.0.
In CollectionSpace 7.0 and beyond, the column names are longer and sometimes a bit internally redundant, but they are consistent with each other for both export and import.
For the community profiles, we increment the profile version with each CollectionSpace release, so the version used with 6.1 is enterd in the settings as the last fancy version for each profile.
If this affects you, add a line for your profile to the default_last_fancy_column_versions
hash, and include the version of your profile that was used with CollectionSpace 6.1.
❗
|
If you do not configure this for your profile, the consistent column naming style will be used. If you are on 6.1 and configure this correctly, you will get fancy column headers. You may still have to fix some column names for import (the pre-processing step of the import will warn you about them). You would have to fix a lot more column names if you are exporting from 6.1 (fancy export column names), but using the consistent headers in your CSV import data. |