Skip to content

running the preimport script

craigmcchesney edited this page May 29, 2022 · 24 revisions

Script Overview

The preimport script reads a Kabel workbook, an Excel file containing the cable types and cables for a particular subsystem. It performs validation and creates an output Excel file intended to be used as input to the CDB web portal's import mechanism. It creates a single output workbook whose content depends on whether the script succeeded or failed (further details below).

File locations

The preimport script is called preImport.py, and lives in the following directory of the CDB installation: "tools/developer_tools/utilities/cable_import/". In the github repository, it is found here: https://github.com/AdvancedPhotonSource/ComponentDB/tree/master/tools/developer_tools/utilities/cable_import

The latest input workbook files are maintained in Box at the following location: https://anl.app.box.com/folder/93474654693?s=ss5xelagosenuzs5ttaaai752a1u2tdd

Running Script

To run the script, create a pycharm project or set up a conda environment with the requisite dependencies.

Command Line Parameters

A sample command line looks like the following:

preImport.py --configDir /home/craig/cable-import/run/configDir --kabelWorkbookId "IT 90 PRUITT" --deploymentName LOCAL

required parameters

The "configDir", "kabelWorkbookId", and "deploymentName" command line parameters control the operation of the preimport script.

  • --configDir - Parameter points to the directory containing the script's configuration files. See "Configuration Files", below, for more information.
  • --kabelWorkbookId - Parameter value is the base filename for the workbook-specific config file (located in specified configDir), and the output workbook and log files. The kabelWorkbookId has no meaning, it is simply a symbolic identifier used to lookup and name files. See "Configuration Files", below, for more information and an example.
  • --deploymentName - Specifies the CDB system to use for API processing. The value for this parameter must map to a section in the cdb-deployment-info.conf file, discussed below. In lieu of (or in addition to) this parameter, the url, user, and password may be specified explicitly as discussed below.

optional parameters

In lieu of the deploymentName parameter, the CDB url, user, and password may be specified explicitly on the command line. Explicit command-line values for those parameters override config file values if both options are used.

  • --cdbUrl - URL for the CDB system, e.g., "http://kabelbox.aps.anl.gov:8080/cdb".
  • --cdbUser - CDB user, e.g., "nda".
  • --cdbPassword - CDB password for specified user, e.g., "Ch@ngeMe".

Configuration Files

The script configuration files live in the directory specified in the "configDir" command line parameter.

preimport.conf

Contains options that apply to processing of all kabel workbook input files. It includes three sections: "CableSpecs" and "Cables" (corresponding to the kabel workbook tabs of the same names), and "DEFAULT" which provides default values that apply to both of the other sections. Values specified in the section for a specific sheet override the default values. It is anticipated that this file will not change often.

Here is a sample "preimport.conf" file:

[DEFAULT]
inputDir = /home/craig/cable-import/run/inputDir
outputDir = /home/craig/cable-import/run/outputDir 
cdbProjectId = #APS-U Production
cdbOwnerUserId = #cdb
cdbOwnerGroupId = #APSU_CABLE_CONFIG

[CableSpecs]

[Cables]
cdbMachineDesignRoot = APS-U Facility Design - April 2023

Note that the [CableSpecs] section is provided even though it is empty. This is due to a quirk in the operation of the Python config parser and the fact that we are only using the default values for this section.

  • inputDir - directory containing input Excel workbook files
  • outputDir - directory in which to write output workbook and log files
  • cdbProjectId - default CDB project id for each item
  • cdbOwnerUserId - default cdb owner user id for each item
  • cdbOwnerGroupId - default cdb owner group id for each item
  • cdbMachineDesignRoot - name of the root CDB machine item whose hierarchy will be searched for the rack and device names specified in input kabel workbook

workbook-specific configuration files

Running the script with --kabelWorkbookId "IT 90 PRUITT" means we look for the config file "IT 90 PRUITT.conf" in the configDir. The workbook config file includes the same three sections as described above for the master config file. The kabelWorkbookId is just a symbolic name for the corresponding excel file and settings for reading that file. The only requirement is that there is a config file whose base filename is the "kabelWorkbookId" command line parameter value. When a new input workbook file is generated, you shoule create a new workbook.conf file for it.

Here is a sample file group config file, "IT 90 PRUITT.conf":

[DEFAULT]
inputExcelFile = Kabel Information_Tech 90 PRUITT.xlsx
cdbTechSystemId = #IT
validateOnly = False

[CableSpecs]
kabelWorkbookTechGroup = Information_Tech

[Cables]
ignorePortColumns = False
  • inputExcelFile - Excel workbook file containing the group's cable types and cables
  • cdbTechSystemId - default CDB tech system identifier for each item
  • validateOnly - if True, runs the script in a mode to validate the spreadsheet contents only, doesn't invoke any CDB API's to retrieve data
  • kabelWorkbookTechGroup - specifies the name of the kabel workbook's technical system owner which is used to find the column of the workbook's "CableTypes" sheet that includes the cable types for the technical system (by searching the first row), and the section of the "CableSpecs" that includes details for those cables (by searching the first column).
  • ignorePortColumns - specifies that values in the "from port" and "to port" columns of the kabel workbook's Cables tab should not be included in the cable design import tab of the output workbook. This option is provided for use on a system where the device ports may not yet be defined in CDB, but the port names are specified in the kabel workbook. Note that the output workbook now includes a "Catalog Port Import" tab for creating the catalog item ports used in the kabel workbook.

cdb-deployment-info.conf

Maps a symbolic name for each deployment to the CDB url/user/password settings for that deployment. So if we specify "--deploymentName KABELBOX" on the command line, the script looks for a section [KABELBOX] in the "cdb-deployment-info.conf" file and uses the specified url/user/password. There is a [DEFAULT] section that applies to all deployments. If you want to explicitly specify a cdb url/user/password (or override what's in the deployment config file), you can specify cdbUrl, cdbUser, and cdbPassword on the command line.

Here is a sample deployment info file, "cdb-deployment-info.conf":

[DEFAULT]
cdbUser = cdb
cdbPassword = cdb

[KABELBOX]
cdbUrl = http://kabelbox.aps.anl.gov:8080/cdb

[LOCAL]
cdbUrl = http://localhost:8080/cdb
  • cdbUrl - the URL for the CDB system to connect to
  • cdbUser - CDB userid to use for connection
  • cdbPassword - CDB password to use for connection

Script Outputs

Files created by the script are written to the directory specified by the "outputDir" resource in preimport.conf. The script generates an Excel workbook and a log file, both of which are named using the value of the "kabelWorkbookId" command line option. For example, if we run the script with --kabelWorkbookId "IT 90 PRUITT", the following output files are created:

  • IT 90 PRUITT.xlsx - This is the primary output of the script and its contents are described in more detail below.
  • IT 90 PRUITT.log - a detailed log file detailing the row-by-row processing of the input file. This is a good place to look when there are inexplicable errors.

successful execution

On a successful run, the output workbook contains two summary sheets and several "import sheets". The summary sheets describe processing of the input sheets for Cable Catalog and Cable Design items. The non-empty import sheets should be imported to CDB. Empty import sheets indicate that no new items of that type were encountered for import, but the empty sheets are added to the output workbook for consistency. The output workbook contains the following sheets:

  • CableSpecs Sheet Summary - Summarizes processing of the "CableSpecs" input sheet. The first column "summary messages" includes number of rows in input sheet, number of connector types that exist in CDB or are newly encountered, number of existing/new source items, and number of existing/new cable types. There are additional detail columns with the names of the existing/new items when non-zero.
  • Connector Type Import - CDB import sheet for any new connector types (those not already in CDB)
  • Source Item Import - CDB import sheet for any sources (manufacturers) not already in CDB
  • Cable Catalog Item Compare – CDB import sheet for comparing cable catalog items that already exist in CDB with the kabel workbook cable type of the same name. This sheet can be imported to CDB using "compare" mode to highlight the differences between the kabel workbook and CDB cable types.
  • Cable Catalog Item Import – CDB import sheet for creating the cable types that don’t already exist in CDB
  • Cables Sheet Summary – summarizes processing of Cables tab from kabel workbook. First column "summary messages" includes number of rows in input sheet, number of new cable design items for import to CDB, cable types not defined in CDB (and included in the import sheet above), and warnings about port values ignored in input processing.
  • Catalog Port Import – CDB import sheet for all ports used in the Cables sheet
  • Cable Inventory Item Import - Contains cable inventory items, one per cable design item in the cable design import sheet. Cable inventory items are named using the corresponding cable design item's name. If cable inventory items will be imported, it would be a good idea to fill in the cable design item import sheet's "Assigned Inventory Tag" column with the inventory unit names, which should be a simple matter of copying the values in the "Name" column to the "Assigned Inventory Tag" column.
  • Cable Design Item Compare - This sheet contains one row for each cable design item name from the input sheet that is found to exist in CDB. This sheet can be imported to CDB using "compare" mode to highlight differences between the kabel workbook and CDB items.
  • Cable Design Item Import - CDB import sheet for importing cable design items.  This now includes all items from the Cables tab whether or not they exist in CDB.  The user will decide how to handle cables that already exist in CDB

execution with errors

When the script encounters errors in the input sheet, the output workbook only includes sheets describing the errors. I chose not to add partial import sheets to the output workbook in this case to avoid confusion as to whether the script succeeded or failed. The content depends on the scenario:

  • Processing CableSpecs sheet failed: Output workbook includes a "CableSpecs Sheet Errors" sheet with an "error messages" column with summary data about the errors, and then extra columns with additional detail about the specific errors where appropriate. The "CableSpecs Sheet Row Errors" sheet lists the validation errors encountered for each row in the CableSpecs sheet.
  • Processing CableSpecs succeeded but Cables failed: Output workbook includes "Cables Sheet Errors" and "Cables Sheet Row Errors" as described above.
Clone this wiki locally