From 1319e43a26e23ccb19d7692a8fca163c7db08ebc Mon Sep 17 00:00:00 2001
From: Mitch Harding
Date: Wed, 4 Sep 2024 17:22:46 -0400
Subject: [PATCH] CASMINST-6993: Linting (#5346)

(cherry picked from commit 1fecb26046ed70a38a30dd09d5d89ac428f15c92)
---
 operations/firmware/FAS_Use_Cases.md | 10 ++++----
 ..._from_a_Liquid_Cooled_Cabinet_EPO_Event.md | 22 ++++++++---------
 ...-Cooled_Cabinet_Global_Default_Password.md | 24 +++++++++----------
 ...used_by_the_Redfish_Translation_Service.md | 4 ++--
 4 files changed, 30 insertions(+), 30 deletions(-)

diff --git a/operations/firmware/FAS_Use_Cases.md b/operations/firmware/FAS_Use_Cases.md
index 72eabcb64fa7..be4ece70d9b9 100644
--- a/operations/firmware/FAS_Use_Cases.md
+++ b/operations/firmware/FAS_Use_Cases.md
@@ -8,11 +8,11 @@ Refer to [FAS Filters](FAS_Filters.md) for more information on the content used

 The following procedures are included in this section:

-1. [Update Liquid-Cooled Compute Node BMC, FPGA, and BIOS](#liquidcooled)
-1. [Update Air-Cooled Compute Node BMC, BIOS, iLO 5, and System ROM](#aircooled)
-1. [Update Chassis Management Module (CMM) Firmware](#cmm)
-1. [Update NCN BIOS and BMC Firmware with FAS](#ncn-bios-bmc)
-1. [Compute Node BIOS Workaround for HPE CRAY EX425](#cn-workaround)
+* [Update Liquid-Cooled Compute Node BMC, FPGA, and BIOS](#liquidcooled)
+* [Update Air-Cooled Compute Node BMC, BIOS, iLO 5, and System ROM](#aircooled)
+* [Update Chassis Management Module (CMM) Firmware](#cmm)
+* [Update NCN BIOS and BMC Firmware with FAS](#ncn-bios-bmc)
+* [Compute Node BIOS Workaround for HPE CRAY EX425](#cn-workaround)

 > **NOTE:** To update Switch Controllers \(sC\) or `RouterBMC`, refer to the Rosetta Documentation.
diff --git a/operations/power_management/Recover_from_a_Liquid_Cooled_Cabinet_EPO_Event.md b/operations/power_management/Recover_from_a_Liquid_Cooled_Cabinet_EPO_Event.md
index 1d25e1d44b76..c785b91eec6c 100644
--- a/operations/power_management/Recover_from_a_Liquid_Cooled_Cabinet_EPO_Event.md
+++ b/operations/power_management/Recover_from_a_Liquid_Cooled_Cabinet_EPO_Event.md
@@ -13,7 +13,7 @@ If a Cray EX liquid-cooled cabinet or cooling group experiences an EPO event, th
 1. Check the status of the chassis.

 ```bash
- ncn-m001# cray capmc get_xname_status create --xnames x9000c[1,3] --format toml
+ ncn-mw# cray capmc get_xname_status create --xnames x9000c[1,3] --format toml
 ```

 Example output:
@@ -29,7 +29,7 @@ If a Cray EX liquid-cooled cabinet or cooling group experiences an EPO event, th
 A cabinet has eight chassis.

 ```bash
- ncn-m001# kubectl logs -n services -l app.kubernetes.io/name=cray-capm -c cray-capmc --tail -1 | grep EPO -A 10
+ ncn-mw# kubectl logs -n services -l app.kubernetes.io/name=cray-capmc -c cray-capmc --tail -1 | grep EPO -A 10
 ```

 Example output:
@@ -46,7 +46,7 @@ If a Cray EX liquid-cooled cabinet or cooling group experiences an EPO event, th
 1. Disable the `hms-discovery` Kubernetes CronJob.

 ```bash
- ncn-m001# kubectl -n services patch cronjobs hms-discovery -p '{"spec" : {"suspend" : true }}'
+ ncn-mw# kubectl -n services patch cronjobs hms-discovery -p '{"spec" : {"suspend" : true }}'
 ```

 **CAUTION:** Do not power the system on until it is safe to do so. Determine why the EPO event occurred before clearing the EPO state.
@@ -56,7 +56,7 @@ If a Cray EX liquid-cooled cabinet or cooling group experiences an EPO event, th
 All chassis in cabinets 1000-1003 are forced off in this example. Power off all chassis in a cooling group simultaneously, or the EPO condition may persist.
 ```bash
- ncn-m001# cray capmc xname_off create --xnames x[1000-1003]c[0-7] --force true --format toml
+ ncn-mw# cray capmc xname_off create --xnames x[1000-1003]c[0-7] --force true --format toml
 ```

 Example output:
@@ -69,7 +69,7 @@ If a Cray EX liquid-cooled cabinet or cooling group experiences an EPO event, th
 The HPE Cray EX EX TDS cabinet contains only two chassis: 1 \(bottom\) and 3 \(top\).

 ```bash
- ncn-m001# cray capmc xname_off create --xnames x9000c[1,3] --force true --format toml
+ ncn-mw# cray capmc xname_off create --xnames x9000c[1,3] --force true --format toml
 ```

 Example output:
@@ -82,14 +82,14 @@ If a Cray EX liquid-cooled cabinet or cooling group experiences an EPO event, th
 1. Restart the `hms-discovery` CronJob.

 ```bash
- ncn-m001# kubectl -n services patch cronjobs hms-discovery -p '{"spec" : {"suspend" : false }}'
+ ncn-mw# kubectl -n services patch cronjobs hms-discovery -p '{"spec" : {"suspend" : false }}'
 ```

 About 5 minutes after `hms-discovery` restarts, the service will power on the chassis enclosures, switches, and compute blades.
 If components are not being powered back on, then power them on manually.

 ```bash
- ncn-m001# cray capmc xname_on create --xnames x[1000-1003]c[0-7]r[0-7],x[1000-1003]c[0-7]s[0-7] --prereq true --continue true --format toml
+ ncn-mw# cray capmc xname_on create --xnames x[1000-1003]c[0-7]r[0-7],x[1000-1003]c[0-7]s[0-7] --prereq true --continue true --format toml
 ```

 Example output:
@@ -99,12 +99,12 @@ If a Cray EX liquid-cooled cabinet or cooling group experiences an EPO event, th
 err_msg = ""
 ```

-1. Bring up the Slingshot Fabric.
+1. Verify the Slingshot fabric is up and healthy.

- Refer to the following documentation for more information on how to bring up the Slingshot Fabric:
+ Refer to the following documentation for more information on how to verify the health of the Slingshot Fabric:

- - The *Slingshot Administration Guide* PDF for HPE Cray EX systems.
- - The *Slingshot Troubleshooting Guide* PDF.
+ * The *Slingshot Administration Guide* PDF for HPE Cray EX systems.
+ * The *Slingshot Troubleshooting Guide* PDF.

 1. After the components have powered on, boot the nodes using the Boot Orchestration Services \(BOS\).

diff --git a/operations/security_and_authentication/Change_EX_Liquid-Cooled_Cabinet_Global_Default_Password.md b/operations/security_and_authentication/Change_EX_Liquid-Cooled_Cabinet_Global_Default_Password.md
index d7fe0e817a95..9864a3f2b716 100644
--- a/operations/security_and_authentication/Change_EX_Liquid-Cooled_Cabinet_Global_Default_Password.md
+++ b/operations/security_and_authentication/Change_EX_Liquid-Cooled_Cabinet_Global_Default_Password.md
@@ -1,47 +1,47 @@
 # Change Cray EX Liquid-Cooled Cabinet Global Default Password

-This procedure changes the global default `root` credential on HPE Cray EX liquid-cooled cabinet embedded controllers (BMCs). The chassis management module (CMM) controller (cC), node controller (nC), and Slingshot switch controller (sC) are generically referred to as "BMCs" in these procedures.
+This procedure changes the global default `root` credential on HPE Cray EX liquid-cooled cabinet embedded controllers (BMCs).
+The chassis management module (CMM) controller (cC), node controller (nC), and Slingshot switch controller (sC) are generically referred to as "BMCs" in these procedures.

 ## Prerequisites

 - The Cray command line interface (CLI) tool is initialized and configured on the system. See [Configure the Cray Command Line Interface (`cray` CLI)](../configure_cray_cli.md) for more information.
 - Review procedures in [Manage System Passwords](Manage_System_Passwords.md).

-### Procedure
+## Procedure

 1. If necessary, shut down compute nodes in each cabinet. Refer to [Shut Down and Power Off Compute and User Access Nodes](../power_management/Shut_Down_and_Power_Off_Compute_and_User_Access_Nodes.md).
 ```screen
- ncn-m001# sat bootsys shutdown --stage bos-operations --bos-templates COS_SESSION_TEMPLATE
+ ncn-mw# sat bootsys shutdown --stage bos-operations --bos-templates COS_SESSION_TEMPLATE
 ```

-2. Disable the `hms-discovery` Kubernetes cron job.
+1. Disable the `hms-discovery` Kubernetes cron job.

 ```screen
- ncn-m001# kubectl -n services patch cronjobs hms-discovery -p '{"spec" : {"suspend" : true }}'
+ ncn-mw# kubectl -n services patch cronjobs hms-discovery -p '{"spec" : {"suspend" : true }}'
 ```

-3. Power off all compute slots in the cabinets the passwords are to be changed on.
+1. Power off all compute slots in the cabinets the passwords are to be changed on.

 **Note**: If a chassis is not fully populated, specify each slot individually.

 Example showing fully populated cabinets 1000-1003:

 ```screen
- ncn-m001# cray capmc xname_off create --xnames x[1000-1003]c[0-7]s[0-7] --format json
+ ncn-mw# cray capmc xname_off create --xnames x[1000-1003]c[0-7]s[0-7] --format json
 ```

 Check the power status:

 ```screen
- ncn-m001# cray capmc get_xname_status create --xnames x[1000-1003]c[0-7]s[0-7] --format json
+ ncn-mw# cray capmc get_xname_status create --xnames x[1000-1003]c[0-7]s[0-7] --format json
 ```

 Continue when all compute slots are `Off`.

-4. Perform the procedures in [Provisioning a Liquid-Cooled EX Cabinet CEC with Default Credentials](Provisioning_a_Liquid-Cooled_EX_Cabinet_CEC_with_Default_Credentials.md).
+1. Perform the procedures in [Provisioning a Liquid-Cooled EX Cabinet CEC with Default Credentials](Provisioning_a_Liquid-Cooled_EX_Cabinet_CEC_with_Default_Credentials.md).

-5. Perform the procedures in [Updating the Liquid-Cooled EX Cabinet Default Credentials after a CEC Password Change](Updating_the_Liquid-Cooled_EX_Cabinet_Default_Credentials_after_a_CEC_Password_Change.md).
-
-6. To update Slingshot switch BMCs, refer to "Change Rosetta Login and Redfish API Credentials" in the *Slingshot Operations Guide (> 1.6.0)*.
+1. Perform the procedures in [Updating the Liquid-Cooled EX Cabinet Default Credentials after a CEC Password Change](Updating_the_Liquid-Cooled_EX_Cabinet_Default_Credentials_after_a_CEC_Password_Change.md).
+1. To update Slingshot switch BMCs, refer to "Change Rosetta Login and Redfish API Credentials" in the *Slingshot Operations Guide (> 1.6.0)*.
diff --git a/operations/security_and_authentication/Update_Default_ServerTech_PDU_Credentials_used_by_the_Redfish_Translation_Service.md b/operations/security_and_authentication/Update_Default_ServerTech_PDU_Credentials_used_by_the_Redfish_Translation_Service.md
index a0f14bebd93c..658e86e6a1e5 100644
--- a/operations/security_and_authentication/Update_Default_ServerTech_PDU_Credentials_used_by_the_Redfish_Translation_Service.md
+++ b/operations/security_and_authentication/Update_Default_ServerTech_PDU_Credentials_used_by_the_Redfish_Translation_Service.md
@@ -7,8 +7,8 @@ ServerTech PDUs which do not natively support Redfish.

 There are two sets of default credentials that are required for RTS to function:

-1. The default credentials to use when new ServerTech PDUs are discovered in the system.
-1. The global default credential that RTS uses for its Redfish interface with other CSM services.
+- The default credentials to use when new ServerTech PDUs are discovered in the system.
+- The global default credential that RTS uses for its Redfish interface with other CSM services.

 **Important:**: After this procedure is completed **going forward all future ServerTech PDUs** added to the system will be assumed to be already configured with the new global default credential when getting added to the system.