CASMTRIAGE-7185: Add Power Off Management Cabinets procedure #5254

Merged
merged 6 commits into from
Aug 6, 2024
Changes from 4 commits
30 changes: 30 additions & 0 deletions operations/power_management/Power_Off_Management_Cabinets.md
@@ -0,0 +1,30 @@
# Power Off Management Cabinets

Power off the PDUs and any components in management cabinets that are still powered on, such as Slingshot switches, management switches, and KVM devices.

## Power Off Management Cabinet PDU circuit breakers

**CAUTION:** The nodes and switches in management cabinets should only
be powered off when it has been confirmed that the management Kubernetes cluster and any Lustre or Spectrum Scale filesystems in the cabinets have been cleanly shut down. See the procedures in
[Power Off the External File Systems](System_Power_Off_Procedures.md#Power_off_the_External_File_systems)
and [Shut Down and Power Off the Management Kubernetes Cluster](Shut_Down_and_Power_Off_the_Management_Kubernetes_Cluster.md).
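
As a rough illustration of the confirmation this caution asks for, the sketch below reuses the `ipmitool ... chassis power status` command from the Kubernetes shutdown procedure to report any BMC whose chassis is not yet off. The function name `check_chassis_off` and the BMC hostnames in the usage comment are hypothetical placeholders, and the sketch assumes it runs from a host that can still reach the BMC network with `IPMI_PASSWORD` exported for the `-E` option. It only checks chassis power; it does not prove that the file systems or the Kubernetes cluster were shut down cleanly.

```bash
#!/usr/bin/env bash
# Sketch only: report whether each listed BMC shows its chassis as powered off.
# Assumptions: IPMI_PASSWORD is exported (read by ipmitool -E) and the BMC
# hostnames passed in are site-specific placeholders.

# Usage: check_chassis_off USERNAME BMC_HOST [BMC_HOST...]
# Prints one line per BMC and returns non-zero if any chassis is not
# confirmed off (including when the BMC cannot be reached).
check_chassis_off() {
    local user="$1"
    shift
    local bmc status rc=0
    for bmc in "$@"; do
        status="$(ipmitool -I lanplus -U "${user}" -E -H "${bmc}" chassis power status 2>&1)"
        if [[ "${status}" == *"Chassis Power is off"* ]]; then
            echo "${bmc}: off"
        else
            echo "${bmc}: NOT confirmed off (${status})"
            rc=1
        fi
    done
    return "${rc}"
}

# Example invocation (hostnames are placeholders):
#   check_chassis_off "${USERNAME}" ncn-m001-mgmt ncn-w001-mgmt
```

A non-zero return value means at least one chassis was not reported as off, which is a reason to stop and investigate before setting any PDU circuit breaker to `OFF`.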

1. (Optional) Power down the modular coolant distribution unit (MDCU) in a liquid-cooled HPE Cray EX2000 cabinet.

**CAUTION:** The modular coolant distribution unit (MDCU) in a liquid-cooled HPE Cray EX2000 cabinet (also
referred to as a Hill or TDS cabinet) typically receives power from its management cabinet PDUs. If the
system includes an EX2000 cabinet, then do not power off the management cabinet PDUs until the MDCU has
been powered off. Powering off the MDCU will cause an emergency power off (EPO) of the cabinet and may
result in data loss or equipment damage.

1. Set each management cabinet PDU circuit breaker to `OFF`.

A slotted screwdriver may be required to open PDU circuit breakers.

1. To power off Motivair liquid-cooled chilled doors and CDUs, locate the power off switch on the CDU control panel and set it to `OFF`.

Refer to vendor documentation for the chilled-door cooling system for power control procedures when chilled doors are installed on standard racks.

## Next step

Return to [System Power Off Procedures](System_Power_Off_Procedures.md) and continue with the next step.
@@ -5,7 +5,7 @@ Power off storage nodes and management switches in standard racks.
## Power off standard rack PDU circuit breakers

**CAUTION:** The Lustre or Spectrum Scale (GPFS) file systems on nodes and switches in storage cabinets should only
be powered off once it has been confirmed that the filesystems have been cleanly shut down. See the procedures in
be powered off when it has been confirmed that the filesystems have been cleanly shut down. See the procedures in
[Power Off the External File Systems](System_Power_Off_Procedures.md#Power_off_the_External_File_systems).

1. Set each cabinet PDU circuit breaker to `OFF`.
@@ -421,15 +421,6 @@ documentation (`S-8031`) for instructions on how to acquire a SAT authentication
ipmitool -I lanplus -U "${USERNAME}" -E -H NCN-M001_BMC_HOSTNAME chassis power status
```

1. (Optional) Power down Modular coolant distribution unit (MDCU) in a liquid-cooled HPE Cray EX2000 cabinet.

**CAUTION:** The modular coolant distribution unit \(MDCU\) in a liquid-cooled HPE Cray EX2000 cabinet (also referred to as a Hill or TDS cabinet) typically receives power from its management
cabinet PDUs. If the system includes an EX2000 cabinet, then **do not power off** the management cabinet PDUs. Powering off the MDCU will cause an emergency power off \(EPO\) of the cabinet and
may result in data loss or equipment damage.

1. (Optional) If a liquid-cooled EX2000 cabinet is not receiving MDCU power from this management cabinet, then power off the PDU circuit breakers or disconnect the PDUs from facility power and
follow lock out/tag out procedures for the site.

## Next step

Return to [System Power Off Procedures](System_Power_Off_Procedures.md) and continue with the next step.
4 changes: 4 additions & 0 deletions operations/power_management/System_Power_Off_Procedures.md
@@ -41,6 +41,10 @@ To power off standard racks which have only storage nodes and switches, refer to

To shut down the management Kubernetes cluster, refer to [Shut Down and Power Off the Management Kubernetes Cluster](Shut_Down_and_Power_Off_the_Management_Kubernetes_Cluster.md).

## Power Off Management Cabinets

To power off management cabinets, refer to [Power Off Management Cabinets](Power_Off_Management_Cabinets.md).

## `Lockout Tagout` Facility Power

If facility power must be removed from a single cabinet or cabinet group for maintenance, follow proper `lockout-tagout` procedures for the site.