diff --git a/flight-logs/2024-04-12-cocreate.mdx b/flight-logs/2024-04-12-cocreate.mdx
new file mode 100644
index 00000000..42d194ae
--- /dev/null
+++ b/flight-logs/2024-04-12-cocreate.mdx
@@ -0,0 +1,65 @@
+---
+title: Log 19 🛫
+description: Flight Log of Co-Creation Activities
+slug: flight-log-19
+tags: [log]
+---
+
+## Objective
+Deploy watsonx.ai on self-managed AWS infrastructure for customer software evaluation
+
+```mermaid
+flowchart LR
+    A(Deploy bootnode) --> B(Deploy infrastructure)
+    B -->C(Deploy OCP)
+    subgraph "You are here"
+    D(Prepare CP4D & watsonx.ai cartridge)
+    end
+    C -->D
+    D -->E(Install CP4D)
+    E -->F(Deploy watsonx.ai)
+```
+
+
+## Milestones
+1. Deploy and configure the boot node to establish a beachhead in the customer AWS environment
+    - Complete
+2. Deploy OCP using the documented UPI installation steps
+    - Complete
+3. Install Cloud Pak for Data
+    - In Progress
+4. Deploy and configure watsonx.ai on self-managed AWS infrastructure on ref environment and document
+    - In Progress
+
+### Summary
+- Awaiting entitlement key approval on customer side
+
+## Decisions and Action Items (DAI)
+- Software evaluation awaiting customer's approval process. This blocks our ability to download software from cp.icr.io
+    - Customer to provide by EOD Monday
+- Worker nodes shut down until approval comes through
+- Drafted and sent instructions for the customer to resize the worker node disks for when the cluster is brought back online
+- Drafted and sent instructions for the customer to order a GPU node
+    - GPU node to be added to the cluster and then cordoned, drained, and shut down
+
+## Lessons Learned
+- The OpenShift sizing prepared for Cloud Pak for Data had to be adjusted because CPU resources were under-provisioned
+- watsonx.ai service requires larger local disks on worker nodes (500 GB)
+- The GPU node required for watsonx.ai seems to be a limited resource
+
+## Next Steps
+- License and configure Cloud Pak for Data
+    - Cloud Pak Considerations
+        - Security scans needed on container images
+        - Customer requires on-prem, offline install
+        - Customer uses their own container registry that might introduce extra effort or compatibility issues
+        - Version compatibility with OpenShift (e.g. 4.10 required and customer has 4.11)
+        - Supported storage not available
+        - Multiple Cloud Paks on the same cluster
+        - Custom connections to data sources not supported OOTB
+        - AWS-specific: IAM users required for install/deploy and are not allowed
+        - OpenShift-specific: CoreOS requirement for control nodes
+        - Automatic updating of Cloud Pak can interrupt engagements (solution is to always remove update polling from operators)
+    - Resize local disks for worker nodes
+    - Customer to order a GPU node and attach it to the cluster
+- Deploy watsonx.ai
\ No newline at end of file