generated from ibm-client-engineering/solution-template
Merge pull request #82 from ibm-client-engineering/ross-updates
Ross updates
Showing 1 changed file with 65 additions and 0 deletions.
@@ -0,0 +1,65 @@
---
title: Log 19 🛫
description: Flight Log of Co-Creation Activities
slug: flight-log-19
tags: [log]
---

## Objective
Deploy watsonx.ai on self-managed AWS infrastructure for customer software evaluation.

```mermaid
flowchart LR
A(Deploy bootnode) --> B(Deploy infrastructure)
B -->C(Deploy OCP)
subgraph "You are here"
D(Prepare CP4D & watsonx.ai cartridge)
end
C -->D
D -->E(Install CP4D)
E -->F(Deploy watsonx.ai)
```

## Milestones
1. Deploy and configure the boot node to establish a beachhead in the customer AWS environment
   - Complete
2. Deploy OCP using the documented UPI installation steps
   - Complete
3. Install Cloud Pak for Data
   - In Progress
4. Deploy and configure watsonx.ai on self-managed AWS infrastructure in the reference environment and document the process
   - In Progress

### Summary
- Awaiting entitlement key approval on the customer side

## Decisions and Action Items (DAI)
- Software evaluation is awaiting the customer's approval process, which blocks our ability to download software from cp.icr.io
  - Customer to provide by EOD Monday
  - Worker nodes are shut down until approval comes through
- Drafted and sent instructions for the customer to resize the worker node disks for when the cluster is brought back online
- Drafted and sent instructions for the customer to order a GPU node
  - GPU node to be added to the cluster and then cordoned, drained, and shut down
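
The cordon and drain steps for the GPU node could be scripted roughly as follows. This is a minimal sketch, assuming the `oc` CLI is installed and already logged in to the cluster. The node name `gpu-node-0` and the helpers `park_node_commands` and `park_node` are hypothetical, and powering off the instance afterwards is left to the AWS console or CLI. Older clients may expect `--delete-local-data` in place of `--delete-emptydir-data`.

```python
import subprocess


def park_node_commands(node_name: str) -> list[list[str]]:
    """Build the oc commands that cordon and then drain a node."""
    return [
        # Mark the node unschedulable so no new pods are placed on it.
        ["oc", "adm", "cordon", node_name],
        # Evict running pods; DaemonSet pods are skipped and emptyDir data is discarded.
        ["oc", "adm", "drain", node_name,
         "--ignore-daemonsets", "--delete-emptydir-data"],
    ]


def park_node(node_name: str, dry_run: bool = True) -> None:
    """Print each command, and run it only when dry_run is False."""
    for cmd in park_node_commands(node_name):
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical node name; the default dry run only prints the commands.
    park_node("gpu-node-0")
```

Keeping the dry run as the default lets the commands be reviewed with the customer before anything is executed against the cluster.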

## Lessons Learned
- Cluster sizing in preparation for Cloud Pak for Data on OpenShift needed to be adjusted because CPU resources were under-provisioned
- The watsonx.ai service requires larger local disks on worker nodes (500 GB)
- The GPU node required for watsonx.ai seems to be a limited resource

## Next Steps
- License and configure Cloud Pak for Data
- Cloud Pak Considerations
  - Security scans needed on container images
  - Customer requires on-prem, offline install
  - Customer uses their own container registry, which might introduce extra effort or compatibility issues
  - Version compatibility with OpenShift (e.g. 4.10 required and customer has 4.11)
  - Supported storage not available
  - Multiple Cloud Paks on the same cluster
  - Custom connections to data sources not supported out of the box (OOTB)
  - AWS-specific: IAM users are required for install/deploy but are not allowed
  - OpenShift-specific: CoreOS requirement for control nodes
  - Automatic updating of the Cloud Pak can interrupt engagements (the solution is to always remove update polling from operators)
- Resize local disks for worker nodes
- Customer to order a GPU node and attach it to the cluster
- Deploy watsonx.ai