How-To setup developer OSD cluster (step by step copy/paste guide)

Prerequisites

You will need the following CLI tools in order to follow this copy/paste guide.

  1. jq and yq - JSON and YAML query CLI tools.
  2. bw - BitWarden CLI. We need it to fetch values from BitWarden directly instead of copying them manually.
  3. ocm - OpenShift Cluster Manager CLI. We need it to create and manage the OSD cluster.
  4. oc - OpenShift CLI (similar to kubectl). We need it to deploy resources into the OSD cluster.
  5. ktunnel - Reverse tunnel that exposes a local service inside a Kubernetes cluster. You can find more info here: https://github.com/omrikiei/ktunnel
  6. watch - (optional) Repeatedly executes a specific command.
  7. grpcurl - (optional) Required for executing gRPC calls.

Additionally, you will need quay.io credentials.
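
Optionally, as a quick sanity check, you can verify that all required tools are available on your PATH before you start. This is a minimal sketch; adjust the tool list to what you actually plan to use.

# Report any required (or optional) tool that is missing from PATH
for tool in jq yq bw ocm oc ktunnel watch grpcurl; do
  command -v "${tool}" >/dev/null || echo "missing: ${tool}"
done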

Intro

All commands should be executed in the root directory of the stackrox/acs-fleet-manager project.

Create development OSD Cluster

  1. Create development OSD Cluster with ocm

Export a name for your cluster. Prefix it with your initials or something similar to avoid name collisions, e.g. mt-osd-1307.

export OSD_CLUSTER_NAME="<your cluster name>"

To create a development OSD cluster on the OCM staging platform, you have to log in to the staging platform with the rhacs-managed-service-dev account. To retrieve the token required for logging in via the ocm command, go to https://console.redhat.com/openshift/token/show# and log in there as rhacs-managed-service-dev. You can find the rhacs-managed-service-dev login credentials in BitWarden.

The ocm command is aware of the different environments; passing --url staging is all that is required to log in to the OCM staging platform.

ocm login --url staging --token="<your token from OpenShift console UI - console.redhat.com>"

The staging UI is accessible at this URL: https://qaprodauth.cloud.redhat.com

To ensure that the account has enough quota, run the following command and check the output:

ocm list quota | grep -E "QUOTA|osd"

Create cluster with ocm command

# Get AWS Keys from BitWarden
export AWS_REGION="us-east-1"
export AWS_ACCOUNT_ID=$(bw get item "23a0e6d6-7b7d-44c8-b8d0-aecc00e1fa0a" | jq '.fields[] | select(.name | contains("AccountID")) | .value' --raw-output)
export AWS_ACCESS_KEY_ID=$(bw get item "23a0e6d6-7b7d-44c8-b8d0-aecc00e1fa0a" | jq '.fields[] | select(.name | contains("AccessKeyID")) | .value' --raw-output)
export AWS_SECRET_ACCESS_KEY=$(bw get item "23a0e6d6-7b7d-44c8-b8d0-aecc00e1fa0a" | jq '.fields[] | select(.name | contains("SecretAccessKey")) | .value' --raw-output)

# Execute creation command
ocm create cluster \
  --ccs \
  --aws-access-key-id "${AWS_ACCESS_KEY_ID}" \
  --aws-account-id "${AWS_ACCOUNT_ID}" \
  --aws-secret-access-key "${AWS_SECRET_ACCESS_KEY}" \
  --region "${AWS_REGION}" \
  --multi-az \
  --compute-machine-type "m5a.xlarge" \
  --version "4.11.2" \
  "${OSD_CLUSTER_NAME}"

The command output should contain the ID of the cluster. Export that ID as the CLUSTER_ID environment variable.

export CLUSTER_ID="<ID of the cluster>"
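
If you prefer not to copy the ID by hand, you can look it up by cluster name instead. This is a sketch that assumes the cluster name is unique and that the ID is the first column of the ocm list clusters output.

# Look up the cluster ID by name (assumes a unique name; ID is the first column)
export CLUSTER_ID=$(ocm list clusters | grep "${OSD_CLUSTER_NAME}" | awk '{print $1}')
echo "${CLUSTER_ID}"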

Now you have to wait for the cluster to be provisioned. Check the status of the cluster creation:

watch --interval 10 ocm cluster status ${CLUSTER_ID}
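
If you prefer a command that blocks until the cluster is ready instead of watching the status output, something like the following sketch should work. It polls the clusters_mgmt API (used elsewhere in this guide) and assumes the state field becomes ready once provisioning completes.

# Poll the clusters_mgmt API until the cluster state is "ready"
until [ "$(ocm get "/api/clusters_mgmt/v1/clusters/${CLUSTER_ID}" | jq -r '.state')" = "ready" ]; do
  echo "cluster not ready yet, waiting..."
  sleep 60
done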
  2. Add auth provider for OSD cluster

This is required in order to log in to the cluster, either in the UI or with the oc command. You can pick your own admin password; here we generate one by hashing the current date with md5 (on Linux, use md5sum instead). If you need the password for UI login, be sure to store it somewhere.

export OSD_ADMIN_USER="osd-admin"
export OSD_ADMIN_PASS=$(date | md5)
echo $OSD_ADMIN_PASS > ./tmp-osd-admin-pass.txt

ocm create idp \
  --cluster "${CLUSTER_ID}" \
  --type htpasswd \
  --name HTPasswd \
  --username "${OSD_ADMIN_USER}" \
  --password "${OSD_ADMIN_PASS}"

ocm create user \
  --group cluster-admins \
  --cluster "${CLUSTER_ID}" \
  "${OSD_ADMIN_USER}"
  3. Log in to the OSD cluster with the ocm command. This will automatically set the correct context for the oc command.
ocm cluster login "${CLUSTER_ID}"

If the login step fails, it may be that the previously created auth provider and user have not been applied to the cluster yet. Wait a few seconds and try again.

Prepare cluster for RHACS Operator

  1. Export defaults
export RHACS_OPERATOR_CATALOG_VERSION="3.71.0"
export RHACS_OPERATOR_CATALOG_NAME="redhat-operators"
  2. Check whether the latest available ACS Operator version is recent enough for you. If it is, you can skip the next steps prefixed with (ACS operator from branch).

Execute the following command in a separate terminal (new shell).

oc port-forward -n openshift-marketplace svc/redhat-operators 50051:50051

If the port-forward step fails with Unable to connect to the server: x509: certificate signed by unknown authority, wait a few seconds and try again.

grpcurl -plaintext -d '{"name":"rhacs-operator"}' localhost:50051 api.Registry/GetPackage | jq '.channels[0].csvName'
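
If the default channel's CSV is not recent enough, the same api.Registry/GetPackage call can list every channel of the package; only the jq filter differs from the command above.

# Show all channels and their current CSVs for the rhacs-operator package
grpcurl -plaintext -d '{"name":"rhacs-operator"}' localhost:50051 api.Registry/GetPackage | jq '.channels[] | {name, csvName}'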

You can stop the port-forward after this.

  3. (ACS operator from branch) Prepare pull secret. Important: This will change the cluster-wide pull secret. It is not advised to use this on clusters where credentials could be compromised.

Pay attention: docker-credential-osxkeychain is specific to macOS. On Linux, please check docker-credential-secretservice instead.

export QUAY_REGISTRY_AUTH_BASIC=$(docker-credential-osxkeychain get <<<"https://quay.io" | jq -r '"\(.Username):\(.Secret)"')

oc get secret/pull-secret -n openshift-config --template='{{index .data ".dockerconfigjson" | base64decode}}' > ./tmp-pull-secret.json
oc registry login --registry="quay.io/rhacs-eng" --auth-basic="${QUAY_REGISTRY_AUTH_BASIC}" --to=./tmp-pull-secret.json
oc set data secret/pull-secret -n openshift-config --from-file=.dockerconfigjson=./tmp-pull-secret.json
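
To double-check that the quay.io/rhacs-eng entry actually ended up in the pull secret, you can inspect the temporary file generated above.

# The list of registries should now include quay.io/rhacs-eng
jq '.auths | keys' ./tmp-pull-secret.json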
  4. (ACS operator from branch) Deploy catalog

You should find a catalog build from your branch or from the master branch of the stackrox/stackrox repository. Look at the CircleCI job named build-operator and its step Build and push images for quay.io/rhacs-eng. In the log you can find the image tag, something like v3.71.0-16-g3f8fcd60c6. Export that value without the leading v.

export RHACS_OPERATOR_CATALOG_VERSION="<Stackrox Operator Index version>"

Run the following command to register the new ACS operator catalog.

export RHACS_OPERATOR_CATALOG_NAME="rhacs-operators"

oc apply -f - <<EOF
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: ${RHACS_OPERATOR_CATALOG_NAME}
  namespace: openshift-marketplace
spec:
  displayName: 'RHACS Development'
  publisher: 'Red Hat ACS'
  sourceType: grpc
  image: quay.io/rhacs-eng/stackrox-operator-index:v${RHACS_OPERATOR_CATALOG_VERSION}
EOF

By executing:

oc get pods -n openshift-marketplace

You should be able to see the rhacs-operators pod running.
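
You can also check the catalog source itself; once OLM has connected to it, the last observed state should be READY. This relies on the standard OLM CatalogSource status fields.

# Check the connection state of the newly registered catalog source
oc get catalogsource "${RHACS_OPERATOR_CATALOG_NAME}" -n openshift-marketplace -o jsonpath='{.status.connectionState.lastObservedState}'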

Terraform OSD cluster with Fleet Synchronizer

  1. Export defaults
# Copy static token from BitWarden
export STATIC_TOKEN=$(bw get item "64173bbc-d9fb-4d4a-b397-aec20171b025" | jq '.fields[] | select(.name | contains("JWT")) | .value' --raw-output)

export FLEET_MANAGER_IMAGE=quay.io/app-sre/acs-fleet-manager:main
  2. Prepare namespace
export NAMESPACE=rhacs
export FLEET_MANAGER_ENDPOINT="http://fleet-manager.${NAMESPACE}.svc.cluster.local:8000"

oc create namespace "${NAMESPACE}"
  3. (Optional local fleet synchronizer build) Build and push fleet synchronizer
export IMAGE_TAG=osd-test

make image/push/internal
  4. (Optional local fleet synchronizer build) Get Fleet Manager image name
export FLEET_MANAGER_IMAGE=$(oc get route default-route -n openshift-image-registry -o jsonpath="{.spec.host}")/${NAMESPACE}/fleet-manager:${IMAGE_TAG}
  5. Terraform cluster
helm upgrade --install rhacs-terraform \
  --namespace "${NAMESPACE}" \
  --set fleetshardSync.authType="STATIC_TOKEN" \
  --set fleetshardSync.image="${FLEET_MANAGER_IMAGE}" \
  --set fleetshardSync.fleetManagerEndpoint="${FLEET_MANAGER_ENDPOINT}" \
  --set fleetshardSync.staticToken="${STATIC_TOKEN}" \
  --set fleetshardSync.clusterId="${CLUSTER_ID}" \
  --set acsOperator.enabled=true \
  --set acsOperator.source="${RHACS_OPERATOR_CATALOG_NAME}" \
  --set logging.enabled=false \
  --set acsOperator.version="v${RHACS_OPERATOR_CATALOG_VERSION}" \
  --set observability.enabled=false ./dp-terraform/helm/rhacs-terraform
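
Once the helm release is installed, it is worth verifying that the chart deployed successfully and that the fleetshard-sync pod comes up (fleetshard-sync is the deployment name used again later in this guide).

# Verify the helm release and the fleetshard-sync pod
helm list --namespace "${NAMESPACE}"
oc get pods --namespace "${NAMESPACE}"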
  6. Create tunnel from cluster to local machine

Execute the following command in a separate terminal (new shell). Ensure that you use the same namespace as the one defined in $NAMESPACE.

export NAMESPACE=rhacs

ktunnel expose --namespace "${NAMESPACE}" fleet-manager 8000:8000 --reuse

Set up local Fleet Manager

  1. Create OSD Cluster config file for fleet manager

Ensure that you are in the correct kube context.

export OC_CURRENT_CONTEXT=$(oc config current-context)
export OSD_CLUSTER_DOMAIN=$(ocm get /api/clusters_mgmt/v1/clusters/${CLUSTER_ID} | jq '.dns.base_domain' --raw-output)

cat << EOF > "./${CLUSTER_ID}.yaml"
---
clusters:
 - name: '${OC_CURRENT_CONTEXT}'
   cluster_id: '${CLUSTER_ID}'
   cloud_provider: aws
   region: ${AWS_REGION}
   schedulable: true
   status: ready
   multi_az: true
   central_instance_limit: 10
   provider_type: standalone
   supported_instance_type: "eval,standard"
   cluster_dns: '${OSD_CLUSTER_NAME}.${OSD_CLUSTER_DOMAIN}'
EOF
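
Optionally, validate the generated config file with yq (listed in the prerequisites) to catch variable interpolation mistakes early.

# Print the generated data plane cluster config to verify all variables were expanded
yq '.' "./${CLUSTER_ID}.yaml"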
  2. Build, set up, and start the local fleet manager

Execute the following commands in a separate terminal (new shell). Ensure that the same CLUSTER_ID is exported there.

# Build binary
make binary

# Setup DB
make db/teardown db/setup db/migrate

# Start local fleet manager
./fleet-manager serve --dataplane-cluster-config-file "./${CLUSTER_ID}.yaml"
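
When the local fleet manager is up, a quick way to confirm it is serving requests is to list centrals through its public API. This is a sketch that assumes the GET variant of the /api/rhacs/v1/centrals endpoint used later in this guide, with STATIC_TOKEN exported in the current shell.

# List centrals; an empty item list is expected on a fresh setup
curl -s --header "Authorization: Bearer ${STATIC_TOKEN}" "http://localhost:8000/api/rhacs/v1/centrals" | jq .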

Install central

  1. Prepare default values
# Copy static token from BitWarden
export STATIC_TOKEN=$(bw get item "64173bbc-d9fb-4d4a-b397-aec20171b025" | jq '.fields[] | select(.name | contains("JWT")) | .value' --raw-output)

export AWS_REGION="us-east-1"
  2. Call curl to install central
export CENTRAL_ID=$(curl --location --request POST "http://localhost:8000/api/rhacs/v1/centrals?async=true" --header "Content-Type: application/json" --header "Accept: application/json" --header "Authorization: Bearer ${STATIC_TOKEN}" --data-raw "{\"name\":\"test-on-cluster\",\"cloud_provider\":\"aws\",\"region\":\"${AWS_REGION}\",\"multi_az\":true}" | jq '.id' --raw-output)
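
You can poll the request until provisioning finishes. This sketch assumes the single-central endpoint /api/rhacs/v1/centrals/{id} mirrors the collection endpoint used above.

# Watch the provisioning status of the newly requested central
watch --interval 10 "curl -s --header 'Authorization: Bearer ${STATIC_TOKEN}' http://localhost:8000/api/rhacs/v1/centrals/${CENTRAL_ID} | jq '.status'"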
  3. Check that the new namespace is created and that all pods are up and running
export CENTRAL_NAMESPACE="${NAMESPACE}-${CENTRAL_ID}"

oc get pods --namespace "${CENTRAL_NAMESPACE}"
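
If the pods are still starting, you can block until they are ready; this assumes all pods in the central namespace eventually reach the Ready condition.

# Wait until all pods in the central namespace are ready
oc wait --for=condition=Ready pods --all -n "${CENTRAL_NAMESPACE}" --timeout=300s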

Install sensor into the same data plane cluster where central is installed

  1. Fetch sensor configuration
export ROX_ADMIN_PASSWORD=$(oc get secrets -n "${CENTRAL_NAMESPACE}" central-htpasswd -o yaml | yq .data.password | base64 --decode)
roxctl sensor generate openshift --openshift-version=4 --endpoint "https://central-${CENTRAL_NAMESPACE}.apps.${OSD_CLUSTER_NAME}.${OSD_CLUSTER_DOMAIN}:443" --insecure-skip-tls-verify -p "${ROX_ADMIN_PASSWORD}" --admission-controller-listen-on-events=false --disable-audit-logs=true --central="https://central-${CENTRAL_NAMESPACE}.apps.${OSD_CLUSTER_NAME}.${OSD_CLUSTER_DOMAIN}:443" --collection-method=none --name osd-cluster-sensor
  2. Install sensor

This step requires your quay.io username and password. Have them ready.

./sensor-osd-cluster-sensor/sensor.sh
  3. Check that sensor is up and running

The sensor uses the stackrox namespace by default.

oc get pods -n stackrox

Run local front-end (UI project)

The front-end is located in the following repo: https://github.com/RedHatInsights/acs-ui. Clone that repo locally.

  1. Prepare the /etc/hosts file. Add the development host to the hosts file. The grep command ensures that the entry is added only once.
sudo sh -c 'grep -qxF "127.0.0.1 stage.foo.redhat.com" /etc/hosts || echo "127.0.0.1 stage.foo.redhat.com" >> /etc/hosts'

Note: If you are unsure what the command will do, feel free to manually add the entry 127.0.0.1 stage.foo.redhat.com to the /etc/hosts file.

  2. Install the UI project

Execute the following commands in the root directory of the UI project:

npm install
  3. Start the UI project

Execute the following commands in the root directory of the UI project:

export FLEET_MANAGER_API_ENDPOINT=http://localhost:8000

npm run start:beta

After that, you can open the following URL in your browser: https://stage.foo.redhat.com:1337/beta/application-services/acs

Note: Since the staging external Red Hat SSO is used for authentication, you may have to create a personal account.

Extend development OSD cluster lifetime to 7 days

By default, a staging cluster stays up for 2 days. You can extend its lifetime to 7 days. To do that, execute the following command on macOS:

echo "{\"expiration_timestamp\":\"$(date -v+7d -u +'%Y-%m-%dT%H:%M:%SZ')\"}" | ocm patch "/api/clusters_mgmt/v1/clusters/${CLUSTER_ID}"

Or on Linux:

echo "{\"expiration_timestamp\":\"$(date --iso-8601=seconds -d '+7 days')\"}" | ocm patch "/api/clusters_mgmt/v1/clusters/${CLUSTER_ID}"

Re-deploy new Fleetshard synchronizer

To deploy a new build of the Fleetshard synchronizer, simply rebuild and push the image; after that, a rollout restart of the deployment is sufficient.

GOARCH=amd64 GOOS=linux CGO_ENABLED=0 make image/build/push/internal
oc rollout restart -n "${NAMESPACE}" deployment fleetshard-sync
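
Optionally, wait for the restart to complete before testing the new build.

# Wait for the rollout of the restarted deployment to finish
oc rollout status -n "${NAMESPACE}" deployment fleetshard-sync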

Re-start the local Fleet Manager

make binary
./fleet-manager serve --dataplane-cluster-config-file "./${CLUSTER_ID}.yaml"