cf-deployment Infrastructure Help

Jochen Ehret edited this page Feb 6, 2023 · 5 revisions

This page documents how the current infrastructure is set up. In some cases, we now consume environments from a pool managed by the Toolsmiths team. Check the inputs of each environment's deploy job to see whether it consumes a Toolsmiths pool.

NOTE: this may turn stale very quickly.

  1. You can find the pipelines that set up infrastructure at runtime-ci.
  2. The environments live at relint-envs.

Infrastructure Pipelines

Intended Use of cf-d environments

(using destroying and recreating the stable CATS environment as an example)

  1. Use the pipeline job to destroy the environment. Specifically, from the stable tab of the infrastructure pipeline, run the destroy-infrastructure-stable job. If it succeeds, it automatically triggers remove-claimed-lock-stable.
  2. bbl up the environment. Specifically, run add-claimed-lock-stable, which auto-triggers the setup-infrastructure-stable job on success.
  3. Update DNS for the new DNS zone that is included in the bbl output. As an example, the steps for Bellatrix are below. Hermione is on AWS, so for that env one way to find the set of 4 NS records is to search the bbl-state for "awsdns".
    • select the Bellatrix project in GCP
    • select bellatrix-stable-zone
    • make note of the data for the NS record (e.g. ns-cloud-a1.googledomains.com. ns-cloud-a2.googledomains.com. ns-cloud-a3.googledomains.com. ns-cloud-a4.googledomains.com.)
    • head to the AWS shared DNS account (in LastPass, username shared-dns-account)
    • select the Bellatrix record, click Edit, and paste in the NS data (usually this just changes the letter to one of [abcd])
  4. [Only necessary for stable, because it doesn't deploy itself as part of the standard process.] Redeploy using the pipeline job in cf-deployment's Stable group: stable-acquire-pool, which in turn triggers stable-deploy.
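
The job triggers and the NS lookup above can also be sketched from the command line. This is a minimal sketch, not the team's actual tooling: the fly target name (relint) and the pipeline name (infrastructure) are assumptions, the GCP project ID is left as an argument, and add-claimed-lock-stable should presumably only be triggered once remove-claimed-lock-stable has finished.

```shell
#!/usr/bin/env bash
# Sketch only. FLY_TARGET is a hypothetical fly target name and PIPELINE is
# an assumed pipeline name; adjust both to your setup.
FLY_TARGET="${FLY_TARGET:-relint}"
PIPELINE="${PIPELINE:-infrastructure}"

# Trigger one infrastructure job and stream its output until it finishes.
trigger_infra_job() {
  local job="$1"
  fly -t "$FLY_TARGET" trigger-job -j "${PIPELINE}/${job}" --watch
}

# GCP envs: list the NS record sets of a Cloud DNS zone.
# Both the zone and the project ID are passed in by the caller.
gcp_ns_records() {
  local zone="$1" project="$2"
  gcloud dns record-sets list --zone="$zone" --project="$project" --filter='type=NS'
}

# AWS envs: print the unique "awsdns" name servers found in a bbl-state file.
aws_ns_records() {
  local state_file="$1"
  grep -oE 'ns-[0-9]+\.awsdns-[0-9]+\.[a-z.]+' "$state_file" | sed 's/\.$//' | sort -u
}

# Example usage (not run here):
#   trigger_infra_job destroy-infrastructure-stable
#   trigger_infra_job add-claimed-lock-stable
#   gcp_ns_records bellatrix-stable-zone <GCP_PROJECT_ID>
#   aws_ns_records bbl-state.json
```

Keeping each step as a small function makes it possible to run them one at a time and check the result in the Concourse UI before moving on.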

Tips

  1. The variables in our relint pipelines are set in Concourse's CredHub.
    1. Go to relint-envs and target the env by going to environment/ci/concourse.
    2. credhub get -n /concourse/main/relint_ci_pools_readwrite_deploy_key
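
As a rough sketch of the tip above, the two helpers below wrap the credhub CLI to list and read pipeline variables. They assume the CLI is already targeted and logged in (for example via the direnv setup in the relint-envs Concourse directory); the helper names and the CONCOURSE_PREFIX variable are made up for illustration.

```shell
#!/usr/bin/env bash
# Assumes `credhub` is already targeted and authenticated.
# CONCOURSE_PREFIX is a hypothetical convenience variable.
CONCOURSE_PREFIX="${CONCOURSE_PREFIX:-/concourse/main}"

# List the credentials stored under the Concourse team path.
list_pipeline_vars() {
  credhub find -p "$CONCOURSE_PREFIX"
}

# Print one credential, addressed by its pipeline variable name.
get_pipeline_var() {
  local var_name="$1"
  credhub get -n "${CONCOURSE_PREFIX}/${var_name}"
}

# Example usage (not run here):
#   list_pipeline_vars
#   get_pipeline_var relint_ci_pools_readwrite_deploy_key
```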

Certificates and Key variables

This section covers setting up BOSH certificates to be used in the CF Deployment Concourse Task in our infrastructure pipeline.

Concourse finds and interpolates the variables for the certificates and GitHub keys through its CredHub (e.g. trelawney_cf_lb_cert.certificate). To set or read them:

  1. Go to relint-envs and target the env by going to environment/ci/concourse.
  2. direnv will export the necessary configuration so that you are targeting the BOSH director and CredHub.
  3. The CredHub certificates for the environment can be set using the credhub CLI.
    • To generate a self-signed certificate, run the following, replacing <VARIABLE_NAME> with the variable name in Concourse (e.g. trelawney_cf_lb_cert) and <CF_DOMAIN> with the domain of the Cloud Foundry (e.g. maxime.cf-app.com):
    credhub generate -n /concourse/main/<VARIABLE_NAME> \
    -t certificate -c <CF_DOMAIN> \
    -o 'Cloud Foundry' -u 'R&D' -i 'San Francisco' -s 'California' -y US \
    --self-sign
    
    • This creates an entry you can use in the Concourse pipeline for both the certificate and the private key, i.e. ((<VARIABLE_NAME>.certificate)) and ((<VARIABLE_NAME>.private_key)).
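
Before regenerating a certificate, it can help to check whether the existing one is close to expiry. The helper below is a minimal sketch using openssl's -checkend option; the function name is hypothetical, and the usage line assumes your credhub CLI version supports `-k` to print a single field of a credential.

```shell
#!/usr/bin/env bash
# Succeeds (exit 0) if the PEM certificate on stdin expires within the given
# number of days. A certificate that openssl cannot read is also reported as
# expiring, because openssl exits non-zero in that case too.
cert_expires_within() {
  local days="$1"
  ! openssl x509 -noout -checkend "$((days * 86400))" >/dev/null
}

# Example usage (not run here; trelawney_cf_lb_cert is the example name from above):
#   if credhub get -n /concourse/main/trelawney_cf_lb_cert -k certificate \
#        | cert_expires_within 30; then
#     echo "certificate expires within 30 days, consider regenerating it"
#   fi
```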

Example: Maxime with a standard BOSH director in one AZ

In the relint-env Maxime, we provide a deploy script to handle this case: https://github.com/cloudfoundry/relint-envs/blob/master/environments/dev/maxime/deploy .

Before running this script, you'll need to ensure your load balancers are created and certificates are provided. For this, we provide an infrastructure job at: https://concourse.wg-ard.ci.cloudfoundry.org/teams/main/pipelines/infrastructure?group=dev .

So the full workflow looks like:

  1. Create the expected load balancer certs in CredHub if they do not already exist (or are expired), following the Certificates and Key variables section.
  2. Run the setup-infrastructure-maxime job.
    • For the snitch (lite) env, the IP address in the DNS entry should be that of the director VM (which you can find in the bbl-state file or from the VM info in the Snitch GCP project).
  3. Update the environment DNS record in the shared-dns-account on AWS (TODO: move to relint.rocks subdomain).
  4. Check out the desired version of cf-deployment in your workspace directory (TODO: dedupe the gopath one).
  5. Upload your stemcell(s) of choice for the upcoming deploy.
  6. Modify the deploy script as desired (the provided script does not have anything special included).
  7. Run the script.
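
The last steps of this workflow (checking out cf-deployment, uploading stemcells, running the deploy script) could be scripted roughly as follows. This is a sketch under several assumptions: both repositories are cloned under a WORKSPACE directory (default ~/workspace), the BOSH CLI is already targeted at the Maxime director, and the deploy script can be invoked by its full path. The run and deploy_maxime helpers are hypothetical names.

```shell
#!/usr/bin/env bash
# Assumed workstation layout: cf-deployment and relint-envs cloned here.
WORKSPACE="${WORKSPACE:-$HOME/workspace}"

# Run a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Usage: deploy_maxime <cf-deployment git ref> <stemcell URL or path>...
deploy_maxime() {
  local cf_d_ref="$1"
  shift
  # Check out the desired version of cf-deployment.
  run git -C "$WORKSPACE/cf-deployment" checkout "$cf_d_ref"
  # Upload each stemcell given on the command line.
  local stemcell
  for stemcell in "$@"; do
    run bosh upload-stemcell "$stemcell"
  done
  # Run the deploy script provided in relint-envs.
  run "$WORKSPACE/relint-envs/environments/dev/maxime/deploy"
}

# Example usage (not run here): preview the commands without executing them.
#   DRY_RUN=1 deploy_maxime <CF_DEPLOYMENT_VERSION> <STEMCELL_URL>
```

The dry-run switch is there so the sequence can be reviewed before anything touches the director.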

External Resources

Resources not deployed by BOSH are generally codified as Terraform templates. For example, the external database configuration for the upgrade/trelawney environment lives in the bbl-config/terraform directory, so all infrastructure provisioning happens in the bbl up step of the infrastructure pipeline jobs that manage that environment.

In general, we try to wire these resources into the rest of our CI pipelines to avoid having hard-coded references in unrelated places that have to be kept in sync with the actual infrastructure. For example, the external DB instance name is an output from the terraform templates that create it, which allows it to be consumed via a file when creating or deleting the databases used by the upgrade test.
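
One way to picture the "consumed via a file" pattern is the sketch below, which pulls a single value out of "key: value" output and writes it to a file. It assumes the Terraform outputs are available as flat YAML (as printed by `bbl outputs`); the output key external_db_instance_name, the file name, and the helper name are all hypothetical, and the real pipeline may do this differently.

```shell
#!/usr/bin/env bash
# Read "key: value" lines on stdin and print the value of one top-level key.
# Only handles flat, unquoted values; nested YAML is out of scope here.
terraform_output() {
  local key="$1"
  awk -v k="$key" -F': ' '$1 == k { print $2; exit }'
}

# Example usage (not run here; the key and file names are hypothetical):
#   bbl --state-dir "$BBL_STATE_DIR" outputs \
#     | terraform_output external_db_instance_name > db-instance-name.txt
```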