From 5b33746da8e0c66149ed44316db350b849c08410 Mon Sep 17 00:00:00 2001 From: Henri Rosten Date: Mon, 4 Mar 2024 16:25:50 +0200 Subject: [PATCH] README: Document changing deploy location Signed-off-by: Henri Rosten --- terraform/README.md | 48 ++++++++++++++++++++++++++++++++++++--------- 1 file changed, 39 insertions(+), 9 deletions(-) diff --git a/terraform/README.md b/terraform/README.md index 70e5660f..4991d34d 100644 --- a/terraform/README.md +++ b/terraform/README.md @@ -11,12 +11,12 @@ For architectural description, see [README-azure.md](./README-azure.md) original > The setup uses Nix to build disk images, uploads them to Azure, and then boots > virtual machines off of them. -> +> > Images are considered "appliance images", meant the Nix code describing their > configuration describes the exact same purpose of the machine (no two-staged > deployment process, the machine does the thing it's supposed to do after > bootup), allowing to remove the need for e.g. ssh access as much as possible. -> +> > Machines are considered ephemeral, every change in the appliance image / nixos > configuration causes a new image to be built, and a new VM to be booted with > that new image. @@ -68,7 +68,7 @@ terraform ``` - The `terraform` directory contains the root terraform deployment files with the VM configurations `binary-cache.tf`, `builder.tf`, and `jenkins-controller.tf` matching the components described in [README-azure.md](./README-azure.md) in its [components section](./README-azure.md#components). - The `terraform/azarm` directory contains the terraform configuration for Azure `aarch64` builder which is used from ghaf github-actions [build.yml workflow](https://github.com/tiiuae/ghaf/blob/e81ccfb41d75eda0488b6b4325aeccb8385ce960/.github/workflows/build.yml#L151) to build `aarch64` targets for authorized PRs pre-merge. `azarm` is disconnected from the root terraform module: it's a separate configuration with its own state. 
-- The `terraform/persistent` directory contains the terraform configuration for parts of the infrastructure that are considered persitent - resources defined under `terraform/persistent` will not be removed even if the ghaf-infra instance is otherwise removed. An example of such persistent ghaf-infra resource is the binary cache storage as well as the binary cache signing key. There may be many 'persistent' infrastructure instances - currently `dev` and `prod` deployments have their own instances of the persistent resources. Section [Multiple Environments with Terraform Workspaces](./README.md#multiple-environments-with-terraform-workspaces) discusses this topic with more details. +- The `terraform/persistent` directory contains the terraform configuration for parts of the infrastructure that are considered persistent - resources defined under `terraform/persistent` will not be removed even if the ghaf-infra instance is otherwise removed. An example of such persistent ghaf-infra resource is the binary cache storage as well as the binary cache signing key. There may be many 'persistent' infrastructure instances - currently `dev` and `prod` deployments have their own instances of the persistent resources. Section [Multiple Environments with Terraform Workspaces](./README.md#multiple-environments-with-terraform-workspaces) discusses this topic with more details. - The `terraform/playground` directory contains tooling to facilitate the usage of terraform workspaces in setting-up distinct copies of the ghaf-infra infrastructure, i.e. 'playground' `dev` environments. It also includes an [example test infrastructure](./playground/test-infra.tf) that allows deploying example infrastructure including just one nix VM, highlighting the use of `terraform/modules` to build and upload the nix image on Azure. - The `terraform/state-storage` directory contains the terraform configuration for the ghaf-infra remote backend state storage using Azure storage blob. 
See section [Initializing Azure State and Persistent Data](./README.md#initializing-azure-state-and-persistent-data) for more details. - The `terraform/modules` directory contains terraform modules used from the ghaf-infra VM configurations to build, upload, and spin up Azure nix images. @@ -80,7 +80,7 @@ To initialize the backend storage, use the `terraform-init-sh`: ```bash # Inside the terraform directory -$ ./terraform-init.sh +$ ./terraform-init.sh [+] Initializing state storage [+] Initializing persistent data ... @@ -98,10 +98,10 @@ The main reasons for using terraform workspaces include: - Parts of the ghaf-infra infrastructure are persistent and shared between different environments. As an example, private `dev` environments share the binary cache storage. This arrangement makes it possible to treat, for instance, `dev` and private ghaf-infra instances dispensable: ghaf-infra instances can be temporary and short-lived as it's easy to spin-up new environments without losing any valuable data. The persistent data is configured outside the root ghaf-infra terraform deployment in the `terraform/persistent` directory. There may be many 'persistent' infrastructure instances - currently `dev` and `prod` deployments have their own instances of the persistent resources. This means that `dev` and `prod` instances of ghaf-infra do **not** share any persistent data. As an example, `dev` and `prod` deployments of ghaf-infra have a separate binary cache storage. The binding to persistent resources from ghaf-infra is done in the [`main.tf`](./main.tf#L166) based on the terraform workspace name and resource location. Persistent data initialization is automatically done with `terraform-init.sh` script. 
- Currently, the following resources are defined 'persistent', meaning `dev` and `prod` instances do not share the following resources: - Binary cache storage: [`binary-cache-storage.tf`](./persistent/binary-cache-storage/binary-cache-storage.tf) - - Binray cache signing key: [`binary-cache-sigkey.ft`](./persistent/binary-cache-sigkey/binary-cache-sigkey.tf) + - Binary cache signing key: [`binary-cache-sigkey.tf`](./persistent/binary-cache-sigkey/binary-cache-sigkey.tf) - Builder ssh key: [`builder-ssh-key.tf`](./persistent/builder-ssh-key/builder-ssh-key.tf) -To help facilitate the usage of terraform workspaces in setting-up distinct copies of ghaf-infra, one can [use terraform workspaces from the command line](https://developer.hashicorp.com/terraform/cli/workspaces#managing-cli-workspaces) or consider using the helper script provided at [`playground/terraform-playground.sh`](./playground/terraform-playground.sh). Below, for the sake of example, we use the [`playground/terraform-playground.sh`](./playground/terraform-playground.sh) to setup a private devlopment instance of ghaf-infra: +To help facilitate the usage of terraform workspaces in setting-up distinct copies of ghaf-infra, one can [use terraform workspaces from the command line](https://developer.hashicorp.com/terraform/cli/workspaces#managing-cli-workspaces) or consider using the helper script provided at [`playground/terraform-playground.sh`](./playground/terraform-playground.sh). Below, for the sake of example, we use the [`playground/terraform-playground.sh`](./playground/terraform-playground.sh) to setup a private development instance of ghaf-infra: ```bash # Activate private development environment @@ -159,6 +159,36 @@ $ ./playground/terraform-playground.sh destroy ``` The above command removes all the resources that were created for the private development environment. +## Changing Azure Deploy Location +By default, ghaf-infra is deployed to Azure location `northeurope` (North Europe). 
+However, ghaf-infra resources can be deployed to other Azure locations too, with the following caveats: +- Ghaf-infra has been tested in a limited set of locations. `terraform-init.sh` exits with an error if you try to initialize ghaf-infra in an unsupported (untested) location. When deploying to a new, previously unsupported location, you need to modify [`terraform-init.sh`](./terraform-init.sh#L89). +- For a full list of available Azure location names, run `az account list-locations -o table` in the ghaf-infra devshell. +- Not all Azure VM sizes or other resources are available in all Azure locations. You can check the availability of specific resources on the Azure products-by-region page, e.g.: https://azure.microsoft.com/en-us/explore/global-infrastructure/products-by-region/?regions=europe-north&products=virtual-machines. Alternatively, you can list the VM sizes per location with the `az vm list-sizes` command from the ghaf-infra devshell, for instance: `az vm list-sizes --location 'northeurope' -o table`. +- Your Azure subscription quota limits affect the ability to deploy ghaf-infra, so you might need to increase the vCPU quotas for your subscription via the Azure web portal. See more information at https://learn.microsoft.com/en-us/azure/quotas/quotas-overview. You can check your quota usage from the Azure web portal or using `az vm list-usage`, for instance: `az vm list-usage --location "northeurope" -o table`. +- Currently, the terraform state is always deployed to the default Azure location, even if the other resources of your ghaf-infra instance (i.e. [persistent data](./persistent/main.tf), [root module resources](main.tf)) are deployed to a different location. 
+ +The following shows an example of deploying ghaf-infra to Azure location UAE North: + +```bash +# Initialize terraform state and persistent data, using UAE North as an example location: +$ ./terraform-init.sh -l uaenorth + +# Switch to (and optionally create) a workspace 'devuaen' +$ terraform workspace new devuaen || terraform workspace select devuaen + +# Optionally, run Terraform plan: +$ terraform plan -var="location=uaenorth" + +# Deploy with Terraform apply, specifying the location with a variable: +$ terraform apply -var="location=uaenorth" -auto-approve +``` +When destroying a ghaf-infra instance deployed to a non-default location, you need to specify the location with a variable: +```bash +# Destroy: +$ terraform apply -destroy -var="location=uaenorth" -auto-approve +``` + ## Common Terraform Errors Below are some common Terraform errors with tips on how to resolve each. @@ -186,12 +216,12 @@ $ terraform apply │ Error: creating/updating Image (Subscription: "" │ Resource Group Name: "ghaf-infra-dev" │ Image Name: ""): performing CreateOrUpdate: unexpected status 400 with error: InvalidParameter: The source blob https://.blob.core.windows.net/ghaf-infra-vm-images/.vhd is not accessible. -│ +│ │ with module.builder_image.azurerm_image.default, │ on modules/azurerm-nix-vm-image/main.tf line 22, in resource "azurerm_image" "default": │ 22: resource "azurerm_image" "default" { ``` -Try running `terraform apply` again if you get an error similar to one shown above. +Try running `terraform apply` again if you get an error similar to the one shown above. It's unclear why this error occasionally occurs; the issue should be analyzed in detail. 
#### Error: Disk @@ -201,7 +231,7 @@ $ terraform apply │ Error: Disk (Subscription: "" │ Resource Group Name: "ghaf-infra-persistent-eun" │ Disk Name: "binary-cache-vm-caddy-state-dev") was not found -│ +│ │ with data.azurerm_managed_disk.binary_cache_caddy_state, │ on main.tf line 207, in data "azurerm_managed_disk" "binary_cache_caddy_state": │ 207: data "azurerm_managed_disk" "binary_cache_caddy_state" {