diff --git a/terraform/README.md b/terraform/README.md index 46d25977..f530c6fc 100644 --- a/terraform/README.md +++ b/terraform/README.md @@ -28,20 +28,20 @@ Experimental feature "nix-command" must be enabled. Clone this repository: ```bash -$ git clone https://github.com/tiiuae/ghaf-infra.git -$ cd ghaf-infra +❯ git clone https://github.com/tiiuae/ghaf-infra.git +❯ cd ghaf-infra ``` Bootstrap nix-shell with the required dependencies: ```bash # Start a nix-shell with required dependencies: -$ nix-shell +❯ nix-shell # Authenticate with az login: -$ az login +❯ az login # Terraform commands are executed under the terraform directory: -$ cd terraform/ +❯ cd terraform/ ``` All commands in this document are executed from nix-shell inside the `terraform` directory. @@ -53,10 +53,12 @@ terraform │   ├── binary-cache-sigkey │   ├── binary-cache-storage │   ├── builder-ssh-key +│   ├── resources │   └── workspace-specific ├── state-storage │   └── tfstate-storage.tf ├── modules +│   ├── arm-builder-vm │   ├── azurerm-linux-vm │   └── azurerm-nix-vm-image ├── binary-cache.tf @@ -65,18 +67,15 @@ terraform └── main.tf ``` - The `terraform` directory contains the root terraform deployment files with the VM configurations `binary-cache.tf`, `builder.tf`, and `jenkins-controller.tf` matching the components described in [README-azure.md](./README-azure.md) in its [components section](./README-azure.md#components). -- The `terraform/persistent` directory contains the terraform configuration for parts of the infrastructure that are considered persistent - resources defined under `terraform/persistent` will not be removed even if the ghaf-infra instance is otherwise removed. An example of such persistent ghaf-infra resource is the binary cache storage as well as the binary cache signing key. There may be many 'persistent' infrastructure instances - currently `dev` and `prod` deployments have their own instances of the persistent resources. 
Section [Multiple Environments with Terraform Workspaces](./README.md#multiple-environments-with-terraform-workspaces) discusses this topic with more details. +- The `terraform/persistent` directory contains the terraform configuration for parts of the infrastructure that are considered persistent - resources defined under `terraform/persistent` will not be removed even if the ghaf-infra instance is otherwise removed. Examples of such persistent ghaf-infra resources are the binary cache storage and the binary cache signing key. There may be many 'persistent' infrastructure instances - currently `priv`, `dev/prod`, and `release` deployments have their own instances of the persistent resources. Section [Multiple Environments with Terraform Workspaces](./README.md#multiple-environments-with-terraform-workspaces) discusses this topic in more detail. - The `terraform/state-storage` directory contains the terraform configuration for the ghaf-infra remote backend state storage using Azure storage blob. See section [Initializing Azure State and Persistent Data](./README.md#initializing-azure-state-and-persistent-data) for more details. - The `terraform/modules` directory contains terraform modules used from the ghaf-infra VM configurations to build, upload, and spin up Azure nix images. -## Initializing Azure State and Persistent Data -This project stores the terraform state in a remote storage in an azure storage blob as configured in [tfstate-storage.tf](./state-storage/tfstate-storage.tf). The benefits of using such remote storage setup are well outlined in [storing state in azure storage](https://learn.microsoft.com/en-us/azure/developer/terraform/store-state-in-azure-storage) and [terraform backend configuration](https://developer.hashicorp.com/terraform/language/settings/backends/configuration). 
- -To initialize the backend storage, use the `terraform-init-sh`: - +## Initializing Ghaf-Infra Environment +To initialize the terraform state storage and the persistent data for your target workspace, run `terraform-init.sh`: ```bash # Inside the terraform directory -$ ./terraform-init.sh +# Replace 'workspacename' with the name of the workspace you are going to work with +❯ ./terraform-init.sh -w workspacename [+] Initializing state storage [+] Initializing persistent data ... @@ -84,36 +83,28 @@ $ ./terraform-init.sh ``` `terraform-init.sh` will not do anything if the initialization has already been done. In other words, it's safe to run the script many times; it will not destroy or re-initialize anything if the init was already executed. -In addition to the shared terraform state, some of the infrastructure resources are also shared between the ghaf-infra instances. `terraform-init.sh` initializes the persistent configuration defined under `terraform/persistent`. There may be many 'persistent' infrastructure instances: currently `dev` and `prod` deployments have their own instances of the persistent resources. Section [Multiple Environments with Terraform Workspaces](./README.md#multiple-environments-with-terraform-workspaces) discusses this topic with more details. - ## Multiple Environments with Terraform Workspaces To support infrastructure development in isolated environments, this project uses [terraform workspaces](https://developer.hashicorp.com/terraform/cli/workspaces). The main reasons for using terraform workspaces include: -- Different workspaces allow deploying different instances of ghaf-infra. Each instance has a completely separate state data, making it possible to deploy `dev`, `prod`, or even private development instances of ghaf-infra. This makes it possible to first develop and test infrastructure changes in a private development environment, before proposing changes to shared (e.g. `dev` or `prod`) environments. The configuration codebase is the same between all the environments, with the differentiation options defined in the [`main.tf`](./main.tf#L69). 
-- Parts of the ghaf-infra infrastructure are persistent and shared between different environments. As an example, private `dev` environments share the binary cache storage. This arrangement makes it possible to treat, for instance, `dev` and private ghaf-infra instances dispensable: ghaf-infra instances can be temporary and short-lived as it's easy to spin-up new environments without losing any valuable data. The persistent data is configured outside the root ghaf-infra terraform deployment in the `terraform/persistent` directory. There may be many 'persistent' infrastructure instances - currently `dev` and `prod` deployments have their own instances of the persistent resources. This means that `dev` and `prod` instances of ghaf-infra do **not** share any persistent data. As an example, `dev` and `prod` deployments of ghaf-infra have a separate binary cache storage. The binding to persistent resources from ghaf-infra is done in the [`main.tf`](./main.tf#L166) based on the terraform workspace name and resource location. Persistent data initialization is automatically done with `terraform-init.sh` script. -- Currently, the following resources are defined 'persistent', meaning `dev` and `prod` instances do not share the following resources: - - Binary cache storage: [`binary-cache-storage.tf`](./persistent/binary-cache-storage/binary-cache-storage.tf) - - Binary cache signing key: [`binary-cache-sigkey.ft`](./persistent/binary-cache-sigkey/binary-cache-sigkey.tf) - - Builder ssh key: [`builder-ssh-key.tf`](./persistent/builder-ssh-key/builder-ssh-key.tf) +- Different workspaces allow deploying different instances of ghaf-infra. Each instance has completely separate state data, making it possible to deploy `dev`, `prod`, `release`, or even private development instances of ghaf-infra. This makes it possible to first develop and test infrastructure changes in a private development environment, before proposing changes to shared (e.g. `dev` or `prod`) environments. 
The configuration codebase is the same between all the environments, with the differentiation options defined in the [`main.tf`](./main.tf#L105). +- Parts of the ghaf-infra infrastructure are persistent and shared between different environments. As an example, private environments share the binary cache storage. This arrangement makes it possible to treat, for instance, private ghaf-infra instances as dispensable: ghaf-infra instances can be temporary and short-lived as it's easy to spin up new environments without losing any valuable data. The persistent data is configured outside the root ghaf-infra terraform deployment in the `terraform/persistent` directory. There may be many 'persistent' infrastructure instances - currently `priv`, `dev/prod`, and `release` deployments have their own instances of the persistent resources. This means that `priv`, `dev/prod`, and `release` instances of ghaf-infra do **not** share any persistent data. As an example, `priv` and `prod` deployments of ghaf-infra use separate binary cache storage. The binding to persistent resources from ghaf-infra is done in the [`main.tf`](./main.tf) based on the terraform workspace name and resource location. Persistent data initialization is automatically done with the `terraform-init.sh` script. -To help facilitate the usage of terraform workspaces in setting-up distinct copies of ghaf-infra, one can [use terraform workspaces from the command line](https://developer.hashicorp.com/terraform/cli/workspaces#managing-cli-workspaces) or consider using the helper script provided at [`terraform-playground.sh`](./terraform-playground.sh). 
Below, for the sake of example, we use the [`terraform-playground.sh`](./terraform-playground.sh) to setup a private deployment instance of ghaf-infra: +To help facilitate the usage of terraform workspaces in setting up distinct copies of ghaf-infra, one can [use terraform workspaces from the command line](https://developer.hashicorp.com/terraform/cli/workspaces#managing-cli-workspaces). Below, for the sake of example, we set up a private deployment instance of ghaf-infra: ```bash -# Activate private development environment -$ ./terraform-playground.sh activate -# ... -[+] Done, use terraform [validate|plan|apply] to work with your dev infra -``` -Which sets-up a terraform workspace for your private development environment: -```bash -# List the current terraform worskapce -$ terraform workspace list -Terraform workspaces: +# Activate private development environment 'henri' +❯ ./terraform-init.sh -w henri +[+] Using state 'ghaf-infra-0-state-eun' +[+] Using persistent 'ghaf-infra-0-persistent-eun' +[+] Initializing workspace-specific persistent +[+] Initializing workspace +[+] Listing workspaces: default - dev -* henrirosten # <-- indicates active workspace + dev0 +* henri # <-- indicates active workspace prod + release ``` ## Terraform workflow @@ -125,65 +116,46 @@ Once you are ready to deploy your terraform or nix configuration changes, the f # Inside the terraform directory # Format the terraform code files: -$ terraform fmt -recursive +❯ terraform fmt -recursive # Validate the terraform changes: -$ terraform validate +❯ terraform validate # Make sure you deploy to the correct ghaf-infra instance. 
# Use terraform workspace select to switch workspaces -$ terraform workspace list +❯ terraform workspace list default - dev -* henrirosten # <== This example deploys to private dev environment + dev0 +* henri # <-- This example deploys to private dev environment prod + release # Show what actions terraform would take on apply: -$ terraform plan +❯ terraform plan # Apply your configuration changes: -$ terraform apply +❯ terraform apply ``` Once `terraform apply` completes, the private development infrastructure is deployed. You can now play around in your isolated copy of the infrastructure, testing and updating the changes, making sure the changes work as expected before merging the changes. ## Destroying Test Environment -Once the configuration changes have been tested, the private development environment can be destroyed: -```bash -# Destroy the private terraform worskapce using helper script -$ ./terraform-playground.sh destroy - -# Alternatively, you can use terraform command directly -$ terraform workspace select -$ terraform apply -destroy -``` -The above command(s) remove all the resources that were created for the given environment. - -## Changing Azure Deploy Location -By default, ghaf-infra is deployed to Azure location `northeurope` (North Europe). -However, ghaf-infra resources can be deployed to other Azure locations too, with the following caveats: -- Ghaf-infra has been tested in a limited set of locations. `terraform-init.sh` exits with an error if you try to initialize ghaf-infra in a non-supported (non-tested) location. When deploying to a new, previously unsupported location, you need to modify the [`terraform-init.sh`](./terraform-init.sh#L89). -- For a full list of available Azure location names, run `az account list-locations -o table` in ghaf-infra devshell. -- Not all Azure VM sizes or other resources are available in all Azure locations. 
You can search the availability of specific resources through the Azure region product page e.g.: https://azure.microsoft.com/en-us/explore/global-infrastructure/products-by-region/?regions=europe-north&products=virtual-machines. Alternatively, you can list the VM sizes per location with `az vm list-sizes` command from the ghaf-infra devshell, for instance: `az vm list-sizes --location 'northeurope' -o table`. -- Your Azure subscription quota limits impact the ability to deploy ghaf-infra, as such, you might need to increase the vCPU quotas for your subscription via the Azure web portal. See more information at https://learn.microsoft.com/en-us/azure/quotas/quotas-overview. You can check your quota usage from the Azure web portal or using `az vm list-usage`, for instance: `az vm list-usage --location "northeurope" -o table`. -Following shows an example of deploying ghaf-infra to Azure location SWE Central: - +Once you no longer need your playground environment, the private development environment can be destroyed: ```bash -# Initialize terraform state and persistent data, using SWE Central as an example location: -$ ./terraform-init.sh -l swedencentral - -# Switch to (and optionally create) a workspace 'devswec' -$ terraform workspace new devswec || terraform workspace select devswec +# Inside the terraform directory -# Optionally, run Terraform plan: -# (Variable 'envtype' overrides the default environment type) -$ terraform plan -var="envtype=dev" +❯ terraform workspace list + default + dev0 +* henri + prod + release -# Deploy with Terraform apply: -$ terraform apply -var="envtype=dev" -auto-approve +❯ terraform workspace select henri +❯ terraform apply -destroy ``` +The above command(s) remove all the resources that were created for the given environment. ## Common Terraform Errors @@ -191,7 +163,7 @@ Below are some common Terraform errors with tips on how to resolve each. #### Error: A resource with the ID already exists ```bash -$ terraform apply +❯ terraform apply ... 
azurerm_virtual_machine_extension.deploy_ubuntu_builder: Creating... ╷ @@ -200,37 +172,33 @@ azurerm_virtual_machine_extension.deploy_ubuntu_builder: Creating... Example fix: ```bash -$ terraform import azurerm_virtual_machine_extension.deploy_ubuntu_builder /subscriptions//resourceGroups/rg-name-here/providers/Microsoft.Compute/virtualMachines/testvm/extensions/testvm-vmext +❯ terraform import azurerm_virtual_machine_extension.deploy_ubuntu_builder /subscriptions//resourceGroups/rg-name-here/providers/Microsoft.Compute/virtualMachines/testvm/extensions/testvm-vmext # Ref: https://stackoverflow.com/questions/61418168/terraform-resource-with-the-id-already-exists ``` -#### Error: creating/updating Image +#### Error: Backend configuration changed ```bash -$ terraform apply -... -│ Error: creating/updating Image (Subscription: "" -│ Resource Group Name: "ghaf-infra-dev" -│ Image Name: ""): performing CreateOrUpdate: unexpected status 400 with error: InvalidParameter: The source blob https://.blob.core.windows.net/ghaf-infra-vm-images/.vhd is not accessible. +❯ ./terraform-init.sh -w workspacename +[+] Using state 'ghaf-infra-state-0-eun' +[+] Using persistent 'ghaf-infra-persistent-0-eun' +[+] Initializing workspace-specific persistent +╷ +│ Error: Backend configuration changed +│ +│ A change in the backend configuration has been detected, which may require migrating existing state. │ -│ with module.builder_image.azurerm_image.default, -│ on modules/azurerm-nix-vm-image/main.tf line 22, in resource "azurerm_image" "default": -│ 22: resource "azurerm_image" "default" { +│ If you wish to attempt automatic migration of the state, use "terraform init -migrate-state". +│ If you wish to store the current configuration with no changes to the state, use "terraform init -reconfigure". ``` -Try running `terraform apply` again if you get an error similar to one shown above. -It's unclear why this error occasionally occurs, this issue should be analyzed in detail. 
-#### Error: Disk +Above error (or similar) is likely caused by changed terraform backend state. +Fix the local state reference by removing local state files and re-running `terraform-init.sh`: + ```bash -$ terraform apply -... -│ Error: Disk (Subscription: "" -│ Resource Group Name: "ghaf-infra-persistent-eun" -│ Disk Name: "binary-cache-vm-caddy-state-dev") was not found -│ -│ with data.azurerm_managed_disk.binary_cache_caddy_state, -│ on main.tf line 207, in data "azurerm_managed_disk" "binary_cache_caddy_state": -│ 207: data "azurerm_managed_disk" "binary_cache_caddy_state" { +# Make sure you don't have any untracked files you want to keep in your working tree +# before running the below command +❯ git clean -ffdx +# Replace 'workspacename' with the name of the workspace you'll work with +❯ ./terraform-init.sh -w workspacename ``` -Above error (or similar) is likely caused by missing initialization for some `persistent` resources. -Fix the persistent initialization by running `terraform-init.sh` then run `terraform apply` again. diff --git a/terraform/jenkins-controller.tf b/terraform/jenkins-controller.tf index 3e331e3e..c1b1916c 100644 --- a/terraform/jenkins-controller.tf +++ b/terraform/jenkins-controller.tf @@ -53,7 +53,7 @@ module "jenkins_controller_vm" { "path" = "/var/lib/rclone-http/env" }, { - content = "AZURE_STORAGE_ACCOUNT_NAME=${azurerm_storage_account.jenkins_artifacts.name}", + content = "AZURE_STORAGE_ACCOUNT_NAME=${data.azurerm_storage_account.jenkins_artifacts.name}", "path" = "/var/lib/rclone-jenkins-artifacts/env" }, # Render /etc/nix/machines with terraform. 
In the future, we might want to @@ -190,7 +190,7 @@ resource "azurerm_role_assignment" "jenkins_controller_access_storage" { # Allow the VM to *write* to (and read from) the jenkins artifacts bucket resource "azurerm_role_assignment" "jenkins_controller_access_artifacts" { - scope = azurerm_storage_container.jenkins_artifacts_1.resource_manager_id + scope = data.azurerm_storage_container.jenkins_artifacts_1.resource_manager_id role_definition_name = "Storage Blob Data Contributor" principal_id = module.jenkins_controller_vm.virtual_machine_identity_principal_id } diff --git a/terraform/main.tf b/terraform/main.tf index 4086c9c2..bad7c6a5 100644 --- a/terraform/main.tf +++ b/terraform/main.tf @@ -22,11 +22,6 @@ terraform { source = "numtide/secret" } } -} - -################################################################################ - -terraform { # Backend for storing terraform state (see ../state-storage) backend "azurerm" { # resource_group_name and storage_account_name are set by the callee @@ -41,16 +36,6 @@ terraform { # Current signed-in user data "azurerm_client_config" "current" {} -variable "envtype" { - type = string - description = "Set the environment type; determines e.g. 
the Azure VM sizes" - default = "priv" - validation { - condition = contains(["priv", "dev", "prod"], var.envtype) - error_message = "Must be either \"priv\", \"dev\", or \"prod\"" - } -} - variable "envfile" { type = string description = "Error out if .env-file is missing" @@ -63,7 +48,7 @@ variable "envfile" { variable "convince" { type = bool - description = "Protect against accidental dev or prod environment deployment" + description = "Protect against accidental non-priv environment deployment" default = false } @@ -87,23 +72,31 @@ locals { # https://github.com/claranet/terraform-azurerm-regions/blob/master/REGIONS.md shortloc = module.azure_region.location_short - assert_region_mismatch_1 = regex( - ("${local.shortloc}" != "eun" && local.envs["storage_account_rg_name"] != "ghaf-infra-state-${local.shortloc}") ? - "((Force invalid regex pattern)\n\nERROR: region (non-eun) mismatch, re-run terraform-init.sh" : "", "") - - assert_region_mismatch_2 = regex( - ("${local.shortloc}" == "eun" && local.envs["storage_account_rg_name"] != "ghaf-infra-state") ? - "((Force invalid regex pattern)\n\nERROR: region (eun) mismatch, re-run terraform-init.sh" : "", "") - # Sanitize workspace name ws = substr(replace(lower(terraform.workspace), "/[^a-z0-9]/", ""), 0, 16) + ext_builder_machines = [ + "ssh://remote-build@builder.vedenemo.dev x86_64-linux /etc/secrets/remote-build-ssh-key 32 3 kvm,nixos-test,benchmark,big-parallel - -", + "ssh://remote-build@hetzarm.vedenemo.dev aarch64-linux /etc/secrets/remote-build-ssh-key 40 3 kvm,nixos-test,benchmark,big-parallel - -" + ] + ext_builder_keyscan = ["builder.vedenemo.dev", "hetzarm.vedenemo.dev"] + binary_cache_url_common = "https://ghaf-binary-cache-${local.ws}.${azurerm_resource_group.infra.location}.cloudapp.azure.com" + # TODO: adding multiple urls as comma-and-whitespace separated + # string is more or less a hack. 
If we plan to have multiple domains + # per host permanently, we could make the below variable a list, and + # do the templating to a comma-and-whitespace separated list of urls + # before we pass it to caddy. + binary_cache_url_dev = "${local.binary_cache_url_common}, https://dev-cache.vedenemo.dev" + binary_cache_url_prod = "${local.binary_cache_url_common}, https://prod-cache.vedenemo.dev" + # Environment-specific configuration options. # See Azure vm sizes and specs at: # https://azure.microsoft.com/en-us/pricing/details/virtual-machines/linux # E.g. 'Standard_D2_v3' means: 2 vCPU, 8 GiB RAM opts = { priv = { + builder_sshkey_id = "ext" + persistent_id = "priv" vm_size_binarycache = "Standard_D2_v3" osdisk_size_binarycache = "50" vm_size_builder_x86 = "Standard_D2_v3" @@ -113,16 +106,14 @@ locals { osdisk_size_controller = "150" num_builders_x86 = 0 num_builders_aarch64 = 0 - # 'priv' and 'dev' environments use the same binary cache signing key - binary_cache_public_key = "ghaf-infra-dev:EdgcUJsErufZitluMOYmoJDMQE+HFyveI/D270Cr84I=" - binary_cache_url = "https://ghaf-binary-cache-${local.ws}.${azurerm_resource_group.infra.location}.cloudapp.azure.com" - ext_builder_machines = [ - "ssh://remote-build@builder.vedenemo.dev x86_64-linux /etc/secrets/remote-build-ssh-key 32 3 kvm,nixos-test,benchmark,big-parallel - -", - "ssh://remote-build@hetzarm.vedenemo.dev aarch64-linux /etc/secrets/remote-build-ssh-key 40 3 kvm,nixos-test,benchmark,big-parallel - -" - ] - ext_builder_keyscan = ["builder.vedenemo.dev", "hetzarm.vedenemo.dev"] + binary_cache_public_key = "priv-cache.vedenemo.dev~1:FmJGfAkx+2fhqpzHGT/V3M35VcPm2pfkCuiTo8xQD0A=" + binary_cache_url = local.binary_cache_url_common + ext_builder_machines = local.ext_builder_machines + ext_builder_keyscan = local.ext_builder_keyscan } dev = { + builder_sshkey_id = "ext" + persistent_id = "prod" vm_size_binarycache = "Standard_D4_v3" osdisk_size_binarycache = "250" vm_size_builder_x86 = "Standard_D16_v3" @@ -132,21 
+123,33 @@ locals { osdisk_size_controller = "1000" num_builders_x86 = 0 num_builders_aarch64 = 0 - # 'priv' and 'dev' environments use the same binary cache signing key - binary_cache_public_key = "ghaf-infra-dev:EdgcUJsErufZitluMOYmoJDMQE+HFyveI/D270Cr84I=" - # TODO: adding multiple urls as comma-and-whitespace separated - # string is more or less a hack. If we plan to have multiple domains - # per host permanently, we could make the below variable a list, and - # do the templating to a comma-and-whitespace separated list of urls - # before we pass it to caddy. - binary_cache_url = "https://ghaf-binary-cache-${local.ws}.${azurerm_resource_group.infra.location}.cloudapp.azure.com, https://dev-cache.vedenemo.dev" - ext_builder_machines = [ - "ssh://remote-build@builder.vedenemo.dev x86_64-linux /etc/secrets/remote-build-ssh-key 32 3 kvm,nixos-test,benchmark,big-parallel - -", - "ssh://remote-build@hetzarm.vedenemo.dev aarch64-linux /etc/secrets/remote-build-ssh-key 40 3 kvm,nixos-test,benchmark,big-parallel - -" - ] - ext_builder_keyscan = ["builder.vedenemo.dev", "hetzarm.vedenemo.dev"] + # 'dev' and 'prod' use the same binary cache storage and key + binary_cache_public_key = "prod-cache.vedenemo.dev~1:JcytRNMJJdYJVQCYwLNsrfVhct5dhCK2D3fa6O1WHOI=" + binary_cache_url = local.binary_cache_url_dev + ext_builder_machines = local.ext_builder_machines + ext_builder_keyscan = local.ext_builder_keyscan } prod = { + builder_sshkey_id = "ext" + persistent_id = "prod" + vm_size_binarycache = "Standard_D4_v3" + osdisk_size_binarycache = "250" + vm_size_builder_x86 = "Standard_D16_v3" + vm_size_builder_aarch64 = "Standard_D8ps_v5" + osdisk_size_builder = "250" + vm_size_controller = "Standard_E4_v5" + osdisk_size_controller = "1000" + num_builders_x86 = 0 + num_builders_aarch64 = 0 + # 'dev' and 'prod' use the same binary cache storage and key + binary_cache_public_key = "prod-cache.vedenemo.dev~1:JcytRNMJJdYJVQCYwLNsrfVhct5dhCK2D3fa6O1WHOI=" + binary_cache_url = 
local.binary_cache_url_prod + ext_builder_machines = local.ext_builder_machines + ext_builder_keyscan = local.ext_builder_keyscan + } + release = { + builder_sshkey_id = "release" + persistent_id = "release" vm_size_binarycache = "Standard_D4_v3" osdisk_size_binarycache = "250" vm_size_builder_x86 = "Standard_D64_v3" @@ -156,8 +159,8 @@ locals { osdisk_size_controller = "1000" num_builders_x86 = 1 num_builders_aarch64 = 1 - binary_cache_public_key = "ghaf-infra-prod:DIrhJsqehIxjuUQ93Fqx6gmo4cDgn5srW5dedvMbqD0=" - binary_cache_url = "https://ghaf-binary-cache-${local.ws}.${azurerm_resource_group.infra.location}.cloudapp.azure.com" + binary_cache_public_key = "release-cache.vedenemo.dev~1:kxSUdZvNF8ax7hpJMu+PexEBQGUkZDqeugu+pwz/ACk=" + binary_cache_url = local.binary_cache_url_common ext_builder_machines = [] ext_builder_keyscan = [] } @@ -166,24 +169,23 @@ locals { # Read ssh-keys.yaml into local.ssh_keys ssh_keys = yamldecode(file("../ssh-keys.yaml")) - # This determines the configuration options used in the - # ghaf-infra instance (defines e.g. vm_sizes and number of builders). - # If workspace name is "dev" or "prod" use the workspace name as - # envtype, otherwise, use the value from var.envtype. - conf = local.ws == "dev" || local.ws == "prod" ? local.ws : var.envtype + # Determine the configuration options used in the ghaf-infra instance + # based on the workspace name + is_release = length(regexall("^release.*", local.ws)) > 0 + is_prod = length(regexall("^prod.*", local.ws)) > 0 + is_dev = length(regexall("^dev.*", local.ws)) > 0 + conf = local.is_release ? "release" : local.is_prod ? "prod" : local.is_dev ? "dev" : "priv" - # Protect against accidental dev or prod environment deployment by requiring + # Protect against accidental non-priv environment deployment by requiring # variable -var="convince=true". assert_accidental_deployment = regex( ("${local.conf}" != "priv" && !(var.convince)) ? 
- "((Force invalid regex pattern\n\nERROR: Deployment to 'prod' or 'dev' requires variable 'convince'" : "", "") - - # Selects the persistent data (see ./persistent) used in the ghaf-infra - # instance; currently either "dev" or "prod" based on the environment type: - # "priv" ==> "dev" - # "dev" ==> "dev" - # "prod" ==> "prod" - persistent_data = local.conf == "priv" ? "dev" : local.conf + "((Force invalid regex pattern\n\nERROR: Deployment to non-priv requires variable 'convince'" : "", "") + + # Selects the persistent data for this ghaf-infra instance (see ./persistent) + persistent_rg = local.envs["persistent_rg_name"] + builder_sshkey_id = "id0${local.opts[local.conf].builder_sshkey_id}${local.shortloc}" + persistent_id = "id0${local.opts[local.conf].persistent_id}${local.shortloc}" } ################################################################################ @@ -196,6 +198,8 @@ resource "azurerm_resource_group" "infra" { ################################################################################ +# Environment specific resources + # Virtual network resource "azurerm_virtual_network" "vnet" { name = "ghaf-infra-vnet" @@ -220,10 +224,7 @@ resource "azurerm_subnet" "builders" { address_prefixes = ["10.0.4.0/28"] } -################################################################################ - # Storage account and storage container used to store VM images - resource "azurerm_storage_account" "vm_images" { name = "img${local.ws}${local.shortloc}" resource_group_name = azurerm_resource_group.infra.name @@ -241,43 +242,22 @@ resource "azurerm_storage_container" "vm_images" { ################################################################################ -# Data sources to access state data, see ./state-storage +# Data sources to access terraform state, see ./state-storage data "azurerm_storage_account" "tfstate" { name = local.envs["storage_account_name"] resource_group_name = local.envs["storage_account_rg_name"] } -# Storage account and storage 
container used to store Jenkins artifacts -resource "azurerm_storage_account" "jenkins_artifacts" { - name = "art${local.ws}${local.shortloc}" - resource_group_name = azurerm_resource_group.infra.name - location = azurerm_resource_group.infra.location - account_tier = "Standard" - account_replication_type = "LRS" - allow_nested_items_to_be_public = false -} - -resource "azurerm_storage_container" "jenkins_artifacts_1" { - name = "jenkins-artifacts-v1" - storage_account_name = azurerm_storage_account.jenkins_artifacts.name - container_access_type = "private" -} - -# Data sources to access 'persistent' data, see ./persistent +################################################################################ -data "azurerm_storage_account" "binary_cache" { - name = "ghafbincache${local.persistent_data}${local.shortloc}" - resource_group_name = "ghaf-infra-persistent-${local.shortloc}" -} -data "azurerm_storage_container" "binary_cache_1" { - name = "binary-cache-v1" - storage_account_name = data.azurerm_storage_account.binary_cache.name -} +# Data sources to access 'persistent' data +# see ./persistent and ./persistent/resources +# Builder ssh key data "azurerm_key_vault" "ssh_remote_build" { - name = "ssh-builder-${local.persistent_data}-${local.shortloc}" - resource_group_name = "ghaf-infra-persistent-${local.shortloc}" + name = "sshb-${local.builder_sshkey_id}" + resource_group_name = local.persistent_rg provider = azurerm } @@ -293,9 +273,21 @@ data "azurerm_key_vault_secret" "ssh_remote_build_pub" { provider = azurerm } +# Binary cache storage +data "azurerm_storage_account" "binary_cache" { + name = "bches${local.persistent_id}" + resource_group_name = local.persistent_rg +} + +data "azurerm_storage_container" "binary_cache_1" { + name = "binary-cache-v1" + storage_account_name = data.azurerm_storage_account.binary_cache.name +} + +# Binary cache signing key data "azurerm_key_vault" "binary_cache_signing_key" { - name = 
"bche-sigkey-${local.persistent_data}-${local.shortloc}"
-  resource_group_name = "ghaf-infra-persistent-${local.shortloc}"
+  name                = "bchek-${local.persistent_id}"
+  resource_group_name = local.persistent_rg
   provider            = azurerm
 }
 
@@ -314,14 +306,27 @@ data "azurerm_key_vault" "ghaf_devenv_ca" {
 # Data sources to access 'workspace-specific persistent' data
 # see: ./persistent/workspace-specific
 
+# Caddy state disk: binary cache
 data "azurerm_managed_disk" "binary_cache_caddy_state" {
   name                = "binary-cache-vm-caddy-state-${local.ws}"
-  resource_group_name = "ghaf-infra-persistent-${local.shortloc}"
+  resource_group_name = local.persistent_rg
 }
 
+# Caddy state disk: jenkins controller
 data "azurerm_managed_disk" "jenkins_controller_caddy_state" {
   name                = "jenkins-controller-vm-caddy-state-${local.ws}"
-  resource_group_name = "ghaf-infra-persistent-${local.shortloc}"
+  resource_group_name = local.persistent_rg
+}
+
+# Jenkins artifacts storage
+data "azurerm_storage_account" "jenkins_artifacts" {
+  name                = "artifact${local.ws}"
+  resource_group_name = local.persistent_rg
+}
+
+data "azurerm_storage_container" "jenkins_artifacts_1" {
+  name                 = "jenkins-artifacts-v1"
+  storage_account_name = data.azurerm_storage_account.jenkins_artifacts.name
 }
 
 ################################################################################
diff --git a/terraform/persistent/main.tf b/terraform/persistent/main.tf
index 88c1670c..0767a308 100644
--- a/terraform/persistent/main.tf
+++ b/terraform/persistent/main.tf
@@ -16,11 +16,6 @@ terraform {
       source = "numtide/secret"
     }
   }
-}
-
-################################################################################
-
-terraform {
   # Backend for storing terraform state (see ../state-storage)
   backend "azurerm" {
     # resource_group_name and storage_account_name are set by the callee
@@ -39,6 +34,11 @@ variable "location" {
   description = "Azure region into which the resources will be deployed"
 }
 
+module "azure_region" {
+  source       = "claranet/regions/azurerm"
+  azure_region = var.location
+}
+
 locals {
   # Raise an error if workspace is 'default',
   # this is a workaround to missing asserts in terraform:
@@ -46,14 +46,14 @@
     (terraform.workspace == "default") ?
     "((Force invalid regex pattern)\n\nERROR: workspace 'default' is not allowed" : "", "")
 
-  # Sanitize workspace name:
-  # Workspace name defines the persistent instance
-  ws = substr(replace(lower(terraform.workspace), "/[^a-z0-9]/", ""), 0, 16)
+  # Short name of the Azure region, see:
+  # https://github.com/claranet/terraform-azurerm-regions/blob/master/REGIONS.md
+  shortloc = module.azure_region.location_short
 }
 
 # Resource group
 resource "azurerm_resource_group" "persistent" {
-  name     = "ghaf-infra-persistent-${local.ws}"
+  name     = terraform.workspace
   location = var.location
   lifecycle {
     # Fails any plan that requires this resource to be destroyed.
@@ -67,78 +67,14 @@ data "azurerm_client_config" "current" {}
 
 ################################################################################
 
-# Resources
-
-# secret_resouce must be created on import, e.g.:
-#
-# nix-store --generate-binary-cache-key foo secret-key public-key
-# terraform import secret_resource.binary_cache_signing_key_dev "$(< ./secret-key)"
-# terraform apply
-#
-# Ghaf-infra automates the creation in 'init-ghaf-infra.sh'
-resource "secret_resource" "binary_cache_signing_key_dev" {
-  lifecycle {
-    prevent_destroy = true
-  }
-}
-resource "secret_resource" "binary_cache_signing_key_prod" {
-  lifecycle {
-    prevent_destroy = true
-  }
-}
-
-module "builder_ssh_key_prod" {
-  source = "./builder-ssh-key"
-  # Must be globally unique
-  builder_ssh_keyvault_name = "ssh-builder-prod-${local.ws}"
-  resource_group_name       = azurerm_resource_group.persistent.name
-  location                  = azurerm_resource_group.persistent.location
-  tenant_id                 = data.azurerm_client_config.current.tenant_id
-}
-
-module "builder_ssh_key_dev" {
+# Shared builder ssh key used to access 'external' builders
+module "builder_ssh_key" {
   source = "./builder-ssh-key"
-  # Must be globally unique
-  builder_ssh_keyvault_name = "ssh-builder-dev-${local.ws}"
+  # Must be globally unique, max 24 characters
+  builder_ssh_keyvault_name = "sshb-id0ext${local.shortloc}"
   resource_group_name       = azurerm_resource_group.persistent.name
   location                  = azurerm_resource_group.persistent.location
   tenant_id                 = data.azurerm_client_config.current.tenant_id
 }
 
-module "binary_cache_sigkey_prod" {
-  source = "./binary-cache-sigkey"
-  # Must be globally unique
-  bincache_keyvault_name = "bche-sigkey-prod-${local.ws}"
-  secret_resource        = secret_resource.binary_cache_signing_key_prod
-  resource_group_name    = azurerm_resource_group.persistent.name
-  location               = azurerm_resource_group.persistent.location
-  tenant_id              = data.azurerm_client_config.current.tenant_id
-}
-
-module "binary_cache_sigkey_dev" {
-  source = "./binary-cache-sigkey"
-  # Must be globally unique
-  bincache_keyvault_name = "bche-sigkey-dev-${local.ws}"
-  secret_resource        = secret_resource.binary_cache_signing_key_dev
-  resource_group_name    = azurerm_resource_group.persistent.name
-  location               = azurerm_resource_group.persistent.location
-  tenant_id              = data.azurerm_client_config.current.tenant_id
-}
-
-module "binary_cache_storage_prod" {
-  source = "./binary-cache-storage"
-  # Must be globally unique
-  bincache_storage_account_name = "ghafbincacheprod${local.ws}"
-  resource_group_name           = azurerm_resource_group.persistent.name
-  location                      = azurerm_resource_group.persistent.location
-}
-
-module "binary_cache_storage_dev" {
-  source = "./binary-cache-storage"
-  # Must be globally unique
-  bincache_storage_account_name = "ghafbincachedev${local.ws}"
-  resource_group_name           = azurerm_resource_group.persistent.name
-  location                      = azurerm_resource_group.persistent.location
-}
-
 ################################################################################
diff --git a/terraform/persistent/resources/main.tf b/terraform/persistent/resources/main.tf
new file mode 100644
index 00000000..5e917408
--- /dev/null
+++ b/terraform/persistent/resources/main.tf
@@ -0,0 +1,100 @@
+# SPDX-FileCopyrightText: 2022-2024 TII (SSRC) and the Ghaf contributors
+# SPDX-License-Identifier: Apache-2.0
+
+provider "azurerm" {
+  # https://github.com/hashicorp/terraform-provider-azurerm/issues/24804
+  skip_provider_registration = true
+  features {}
+}
+
+terraform {
+  required_providers {
+    azurerm = {
+      source = "hashicorp/azurerm"
+    }
+    secret = {
+      source = "numtide/secret"
+    }
+  }
+  # Backend for storing terraform state (see ../../state-storage)
+  backend "azurerm" {
+    # resource_group_name and storage_account_name are set by the callee
+    # from command line in terraform init, see terraform-init.sh
+    container_name = "ghaf-infra-tfstate-container"
+    key            = "ghaf-infra-persistent.tfstate"
+  }
+}
+
+################################################################################
+
+# Variables
+variable "persistent_resource_group" {
+  type        = string
+  description = "Parent resource group name"
+}
+
+locals {
+  # Raise an error if workspace is 'default',
+  # this is a workaround to missing asserts in terraform:
+  assert_workspace_not_default = regex(
+    (terraform.workspace == "default") ?
+    "((Force invalid regex pattern)\n\nERROR: workspace 'default' is not allowed" : "", "")
+
+  # Sanitize workspace name:
+  ws = substr(replace(lower(terraform.workspace), "/[^a-z0-9]/", ""), 0, 16)
+}
+
+# Data source to access persistent resource group (see ../main.tf)
+data "azurerm_resource_group" "persistent" {
+  name = var.persistent_resource_group
+}
+
+# Current signed-in user
+data "azurerm_client_config" "current" {}
+
+
+################################################################################
+
+# Resources
+
+# secret_resource must be created on import, e.g.:
+#
+# nix-store --generate-binary-cache-key foo secret-key public-key
+# terraform import secret_resource.binary_cache_signing_key "$(< ./secret-key)"
+# terraform apply
+#
+# Ghaf-infra automates the creation in 'terraform-init.sh'
+resource "secret_resource" "binary_cache_signing_key" {
+  lifecycle {
+    prevent_destroy = true
+  }
+}
+
+module "builder_ssh_key" {
+  source = "../builder-ssh-key"
+  # Must be globally unique, max 24 characters
+  builder_ssh_keyvault_name = "sshb-id0${local.ws}"
+  resource_group_name       = data.azurerm_resource_group.persistent.name
+  location                  = data.azurerm_resource_group.persistent.location
+  tenant_id                 = data.azurerm_client_config.current.tenant_id
+}
+
+module "binary_cache_sigkey" {
+  source = "../binary-cache-sigkey"
+  # Must be globally unique, max 24 characters
+  bincache_keyvault_name = "bchek-id0${local.ws}"
+  secret_resource        = secret_resource.binary_cache_signing_key
+  resource_group_name    = data.azurerm_resource_group.persistent.name
+  location               = data.azurerm_resource_group.persistent.location
+  tenant_id              = data.azurerm_client_config.current.tenant_id
+}
+
+module "binary_cache_storage" {
+  source = "../binary-cache-storage"
+  # Must be globally unique, max 24 characters
+  bincache_storage_account_name = "bchesid0${local.ws}"
+  resource_group_name           = data.azurerm_resource_group.persistent.name
+  location                      = data.azurerm_resource_group.persistent.location
+}
+
+################################################################################
\ No newline at end of file
diff --git a/terraform/persistent/workspace-specific/main.tf b/terraform/persistent/workspace-specific/main.tf
index 58b1a97d..2ccf6569 100644
--- a/terraform/persistent/workspace-specific/main.tf
+++ b/terraform/persistent/workspace-specific/main.tf
@@ -13,11 +13,6 @@ terraform {
       source = "hashicorp/azurerm"
     }
   }
-}
-
-################################################################################
-
-terraform {
   # Backend for storing terraform state (see ../../state-storage)
   backend "azurerm" {
     # resource_group_name and storage_account_name are set by the callee
@@ -59,6 +54,7 @@ data "azurerm_client_config" "current" {}
 
 # Resources
 
+# Caddy state disk: binary cache
 resource "azurerm_managed_disk" "binary_cache_caddy_state" {
   name                = "binary-cache-vm-caddy-state-${local.ws}"
   resource_group_name = data.azurerm_resource_group.persistent.name
@@ -68,6 +64,7 @@ resource "azurerm_managed_disk" "binary_cache_caddy_state" {
   disk_size_gb        = 1
 }
 
+# Caddy state disk: jenkins controller
 resource "azurerm_managed_disk" "jenkins_controller_caddy_state" {
   name                = "jenkins-controller-vm-caddy-state-${local.ws}"
   resource_group_name = data.azurerm_resource_group.persistent.name
@@ -77,4 +74,20 @@ resource "azurerm_managed_disk" "jenkins_controller_caddy_state" {
   disk_size_gb        = 1
 }
 
+# Jenkins artifacts storage account and container
+resource "azurerm_storage_account" "jenkins_artifacts" {
+  name                            = "artifact${local.ws}"
+  resource_group_name             = data.azurerm_resource_group.persistent.name
+  location                        = data.azurerm_resource_group.persistent.location
+  account_tier                    = "Standard"
+  account_replication_type        = "LRS"
+  allow_nested_items_to_be_public = false
+}
+
+resource "azurerm_storage_container" "jenkins_artifacts_1" {
+  name                  = "jenkins-artifacts-v1"
+  storage_account_name  = azurerm_storage_account.jenkins_artifacts.name
+  container_access_type = "private"
+}
+
 ################################################################################
diff --git a/terraform/state-storage/tfstate-storage.tf b/terraform/state-storage/tfstate-storage.tf
index e0b877e8..67fd0261 100644
--- a/terraform/state-storage/tfstate-storage.tf
+++ b/terraform/state-storage/tfstate-storage.tf
@@ -10,6 +10,8 @@ terraform {
 }
 
 provider "azurerm" {
+  # https://github.com/hashicorp/terraform-provider-azurerm/issues/24804
+  skip_provider_registration = true
   features {}
 }
 
@@ -21,21 +23,27 @@ variable "location" {
   description = "Azure region into which the resources will be deployed"
 }
 
+variable "account_name" {
+  type        = string
+  description = "Storage account name must be globally unique, 3-24 lowercase characters"
+  default     = ""
+  validation {
+    condition     = length(var.account_name) > 0
+    error_message = "The account_name must be set to a non-empty string."
+  }
+}
+
 locals {
   # Raise an error if workspace is 'default',
   # this is a workaround to missing asserts in terraform:
   assert_workspace_not_default = regex(
     (terraform.workspace == "default") ?
     "((Force invalid regex pattern)\n\nERROR: workspace 'default' is not allowed" : "", "")
-
-  # Sanitize workspace name:
-  # Workspace name defines the state-storage instance
-  ws = substr(replace(lower(terraform.workspace), "/[^a-z0-9]/", ""), 0, 16)
 }
 
 # Resource group
 resource "azurerm_resource_group" "rg" {
-  name     = "ghaf-infra-state-${local.ws}"
+  name     = terraform.workspace
   location = var.location
   lifecycle {
     prevent_destroy = true
@@ -45,7 +53,7 @@ resource "azurerm_resource_group" "rg" {
 # Storage container
 resource "azurerm_storage_account" "tfstate" {
   # This must be globally unique, max 24 characters
-  name                     = "ghafinfrastate${local.ws}"
+  name                     = var.account_name
   resource_group_name      = azurerm_resource_group.rg.name
   location                 = azurerm_resource_group.rg.location
   account_tier             = "Standard"
diff --git a/terraform/terraform-init.sh b/terraform/terraform-init.sh
index c7fe201a..e8f80eb5 100755
--- a/terraform/terraform-init.sh
+++ b/terraform/terraform-init.sh
@@ -9,72 +9,96 @@ set -o pipefail # exit if any pipeline command fails
 
 ################################################################################
 
-# This script is a helper to initialize the ghaf terraform infra:
-# - init terraform state storage
-# - init persistent secrets such as binary cache signing key (per environment)
-# - init persistent binary cache storage (per environment)
-# - init workspace-specific persistent such as caddy disks (per environment)
-
-################################################################################
+# This script is a helper to initialize the ghaf terraform infra.
 
 MYDIR=$(dirname "$0")
 MYNAME=$(basename "$0")
+RED='' NONE=''
 
 ################################################################################
 
 usage () {
-    echo "Usage: $MYNAME [-h] [-v] [-l LOCATION]"
     echo ""
-    echo "Initialize terraform state and persistent storage for ghaf-infra on Azure."
-    echo "By default, the Azure LOCATION for the ghaf-infra will be initialized to"
-    echo "North Europe. Use -l option to specify a different LOCATION."
+    echo "Usage: $MYNAME [-h] [-v] [-l LOCATION] -w WORKSPACE"
     echo ""
-    echo "This script will not do anything if the initialization has already been"
-    echo "done. In other words, it's safe to run this script many times. It will"
-    echo "not destroy or re-initialize anything if the initialization has already"
-    echo "taken place."
+    echo "Initialize ghaf-infra workspace with the given name (-w WORKSPACE). This"
+    echo "script will not destroy or re-initialize anything if the initialization"
+    echo "has already been done."
    echo ""
    echo "Options:"
-    echo " -h    Print this help message"
-    echo " -v    Set the script verbosity to DEBUG"
+    echo " -w    Init workspace with the given WORKSPACE name. Name must be 3-16"
+    echo "       lowercase characters or numbers i.e. [a-z0-9]{3,16}. Does not"
+    echo "       create a new workspace if WORKSPACE already exists in the terraform"
+    echo "       remote state, but switches to the existing workspace"
    echo " -l    Azure location name on which the infra will be initialized. See the"
    echo "       Azure location names with command 'az account list-locations -o table'."
-    echo "       By default, the LOCATION is set to 'northeurope' i.e. '-l northeurope'"
+    echo "       The default LOCATION is 'northeurope' i.e. '-l northeurope'"
+    echo " -v    Set the script verbosity to DEBUG"
+    echo " -h    Print this help message"
+    echo ""
+    echo "Other options:"
+    echo " -s    Init state storage"
+    echo " -p    Init persistent storage"
    echo ""
    echo "Example:"
    echo ""
-    echo "  Following command initializes terraform state and persistent storage"
-    echo "  on Azure location uaenorth (United Arab Emirates North):"
+    echo "  Following command initializes ghaf-infra instance 'myghafinfra'"
+    echo "  on the default Azure location (northeurope):"
    echo ""
-    echo "  $MYNAME -l uaenorth"
+    echo "  $MYNAME -w myghafinfra"
    echo ""
 }
 
 ################################################################################
 
+print_err () {
+    printf "${RED}Error:${NONE} %b\n" "$1" >&2
+}
+
 argparse () {
-    DEBUG="false"; LOCATION="northeurope"; OPTIND=1
-    while getopts "hvl:" copt; do
+    # Colorize output if stdout is to a terminal (and not to pipe or file)
+    if [ -t 1 ]; then RED='\033[1;31m'; NONE='\033[0m'; fi
+    # Parse arguments
+    OUT="/dev/null"; LOCATION="northeurope"; WORKSPACE=""; OPT_s=""; OPT_p="";
+    OPTIND=1
+    while getopts "hw:l:spv" copt; do
        case "${copt}" in
            h)
                usage; exit 0 ;;
            v)
-                DEBUG="true" ;;
+                set -x; OUT=/dev/stderr ;;
            l)
                LOCATION="$OPTARG" ;;
+            w)
+                WORKSPACE="$OPTARG"
+                if ! [[ "$WORKSPACE" =~ ^[a-z0-9]{3,16}$ ]]; then
+                    print_err "invalid workspace name: '$WORKSPACE'";
+                    usage; exit 1
+                fi
+                ;;
+            s)
+                OPT_s="true" ;;
+            p)
+                OPT_p="true" ;;
            *)
-                echo "Error: unrecognized option"; usage; exit 1 ;;
+                print_err "unrecognized option"; usage; exit 1 ;;
        esac
    done
    shift $((OPTIND-1))
    if [ -n "$*" ]; then
-        echo "Error: unsupported positional argument(s): '$*'"; exit 1
+        print_err "unsupported positional argument(s): '$*'"; exit 1
+    fi
+    if [[ -z "$WORKSPACE" && -z "$OPT_s" && -z "$OPT_p" ]]; then
+        # Intentionally don't promote '-s' or '-p' usage. They are safe to run
+        # but most users of this script should not need to run (or re-run) them
+        print_err "missing mandatory option '-w'"
+        usage; exit 1
    fi
 }
 
 exit_unless_command_exists () {
    if ! command -v "$1" &>"$OUT"; then
-        echo "Error: command '$1' is not installed (Hint: are you inside a nix-shell?)" >&2
+        print_err "command '$1' is not installed (Hint: are you inside a nix-shell?)"
        exit 1
    fi
 }
@@ -93,113 +117,102 @@ azure_location_to_shortloc () {
        echo "[+] Unsupported location '$LOCATION'"
        exit 1
    fi
-    echo "[+] Using location short name '$SHORTLOC'"
+}
+
+set_env () {
+    # Assign variables STATE_RG, STATE_ACCOUNT and PERSISTENT_RG: these
+    # variables are used to select the remote state storage and persistent
+    # data used in this ghaf-infra instance.
+    STATE_RG="ghaf-infra-0-state-${SHORTLOC}"
+    STATE_ACCOUNT="ghafinfra0state${SHORTLOC}"
+    PERSISTENT_RG="ghaf-infra-0-persistent-${SHORTLOC}"
+    echo "[+] Using state '$STATE_RG'"
+    echo "[+] Using persistent '$PERSISTENT_RG'"
+    echo "storage_account_rg_name=$STATE_RG" >"$MYDIR/.env"
+    echo "storage_account_name=$STATE_ACCOUNT" >>"$MYDIR/.env"
+    echo "persistent_rg_name=$PERSISTENT_RG" >>"$MYDIR/.env"
 }
 
 init_state_storage () {
    echo "[+] Initializing state storage"
-    # Assign variables STATE_RG and STATE_ACCOUNT: these variables are
-    # used to select the remote state storage used in this instance.
-    if [ "$SHORTLOC" = "eun" ]; then
-        # State file for 'eun' is named differently to not have to move
-        # all existing state files to new 'ghaf-infra-state-eun' resource
-        # group from the one currently used (see below).
-        # At some point we might want to rename the 'eun' state also,
-        # but it would require manual actions to import existing environments.
-        STATE_RG="ghaf-infra-state"
-        STATE_ACCOUNT="ghafinfratfstatestorage"
-    else
-        STATE_RG="ghaf-infra-state-$SHORTLOC"
-        STATE_ACCOUNT="ghafinfrastate$SHORTLOC"
-    fi
-    echo -e "storage_account_rg_name=$STATE_RG\nstorage_account_name=$STATE_ACCOUNT" >"$MYDIR/.env"
-
    # See: ./state-storage
    pushd "$MYDIR/state-storage" >"$OUT"
    terraform init -upgrade >"$OUT"
-    terraform workspace select "$SHORTLOC" &>"$OUT" || terraform workspace new "$SHORTLOC" >"$OUT"
-    if ! terraform apply -var="location=$LOCATION" -auto-approve &>"$OUT"; then
+    terraform workspace select -or-create "$STATE_RG" >"$OUT"
+    if az resource list -g "$STATE_RG" &>"$OUT"; then
        echo "[+] State storage is already initialized"
+        popd >"$OUT"; return
    fi
+    terraform apply -var="location=$LOCATION" -var="account_name=$STATE_ACCOUNT" -auto-approve >"$OUT"
    popd >"$OUT"
 }
 
 import_bincache_sigkey () {
-    env="$1"
-    echo "[+] Importing binary cache signing key '$env'"
-    # Skip import if signing key is imported already
-    if terraform state list | grep -q secret_resource.binary_cache_signing_key_"$env"; then
+    key_name="$1"
+    echo "[+] Importing binary cache signing key '$key_name'"
+    if terraform state list | grep -q secret_resource.binary_cache_signing_key; then
        echo "[+] Binary cache signing key is already imported"
        return
    fi
-    # Generate and import the key
-    nix-store --generate-binary-cache-key "ghaf-infra-$env" sigkey-secret.tmp "sigkey-public-$env.tmp"
-    terraform import secret_resource.binary_cache_signing_key_"$env" "$(< ./sigkey-secret.tmp)"
+    echo "[+] Generating binary cache signing key '$key_name'"
+    # https://nix.dev/manual/nix/latest/command-ref/nix-store/generate-binary-cache-key
+    nix-store --generate-binary-cache-key "$key_name" sigkey-secret.tmp "sigkey-public-$key_name.tmp"
+    var_rg="-var=persistent_resource_group=$PERSISTENT_RG"
+    terraform import "$var_rg" secret_resource.binary_cache_signing_key "$(< ./sigkey-secret.tmp)"
+    terraform apply "$var_rg" -auto-approve >"$OUT"
    rm -f sigkey-secret.tmp
 }
 
-init_persistent () {
-    echo "[+] Initializing persistent"
-    # See: ./persistent
+run_terraform_init () {
+    # Run terraform init declaring the remote state
+    opt_state_rg="-backend-config=resource_group_name=$STATE_RG"
+    opt_state_acc="-backend-config=storage_account_name=$STATE_ACCOUNT"
+    terraform init -upgrade "$opt_state_rg" "$opt_state_acc" >"$OUT"
+}
+
+init_persistent_storage () {
+    echo "[+] Initializing persistent storage"
    pushd "$MYDIR/persistent" >"$OUT"
-    # Init persistent, setting the backend resource group and storage account
-    # names from command-line:
-    # https://developer.hashicorp.com/terraform/language/settings/backends/azurerm#backend-azure-ad-user-via-azure-cli
-    terraform init -upgrade -backend-config="resource_group_name=$STATE_RG" -backend-config="storage_account_name=$STATE_ACCOUNT" >"$OUT"
-    terraform workspace select "$SHORTLOC" &>"$OUT" || terraform workspace new "$SHORTLOC" >"$OUT"
-    import_bincache_sigkey "prod"
-    import_bincache_sigkey "dev"
-    echo "[+] Applying possible changes in ./persistent"
+    run_terraform_init
+    terraform workspace select -or-create "$PERSISTENT_RG" >"$OUT"
+    if az resource list -g "$PERSISTENT_RG" &>"$OUT"; then
+        echo "[+] Persistent storage is already initialized"
+        popd >"$OUT"; return
+    fi
    terraform apply -var="location=$LOCATION" -auto-approve >"$OUT"
    popd >"$OUT"
+}
+
+init_persistent_resources () {
+    echo "[+] Initializing persistent resources"
+    pushd "$MYDIR/persistent/resources" >"$OUT"
+    run_terraform_init
+    for env in "release" "prod" "priv"; do
+        ws="$env${SHORTLOC}"
+        terraform workspace select -or-create "$ws" >"$OUT"
+        import_bincache_sigkey "$env-cache.vedenemo.dev~1"
+    done
+    popd >"$OUT"
+}
 
+init_workspace_persistent () {
    echo "[+] Initializing workspace-specific persistent"
-    # See: ./persistent/workspace-specific
    pushd "$MYDIR/persistent/workspace-specific" >"$OUT"
-    terraform init -upgrade -backend-config="resource_group_name=$STATE_RG" -backend-config="storage_account_name=$STATE_ACCOUNT" >"$OUT"
-    echo "[+] Applying possible changes in ./persistent/workspace-specific"
-    for ws in "dev${SHORTLOC}" "prod${SHORTLOC}" "$WORKSPACE"; do
-        terraform workspace select "$ws" &>"$OUT" || terraform workspace new "$ws" >"$OUT"
-        var_rg="persistent_resource_group=ghaf-infra-persistent-$SHORTLOC"
-        if ! terraform apply -var="$var_rg" -auto-approve &>"$OUT"; then
-            echo "[+] Workspace-specific persistent ($ws) is already initialized"
-        fi
-        # Stop terraform from tracking the state of 'workspace-specific'
-        # persistent resources. This script initially creates such resources
-        # (above), but tells terraform to not track them (below). This means,
-        # for instance, that removing such resources would need to happen
-        # manually through Azure web UI or az cli client. We assume the
-        # workspace-specific persistent resources really are persistent,
-        # meaning, it would be a rare occasion when they had to be
-        # (manually) removed.
-        #
-        # Why do we not track the state with terraform?
-        # If we let terraform track the state of such resources, we would
-        # end-up in a conflict when someone adds a new workspace-specific
-        # resource due the shared nature of 'prod' and 'dev' workspaces. In
-        # such a conflict condition, someone running this script with the
-        # old version of terraform code (i.e. version that does not include
-        # adding the new resource) would always remove the resource on
-        # `terraform apply`, whereas, someone running this script
-        # with the new workspace-specific resource would always add the new
-        # resource on apply, due to the shared 'dev' and 'prod' workspaces.
-        while read -r line; do
-            if [ -n "$line" ]; then
-                terraform state rm "$line" >"$OUT";
-            fi
-            # TODO: remove the 'binary_cache_caddy_state' filter from the below
-            # grep when all users have migrated to the version of this script
-            # on which the `terraform state rm` is included
-        done < <(terraform state list | grep -vP "(^data\.|binary_cache_caddy_state)")
-    done
+    run_terraform_init
+    terraform workspace select -or-create "$WORKSPACE" >"$OUT"
+    terraform apply -var="persistent_resource_group=$PERSISTENT_RG" -auto-approve >"$OUT"
    popd >"$OUT"
 }
 
-init_terraform () {
-    echo "[+] Running terraform init"
+init_workspace () {
+    echo "[+] Initializing workspace"
    pushd "$MYDIR" >"$OUT"
-    terraform init -upgrade -backend-config="resource_group_name=$STATE_RG" -backend-config="storage_account_name=$STATE_ACCOUNT" >"$OUT"
-    # By default, switch to the private workspace
-    activate_workspace
+    run_terraform_init
+    terraform workspace select -or-create "$WORKSPACE"
+    echo "[+] Listing workspaces:"
+    terraform workspace list
+    echo "[+] Use 'terraform workspace select <workspace>' to select a"\
+         "workspace, then 'terraform [validate|plan|apply]' to work with the"\
+         "given ghaf-infra environment"
    popd >"$OUT"
 }
 
@@ -207,32 +220,23 @@
 
 main () {
    argparse "$@"
-    if [ "$DEBUG" = "true" ]; then
-        set -x
-        OUT=/dev/stderr
-    else
-        OUT=/dev/null
-    fi
    exit_unless_command_exists az
-    exit_unless_command_exists terraform
-    exit_unless_command_exists nix-store
    exit_unless_command_exists grep
-
-    # Assigns $WORKSPACE variable
-    # shellcheck source=/dev/null
-    source "$MYDIR/terraform-playground.sh" &>"$OUT"
-    generate_azure_private_workspace_name
-
+    exit_unless_command_exists nix-store
+    exit_unless_command_exists terraform
    azure_location_to_shortloc
-    init_state_storage
-    init_persistent
-    init_terraform
-    echo "[+] Listing workspaces:"
-    terraform workspace list
-    echo "[+] Done, use 'terraform workspace select <workspace>' to select a"\
-         "workspace, then 'terraform [validate|plan|apply]' to work with the"\
-         "given ghaf-infra environment"
+    set_env
+    if [ -n "$OPT_s" ]; then
+        init_state_storage
+    fi
+    if [ -n "$OPT_p" ]; then
+        init_persistent_storage
+        init_persistent_resources
+    fi
+    if [ -n "$WORKSPACE" ]; then
+        init_workspace_persistent
+        init_workspace
+    fi
 }
 
 main "$@"
diff --git a/terraform/terraform-playground.sh b/terraform/terraform-playground.sh
deleted file mode 100755
index c40bc4e9..00000000
--- a/terraform/terraform-playground.sh
+++ /dev/null
@@ -1,136 +0,0 @@
-#!/usr/bin/env bash
-
-# SPDX-FileCopyrightText: 2022-2024 TII (SSRC) and the Ghaf contributors
-# SPDX-License-Identifier: Apache-2.0
-
-set -e # exit immediately if a command fails
-set -u # treat unset variables as an error and exit
-set -o pipefail # exit if any pipeline command fails
-
-################################################################################
-
-MYNAME=$(basename "$0")
-usage () {
-    echo "Usage: $MYNAME [activate|destroy|list]"
-    echo ""
-    echo "This script is a thin wrapper around terraform workspaces to enable private"
-    echo "development environment setup for testing Azure infra changes."
-    echo ""
-    echo "COMMANDS"
-    echo "  activate    Activate private infra development environment"
-    echo "  destroy     Destroy private infra development environment"
-    echo "  list        List current terraform workspaces"
-    echo ""
-    echo ""
-    echo "  EXAMPLE:"
-    echo "  ./$MYNAME activate"
-    echo ""
-    echo "  Activate and - unless already created - create a new terraform workspace"
-    echo "  to allow testing the infra setup in a private development environment."
-    echo ""
-    echo ""
-    echo "  EXAMPLE:"
-    echo "  ./$MYNAME destroy"
-    echo "  "
-    echo "  Deactivate and destroy the private development infra that was previously"
-    echo "  created with the 'activate' command. This command deletes all the infra"
-    echo "  resources."
-    echo ""
-}
-
-################################################################################
-
-exit_unless_command_exists () {
-    if ! command -v "$1" &> /dev/null; then
-        echo "Error: command '$1' is not installed" >&2
-        exit 1
-    fi
-}
-
-generate_azure_private_workspace_name () {
-    # Generate workspace name based on azure signed-in-user:
-    # - .userPrincipalName returns the signed-in azure username
-    # - cut removes everything up until the first '@'
-    # - sed keeps only letter and number characters
-    # - final cut keeps at most 16 characters
-    # - tr converts the string to lower case
-    # Thus, given a signed-in user 'foo.bar@baz.com', the workspace name
-    # becomes 'foobar'.
-    # Below command errors out with the azure error message if the azure user
-    # is not signed-in.
-    WORKSPACE=$(az ad signed-in-user show | jq -cr .userPrincipalName | cut -d'@' -f1 | sed 's/[^a-zA-Z0-9]//g' | cut -c 1-16 | tr '[:upper:]' '[:lower:]')
-    # Check WORKSPACE is non-empty and not 'default'
-    if [ -z "$WORKSPACE" ] || [ "$WORKSPACE" = "default" ]; then
-        echo "Error: invalid workspace name: '$WORKSPACE'"
-        exit 1
-    fi
-}
-
-activate_workspace () {
-    echo "[+] Activating workspace: '$WORKSPACE'"
-    if terraform workspace list | grep -qP "\s$WORKSPACE\$"; then
-        terraform workspace select "$WORKSPACE"
-    else
-        terraform workspace new "$WORKSPACE"
-        terraform workspace select "$WORKSPACE"
-    fi
-}
-
-destroy_workspace () {
-    if ! terraform workspace list | grep -qP "\s$WORKSPACE\$"; then
-        echo "[+] Devenv workspace '$WORKSPACE' does not exist, nothing to destroy"
-        exit 0
-    fi
-    echo "[+] Destroying workspace: '$WORKSPACE'"
-    terraform workspace select "$WORKSPACE"
-    terraform apply -destroy -auto-approve
-}
-
-################################################################################
-
-main () {
-    if [ $# -ne 1 ]; then
-        usage
-        exit 0
-    fi
-    if [ "$1" != "activate" ] && [ "$1" != "destroy" ] && [ "$1" != "list" ]; then
-        echo "Error: invalid command: '$1'"
-        usage
-        exit 1
-    fi
-
-    exit_unless_command_exists az
-    exit_unless_command_exists terraform
-    exit_unless_command_exists nix-store
-    exit_unless_command_exists jq
-    exit_unless_command_exists sed
-    exit_unless_command_exists cut
-    exit_unless_command_exists tr
-    exit_unless_command_exists grep
-
-    # Assigns $WORKSPACE variable
-    generate_azure_private_workspace_name
-
-    # It is safe to run terraform init multiple times
-    terraform init -upgrade
-
-    # Run the given command
-    if [ "$1" == "activate" ]; then
-        activate_workspace
-        echo "[+] Done, use terraform [validate|plan|apply] to work with your dev infra"
-    fi
-    if [ "$1" == "destroy" ]; then
-        destroy_workspace
-    fi
-    if [ "$1" == "list" ]; then
-        echo "Terraform workspaces:"
-        terraform workspace list
-    fi
-}
-
-# Do not execute main() if this script is being sourced
-if [ "${0}" = "${BASH_SOURCE[0]}" ]; then
-    main "$@"
-fi
-
-################################################################################
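Note on the new `-w` option: `terraform-init.sh` above validates the workspace name with the bash regex `^[a-z0-9]{3,16}$` before doing any terraform work. The standalone sketch below illustrates just that rule; the helper name `is_valid_workspace` is hypothetical and not part of the patch.

```shell
#!/usr/bin/env bash
# Sketch of the '-w' workspace-name rule enforced by terraform-init.sh:
# 3 to 16 lowercase letters or digits, i.e. [a-z0-9]{3,16}.
is_valid_workspace () {
    [[ "$1" =~ ^[a-z0-9]{3,16}$ ]]
}

is_valid_workspace "myghafinfra" && echo "myghafinfra: accepted"
is_valid_workspace "Dev" || echo "Dev: rejected (uppercase not allowed)"
is_valid_workspace "ab" || echo "ab: rejected (too short)"
```

The same bounds matter elsewhere in the patch: persistent resource names derived from `local.ws` are truncated to 16 characters so that prefixed Azure storage account and key vault names stay within their 24-character limits.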