README: Document changing deploy location
Signed-off-by: Henri Rosten <henri.rosten@unikie.com>
henrirosten committed Mar 5, 2024
1 parent df186ee commit 5b33746
Showing 1 changed file with 39 additions and 9 deletions.
48 changes: 39 additions & 9 deletions terraform/README.md
@@ -11,12 +11,12 @@ For architectural description, see [README-azure.md](./README-azure.md) original

> The setup uses Nix to build disk images, uploads them to Azure, and then boots
> virtual machines off of them.
>
> Images are considered "appliance images", meaning the Nix code describing their
> configuration describes the exact same purpose of the machine (no two-staged
> deployment process, the machine does the thing it's supposed to do after
> bootup), allowing to remove the need for e.g. ssh access as much as possible.
>
> Machines are considered ephemeral, every change in the appliance image / nixos
> configuration causes a new image to be built, and a new VM to be booted with
> that new image.
@@ -68,7 +68,7 @@ terraform
```
- The `terraform` directory contains the root terraform deployment files with the VM configurations `binary-cache.tf`, `builder.tf`, and `jenkins-controller.tf` matching the components described in [README-azure.md](./README-azure.md) in its [components section](./README-azure.md#components).
- The `terraform/azarm` directory contains the terraform configuration for Azure `aarch64` builder which is used from ghaf github-actions [build.yml workflow](https://github.com/tiiuae/ghaf/blob/e81ccfb41d75eda0488b6b4325aeccb8385ce960/.github/workflows/build.yml#L151) to build `aarch64` targets for authorized PRs pre-merge. `azarm` is disconnected from the root terraform module: it's a separate configuration with its own state.
- The `terraform/persistent` directory contains the terraform configuration for parts of the infrastructure that are considered persistent - resources defined under `terraform/persistent` will not be removed even if the ghaf-infra instance is otherwise removed. An example of such persistent ghaf-infra resource is the binary cache storage as well as the binary cache signing key. There may be many 'persistent' infrastructure instances - currently `dev` and `prod` deployments have their own instances of the persistent resources. Section [Multiple Environments with Terraform Workspaces](./README.md#multiple-environments-with-terraform-workspaces) discusses this topic with more details.
- The `terraform/playground` directory contains tooling to facilitate the usage of terraform workspaces in setting up distinct copies of the ghaf-infra infrastructure, i.e. 'playground' `dev` environments. It also includes an [example test infrastructure](./playground/test-infra.tf) that deploys an example infrastructure with just one nix VM, highlighting the use of `terraform/modules` to build and upload the nix image on Azure.
- The `terraform/state-storage` directory contains the terraform configuration for the ghaf-infra remote backend state storage using Azure storage blob. See section [Initializing Azure State and Persistent Data](./README.md#initializing-azure-state-and-persistent-data) for more details.
- The `terraform/modules` directory contains terraform modules used from the ghaf-infra VM configurations to build, upload, and spin up Azure nix images.
@@ -80,7 +80,7 @@ To initialize the backend storage, use the `terraform-init.sh`:

```bash
# Inside the terraform directory
$ ./terraform-init.sh
[+] Initializing state storage
[+] Initializing persistent data
...
@@ -98,10 +98,10 @@ The main reasons for using terraform workspaces include:
- Parts of the ghaf-infra infrastructure are persistent and shared between different environments. As an example, private `dev` environments share the binary cache storage. This arrangement makes it possible to treat, for instance, `dev` and private ghaf-infra instances as dispensable: ghaf-infra instances can be temporary and short-lived, as it's easy to spin up new environments without losing any valuable data. The persistent data is configured outside the root ghaf-infra terraform deployment in the `terraform/persistent` directory. There may be many 'persistent' infrastructure instances - currently `dev` and `prod` deployments have their own instances of the persistent resources. This means that `dev` and `prod` instances of ghaf-infra do **not** share any persistent data. As an example, `dev` and `prod` deployments of ghaf-infra have separate binary cache storage. The binding to persistent resources from ghaf-infra is done in [`main.tf`](./main.tf#L166) based on the terraform workspace name and resource location. Persistent data initialization is done automatically by the `terraform-init.sh` script.
- Currently, the following resources are defined as 'persistent', meaning `dev` and `prod` instances do not share them:
- Binary cache storage: [`binary-cache-storage.tf`](./persistent/binary-cache-storage/binary-cache-storage.tf)
- Binary cache signing key: [`binary-cache-sigkey.tf`](./persistent/binary-cache-sigkey/binary-cache-sigkey.tf)
- Builder ssh key: [`builder-ssh-key.tf`](./persistent/builder-ssh-key/builder-ssh-key.tf)

To help facilitate the usage of terraform workspaces in setting up distinct copies of ghaf-infra, one can [use terraform workspaces from the command line](https://developer.hashicorp.com/terraform/cli/workspaces#managing-cli-workspaces) or consider using the helper script provided at [`playground/terraform-playground.sh`](./playground/terraform-playground.sh). Below, for the sake of example, we use [`playground/terraform-playground.sh`](./playground/terraform-playground.sh) to set up a private deployment instance of ghaf-infra:

```bash
# Activate private development environment
@@ -159,6 +159,36 @@ $ ./playground/terraform-playground.sh destroy
```
The above command removes all the resources that were created for the private development environment.

## Changing Azure Deploy Location
By default, ghaf-infra is deployed to Azure location `northeurope` (North Europe).
However, ghaf-infra resources can be deployed to other Azure locations too, with the following caveats:
- Ghaf-infra has been tested in a limited set of locations. `terraform-init.sh` exits with an error if you try to initialize ghaf-infra in an unsupported (untested) location. When deploying to a new, previously unsupported location, you need to modify [`terraform-init.sh`](./terraform-init.sh#L89).
- For a full list of available Azure location names, run `az account list-locations -o table` in the ghaf-infra devshell.
- Not all Azure VM sizes or other resources are available in all Azure locations. You can check the availability of specific resources on the Azure region product page, e.g.: https://azure.microsoft.com/en-us/explore/global-infrastructure/products-by-region/?regions=europe-north&products=virtual-machines. Alternatively, you can list the VM sizes per location with the `az vm list-sizes` command from the ghaf-infra devshell, for instance: `az vm list-sizes --location 'northeurope' -o table`.
- Your Azure subscription quota limits affect the ability to deploy ghaf-infra, so you might need to increase the vCPU quotas for your subscription via the Azure web portal. See more information at https://learn.microsoft.com/en-us/azure/quotas/quotas-overview. You can check your quota usage from the Azure web portal or using `az vm list-usage`, for instance: `az vm list-usage --location "northeurope" -o table`.
- Currently, the terraform state is always deployed to the default Azure location, even if the other resources of your ghaf-infra instance (i.e. [persistent data](./persistent/main.tf), [root module resources](main.tf)) are deployed to a different location.
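
The VM size availability caveat above can be checked from a script before deploying. The following is a minimal sketch, not part of ghaf-infra: the function name `vm_size_available` is hypothetical, and the VM size in the usage comment is only an example, so substitute the sizes your configuration actually requests. It relies on the `az vm list-sizes` command mentioned above together with the standard `--query` and `-o tsv` output options of the Azure CLI.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of ghaf-infra): succeed if the given VM size
# is offered in the given Azure location.
vm_size_available() {
  local location="$1" size="$2"
  # Print the available size names one per line, then look for an exact,
  # whole-line match of the requested size.
  az vm list-sizes --location "$location" --query "[].name" -o tsv \
    | grep -qxF -- "$size"
}

# Usage (example size name; commented out so sourcing this file calls nothing):
#   vm_size_available uaenorth Standard_D2s_v3 || echo "size not available" >&2
```

Running a check like this before `terraform apply` surfaces an unavailable size early, rather than partway through a deployment.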

The following shows an example of deploying ghaf-infra to Azure location UAE North:

```bash
# Initialize terraform state and persistent data, using UAE North as an example location:
$ ./terraform-init.sh -l uaenorth

# Switch to (and optionally create) a workspace 'devuaen'
$ terraform workspace new devuaen || terraform workspace select devuaen

# Optionally, run Terraform plan:
$ terraform plan -var="location=uaenorth"

# Deploy with Terraform apply, specifying the location with a variable:
$ terraform apply -var="location=uaenorth" -auto-approve
```
When destroying a ghaf-infra instance deployed to a non-default location, you need to specify the location with a variable:
```bash
# Destroy:
$ terraform apply -destroy -var="location=uaenorth" -auto-approve
```
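
Because the same location has to be passed to `plan`, `apply`, and `destroy`, it can help to build the `-var` argument in one place. The sketch below is one way to do that; the function name `location_var` is hypothetical, and the check that location names contain only lowercase letters and digits is an assumption based on names such as `northeurope` and `uaenorth`, not something ghaf-infra enforces.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of ghaf-infra): print the terraform -var
# argument for a location, defaulting to the documented default northeurope.
location_var() {
  local location="${1:-northeurope}"
  # Assumption: Azure location names use only lowercase letters and digits.
  case "$location" in
    *[!a-z0-9]*)
      echo "invalid location name: $location" >&2
      return 1
      ;;
  esac
  printf -- '-var=location=%s\n' "$location"
}

# Usage (commented out so sourcing this file runs no terraform commands):
#   terraform plan "$(location_var uaenorth)"
#   terraform apply "$(location_var uaenorth)" -auto-approve
#   terraform apply -destroy "$(location_var uaenorth)" -auto-approve
```

After the shell removes the quotes, `-var=location=uaenorth` is the same argument as `-var="location=uaenorth"` in the commands above.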
## Common Terraform Errors
Below are some common Terraform errors with tips on how to resolve each.
@@ -186,12 +216,12 @@ $ terraform apply
│ Error: creating/updating Image (Subscription: "<SUBID>"
│ Resource Group Name: "ghaf-infra-dev"
│ Image Name: "<NAME>"): performing CreateOrUpdate: unexpected status 400 with error: InvalidParameter: The source blob https://<INSTANCE>.blob.core.windows.net/ghaf-infra-vm-images/<IMANE>.vhd is not accessible.
│ with module.builder_image.azurerm_image.default,
│ on modules/azurerm-nix-vm-image/main.tf line 22, in resource "azurerm_image" "default":
│ 22: resource "azurerm_image" "default" {
```
Try running `terraform apply` again if you get an error similar to one shown above.
It's unclear why this error occasionally occurs; the issue should be analyzed in detail.
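
If the error shows up often, the re-run can be automated with a bounded retry. This is a generic sketch rather than anything ghaf-infra provides: the `retry` function, the attempt count, and the delay are all illustrative, and it retries on any failure, not only on the error shown above.

```bash
#!/usr/bin/env bash
# Hypothetical helper: run a command up to <attempts> times, sleeping
# <delay> seconds between failed attempts.
retry() {
  local attempts="$1" delay="$2"
  shift 2
  local n=1
  until "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      echo "giving up after $n attempts: $*" >&2
      return 1
    fi
    echo "attempt $n failed; retrying in ${delay}s" >&2
    sleep "$delay"
    n=$((n + 1))
  done
}

# Usage (illustrative values; commented out so sourcing this file runs nothing):
#   retry 3 30 terraform apply -auto-approve
```

Keeping the number of attempts small avoids masking a persistent failure that needs a real fix.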
#### Error: Disk
@@ -201,7 +231,7 @@ $ terraform apply
│ Error: Disk (Subscription: "<SUBID>"
│ Resource Group Name: "ghaf-infra-persistent-eun"
│ Disk Name: "binary-cache-vm-caddy-state-dev") was not found
│ with data.azurerm_managed_disk.binary_cache_caddy_state,
│ on main.tf line 207, in data "azurerm_managed_disk" "binary_cache_caddy_state":
│ 207: data "azurerm_managed_disk" "binary_cache_caddy_state" {