# How Terraform (sort of) works
This document is written for provider developers on TPG. It's both incomplete and inaccurate. However, it's a useful model for explaining interactions between Terraform Core and the Terraform Plugin SDK. In particular, the Resource Instance Change Lifecycle document is a great summary of how the Terraform binary ("Core") understands interactions (most of the state terms here are drawn from there), but it doesn't map well to the SDK's provider framework that we use.

This model will freely cross between Core and the SDK: the goal is to map what users see to what developers write, not to explain the protocol, Core, or the SDK accurately.
Terraform users will generally use five commands to interact with Terraform: `terraform apply`, `terraform plan`, `terraform import`, `terraform refresh`, and `terraform destroy`. In addition, they'll have a statefile in their current directory (`terraform.tfstate`, generally) and 1+ config files like `main.tf`. Prior to any command, Terraform will perform a validation step where it calls `ValidateFunc`s directly on the raw values written in a user's config. If values are "unknown", such as values drawn from an interpolation, they're not validated.
Each of those 5 user-facing commands is roughly made up of some combination of two actions: "refresh" and "apply". A refresh is when Terraform finds the current state of a resource from the API and keeps track of it as the "prior state". It knows what resource and region to use based on the statefile. For most providers it only needs the `id`, which will contain a unique identifier; however, in TPG we tend to draw directly from fields like `project` and `name`. An apply is when Terraform performs the appropriate actions to bring the resource from its prior state to a new state, the "planned new state". The real state at the end of an apply is called the "new state".
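To make the refresh half concrete, here's a rough sketch of what a resource's Read function could look like in a TPG-style provider. The resource, field names, and API client are all invented for this example; real provider code is considerably more involved:

```go
package example

import (
	"fmt"

	"github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema"
)

// fakeInstance and fakeAPI stand in for a real API client; both are
// invented for this sketch.
type fakeInstance struct {
	Name, Project, MachineType string
}

type fakeAPI struct{}

func (c *fakeAPI) GetInstance(project, name string) (*fakeInstance, error) {
	// A real client would call the cloud API here.
	return &fakeInstance{Name: name, Project: project, MachineType: "n1-standard-1"}, nil
}

// resourceInstanceRead is the "refresh" for a single resource: it reads the
// identifying fields out of state, asks the API for the current object, and
// writes what it finds back into state as the prior state.
func resourceInstanceRead(d *schema.ResourceData, meta interface{}) error {
	client := meta.(*fakeAPI)

	// TPG-style: identify the object from fields like project and name
	// rather than parsing everything out of d.Id().
	project := d.Get("project").(string)
	name := d.Get("name").(string)

	instance, err := client.GetInstance(project, name)
	if err != nil {
		return fmt.Errorf("reading instance %s/%s: %w", project, name, err)
	}

	d.Set("machine_type", instance.MachineType)
	return nil
}
```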
We can model those 5 commands with apply and refresh like the following:
- `terraform refresh` performs a refresh, writing the prior state into the statefile, replacing the old contents
- `terraform import` writes the supplied id into the statefile and then performs a refresh, writing the prior state into the statefile
- `terraform plan` performs a refresh, and compares the prior state to the "proposed new state" to create the planned new state. It displays the difference between the prior state and the planned new state to the user
- `terraform apply` implicitly performs `terraform plan`. If the user approves the change, it performs an apply.
- `terraform destroy` is a convenient way to call `terraform apply` where the "proposed new state" is empty (indicating that the resource should be deleted)
It's clear how Terraform uses the statefile: it's roughly a serialized state. However, the user's config isn't consumed directly. Instead, Terraform uses it to build the proposed new state. To do so, Terraform copies config values directly into the proposed state and copies `Computed` values from the prior state (when not present there, they get the special value "unknown"). Any other values are assumed to have the corresponding zero value for their type (`""`, `0`, `false`, etc.). `Optional`+`Computed` values are treated like a `Computed` value when unset, and a normal value when set.
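As a sketch, those three behaviours map onto schema flags roughly like this (the field names are invented for this example):

```go
package example

import "github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema"

var exampleFields = map[string]*schema.Schema{
	// Optional: the config value is copied into the proposed new state;
	// when unset it becomes the zero value for the type ("" here).
	"description": {
		Type:     schema.TypeString,
		Optional: true,
	},
	// Computed (output-only): the value is copied from the prior state,
	// or "unknown" when the prior state doesn't have one yet.
	"self_link": {
		Type:     schema.TypeString,
		Computed: true,
	},
	// Optional+Computed: behaves like Computed when unset in config, and
	// like a normal Optional field when set.
	"region": {
		Type:     schema.TypeString,
		Optional: true,
		Computed: true,
	},
}
```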
It's also worth noting that any time Terraform creates a state, it will run `StateFunc`s on each field that has one, allowing the state to be modified. They let us modify values in the state, but they only have access to that single field's value. In practice we don't use them much.
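For illustration, a minimal `StateFunc` might just canonicalise the stored string; the field name here is invented:

```go
package example

import (
	"strings"

	"github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema"
)

// A StateFunc sees only the single field's value and returns the string
// that will actually be stored in state. This one canonicalises case.
var zoneSchema = &schema.Schema{
	Type:     schema.TypeString,
	Optional: true,
	StateFunc: func(v interface{}) string {
		return strings.ToLower(v.(string))
	},
}
```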
During Apply, Terraform assumes that the planned new state and new state are identical by default. The `ResourceData` `d` has a few different meanings depending on the CRUD method.

- In Create, `d.Get` draws from the planned new state and `d.Set` sets the new state
- In Delete, `d.Get` draws from the prior state
- In Update, `d.Get` draws from the planned new state, `d.GetChange` from (prior state, planned new state). `d.Set` sets the new state
Otherwise, Apply isn't very exciting: Terraform calls the appropriate provider methods.
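As a sketch of what that means in practice, here are skeletal Create and Update functions in the classic (non-context) SDK style, with invented field names and a faked API call:

```go
package example

import (
	"fmt"
	"log"

	"github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema"
)

// In Create, d.Get draws from the planned new state, while d.SetId and
// d.Set write into the new state. The field names and the "API call" are
// invented for this sketch.
func resourceInstanceCreate(d *schema.ResourceData, meta interface{}) error {
	project := d.Get("project").(string)
	name := d.Get("name").(string)

	// A real provider would call the API here; we fake the resulting ID.
	id := fmt.Sprintf("projects/%s/instances/%s", project, name)

	d.SetId(id)
	d.Set("self_link", "https://example.com/"+id)
	return nil
}

// In Update, d.GetChange exposes (prior state, planned new state) for a
// field, and d.HasChange reports whether they differ.
func resourceInstanceUpdate(d *schema.ResourceData, meta interface{}) error {
	if d.HasChange("machine_type") {
		oldType, newType := d.GetChange("machine_type")
		log.Printf("[DEBUG] machine_type changing from %v to %v", oldType, newType)
		// A real provider would issue the update API call here.
	}
	return nil
}
```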
During `terraform plan`, the planned new state is created and then diffed against the prior state to show a diff to the user. During an apply, the planned new state is the desired state for the resource to reach.

During `terraform plan`, the planned new state is created by:

- Filling in unset values in the proposed new state that have a `Default` with that value
- Running `CustomizeDiff`
- Running `DiffSuppressFunc`s (DSFs)
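Of those, `Default` is the simplest: it's just a flag on the field's schema. A minimal sketch, with an invented field:

```go
package example

import "github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema"

// When "port" is unset in config, the planned new state gets 80 filled in
// during plan; the field name and value are invented for this example.
var portWithDefault = &schema.Schema{
	Type:     schema.TypeInt,
	Optional: true,
	Default:  80,
}
```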
DSFs were added to the provider SDK before `CustomizeDiff`. They're very constrained, and can only see the value of the current field (or subfields, if that field is a block). They can return `true` to indicate that both values for the field are identical. If they are, Terraform discards the proposed new state's value and replaces it with the prior state's value.
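Here's a sketch of a DSF that treats a field as case-insensitive; the field name is invented, but the signature is the SDK's:

```go
package example

import (
	"strings"

	"github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema"
)

// A DiffSuppressFunc sees the prior state value ("old") and the proposed
// value ("new") for a single field. Returning true tells Terraform the two
// are equivalent, so the prior state's value is kept.
func caseInsensitiveDiffSuppress(k, old, new string, d *schema.ResourceData) bool {
	return strings.EqualFold(old, new)
}

var typeSchema = &schema.Schema{
	Type:             schema.TypeString,
	Optional:         true,
	DiffSuppressFunc: caseInsensitiveDiffSuppress,
}
```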
`CustomizeDiff` is much more flexible. The `ResourceDiff` `diff` is available, which is roughly a superset of `d`. In addition to `d`'s ability to read a whole resource state, `diff` can modify the planned new state, return errors, and clear the diffs on fields like a DSF. In theory, `ValidateFunc`s, `DiffSuppressFunc`s, and `Default` could all be implemented in terms of `CustomizeDiff`.
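Here's a sketch of a `CustomizeDiff` that does a cross-field validation and fills in a conditional default. The field names and the default value are invented, and the signature shown is the SDK v2 one (older SDK versions don't take a context):

```go
package example

import (
	"context"
	"fmt"

	"github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema"
)

// A CustomizeDiff can read (and modify) the whole planned new state.
func exampleCustomizeDiff(ctx context.Context, diff *schema.ResourceDiff, meta interface{}) error {
	// Read across fields, which a ValidateFunc can't do.
	if diff.Get("min_size").(int) > diff.Get("max_size").(int) {
		return fmt.Errorf("min_size must not be greater than max_size")
	}

	// Rewrite the planned new state for a field, here as a conditional default.
	if diff.Get("region").(string) == "" {
		if err := diff.SetNew("region", "us-central1"); err != nil {
			return err
		}
	}
	return nil
}
```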
As highlighted above, you've got a number of tools to modify a user's config and make it useful. To summarise them again, they're:

- `ValidateFunc`s allow you to reject configs based on a single field being invalid
- `Default` values fill in unset values during `terraform plan` diffs and `terraform apply`
- `StateFunc`s let you canonicalise values when states are created (but TPG doesn't use them much)
- `DiffSuppressFunc`s let you tell Terraform to keep the value from the prior state if it's identical to the one in config
- `Optional`+`Computed` fields tell Terraform to handle them as if they're output-only when unset, and configurable when set
Finally, `CustomizeDiff` is incredibly powerful, effectively allowing the provider to perform arbitrary transformations. It's somewhat dangerous to use, as it's very easy to make a transformation that Terraform Core will reject. Because of that, the other, more focused tools should be preferred. These are some cases where `CustomizeDiff` can solve otherwise unsolvable problems:
- Conditionally setting fields as `ForceNew` to indicate the resource should be recreated. For example, allowing disks to size up but not down (a sketch follows below).
- Adding custom error messages.
  - For example, App Engine applications can't be moved once created. Instead of an erroneous `ForceNew`, TPG returns an error if a user attempts to move one.
- Complex validations. For example, asserting that one value must not be greater than another, or that if one field is set, another must be.
- Adding conditional defaults based on the value of another field.
- Rewriting the planned new state for a value. For example, reordering a list to match the prior state when possible.
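As an example of the first case, here's a sketch of a `CustomizeDiff` that lets a disk grow in place but forces recreation when it shrinks. The field name is invented; the SDK also ships a `customdiff` helper package with functions like `ForceNewIfChange` for this pattern:

```go
package example

import (
	"context"

	"github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema"
)

// Conditional ForceNew: growing is an in-place update, shrinking is not.
func diskSizeCustomizeDiff(ctx context.Context, diff *schema.ResourceDiff, meta interface{}) error {
	oldRaw, newRaw := diff.GetChange("size_gb")
	oldSize, newSize := oldRaw.(int), newRaw.(int)

	// Shrinking isn't possible via the (hypothetical) API, so mark the
	// field ForceNew to turn the planned update into destroy-and-recreate.
	if newSize < oldSize {
		if err := diff.ForceNew("size_gb"); err != nil {
			return err
		}
	}
	return nil
}
```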
The split between Core and the SDK is still somewhat new, and we're in the middle of growing pains. Terraform 0.12 included a major overhaul of Core according to their holistic view of how providers and the SDK should work, even when that view went against SDK convention or was impossible to fulfill. For example, the Resource Instance Change Lifecycle page lists many assertions that values stay the same between states which providers today do not fulfill. Today they're all warnings, but Core's assertions are intended to become errors in the future.
One particularly amusing example is `Default` values. As implemented by the SDK, they cause the following error: `.port: planned value cty.NumberIntVal(80) does not match config value cty.NullVal(cty.Number)`.
Another is `Optional` + `Computed`. It was never intended to work quite the way it does, especially when used with `TypeList` and `TypeSet`. However, there's no viable alternative.

For TPG, the issue "Fit our state-setting model to the protocol" is the largest divide between the providers / SDK (and the model presented here) and Core.