Terraform Fundamentals
Table of contents
- Configuration Management VS Infrastructure Orchestration
- What is Terraform?
- Authentication VS Authorization
- Resources and Providers
- Terraform State File
- Terraform refresh
- Cross Resource Attribute Reference
- Output Values
- Terraform Variables
- Count
- for_each
- each object
- Conditional Expression
- Local Values
- Terraform Functions
- Data Sources
- Debugging Terraform
- Terraform Format
- Terraform Validate
- Load Order and Semantics
- Dynamic Block
- Iterators
- Terraform Taint/Replace
- Splat Expression
- Terraform Graph
- Saving Terraform Plan to a File
- Terraform Output Cmd
- Terraform Settings (terraform block)
- Handling Resource Refresh in Terraform
- Zipmap Terraform Function
- Comments in Terraform
- Terraform Provisioners
- Terraform Module
- Terraform Registry
- Terraform Workspace
- Terraform Module Sources
- Terraform Backend
- Sensitive Parameter
- HashiCorp Vault
- Terraform Cloud
- Air Gap
Configuration Management VS Infrastructure Orchestration
Ansible, Chef, and Puppet are configuration management tools which means that they are designed to install and manage software on existing servers.
Terraform and CloudFormation are infrastructure orchestration tools, which means that they can provision the servers and infrastructure by themselves.
Configuration management tools can do some degree of infrastructure provisioning, but the focus here is that some tools are going to be a better fit for certain types of tasks.
What is Terraform?
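Terraform is a free Infrastructure as Code (IaC) software tool that allows you to build, change, and version infrastructure safely and efficiently. It was open source under the MPL 2.0 license up to v1.5; later releases are source-available under the Business Source License.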
Terraform supports multiple platforms and has hundreds of providers.
Terraform enables developers to use a high-level configuration language called HCL (HashiCorp Configuration Language) to describe the desired end-state cloud or on-premises infrastructure for running an application.
It can be easily integrated with configuration management tools like Ansible.
Authentication VS Authorization
Authentication is the process of verifying who the user is.
Authorization is the process of verifying what they have access to.
Terraform needs access credentials with relevant permissions to create and manage the environment.
The type of access credentials depends on the provider:
AWS - Access Key and Secret Key
GitHub - Tokens
Kubernetes - Kubeconfig file, Credentials config
DigitalOcean - Tokens
Resources and Providers
Resources
A resource block describes one or more infrastructure objects managed through a provider.
A resource block declares a resource of a given type (aws_instance) with a given local name (myec2).
Resource Type (aws_instance) and Name (myec2) together serve as an identifier for a given resource and so must be unique.
Ex:
resource "aws_instance" "myec2" {
  ami           = "xxxxxxxxxx"
  instance_type = "t2.micro"
}
Providers
A provider is a plugin that lets Terraform manage an external API.
When we run terraform init, plugins required for the provider are automatically downloaded and saved locally in a .terraform directory.
Ex:
provider "aws" {}
provider "azurerm" {
  features {}
}
Provider Tiers
There are 3 primary types of provider tiers in Terraform.
Official - Owned and maintained by HashiCorp.
Partner - Owned and maintained by a Technology Company that maintains a direct partnership with HashiCorp.
Community - Owned and maintained by individual contributors.
Provider Namespace
Namespaces are used to help users identify the organization or publisher responsible for the integration.
Official - hashicorp.
Partner - Third-party organization, e.g. mongodb/mongodbatlas.
Community - Maintainer's individual or organization account, e.g. DeviaVir/gsuite.
Provider Versioning
Provider plugins are released separately from Terraform itself, i.e. we have separate Terraform versions and provider versions.
If no version constraint is specified, terraform init downloads the most recent version of the provider.
For production use, explicitly pin the provider version so that your code does not break when a newer release removes or changes features you rely on.
There are multiple ways of specifying the version of a provider.
">= 1.0" - Greater than or equal to the specified version
"<= 1.0" - Less than or equal to the specified version
"~> 1.0" - Any version in the 1.X range
">= 1.10, <= 1.30" - Any version between 1.10 and 1.30
Terraform Provider - Resources in Multiple Regions
If we need to provision resources in more than one region, we have to define an additional provider block that uses the alias argument.
alias gives that provider configuration a second name, which a resource can then select through its provider argument in order to be created in the aliased region.
alias = "mumbai"
----------
provider = aws.mumbai => aws is the provider and mumbai is the alias
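The two fragments above can be put together as follows. This is a minimal sketch: the regions and the resource names (default_region_ip, mumbai_ip) are illustrative assumptions, not part of the original notes.

```hcl
# Default provider configuration (the region is an illustrative choice).
provider "aws" {
  region = "us-east-1"
}

# Second configuration of the same provider, selected through its alias.
provider "aws" {
  alias  = "mumbai"
  region = "ap-south-1"
}

# Uses the default configuration, so it is created in us-east-1.
resource "aws_eip" "default_region_ip" {}

# Selects the aliased configuration, so it is created in ap-south-1.
resource "aws_eip" "mumbai_ip" {
  provider = aws.mumbai
}
```

A resource without a provider argument always uses the default (un-aliased) configuration.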
Terraform Provider - Resources in Multiple Accounts/Profiles
- If we need to provision resources in more than one account/profile, we can combine alias with the profile argument in the provider block (for AWS, profile selects a named profile from the shared credentials file).
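A short sketch of the multi-account case, assuming the AWS provider and a hypothetical profile called account02 in the local shared credentials file; the alias and resource names are also placeholders.

```hcl
# Provider configuration bound to a second account through a named profile.
# "account02" is a hypothetical profile name.
provider "aws" {
  alias   = "account_two"
  region  = "us-east-1"
  profile = "account02"
}

# Created in the account that the "account02" profile authenticates against.
resource "aws_eip" "account_two_ip" {
  provider = aws.account_two
}
```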
Dependency Lock File (.terraform.lock.hcl)
Terraform dependency lock file allows us to lock to a specific version of a provider (created automatically on terraform init).
If a particular provider already has a selection recorded in the lock file, Terraform will always re-select that version for installation, even if a newer version has become available.
You can override that behaviour by adding the -upgrade option when you run terraform init.
Note
Terraform requires explicit source information for any provider that is not maintained by HashiCorp, using the required_providers block nested inside the terraform configuration block.
We can use the required_providers and terraform blocks for HashiCorp-maintained providers as well, but it is optional and is mainly used when we want to pin a specific version of that provider.
HashiCorp Maintained:
provider "aws" {
  region     = "us-east-1"
  access_key = "<your-access-key>"
  secret_key = "<your-secret-key>"
}
Not HashiCorp Maintained:
terraform {
  required_providers {
    digitalocean = {
      source  = "digitalocean/digitalocean"
      version = "1.0.0"
    }
  }
}
provider "digitalocean" {
  token = "<your-token>"
}
Terraform State File
Terraform State File (terraform.tfstate) stores the state of the infrastructure that is being created from the TF files.
This state allows Terraform to map real-world resources to your existing configuration.
Terraform state file contains the information associated with the resources that are currently live.
Desired State and Current State
Terraform's primary function is to create, modify, and destroy infrastructure resources to match the desired state described in a Terraform configuration.
The current state is the actual state of a resource that is currently deployed.
The current state does not always match the desired state, because someone can modify the infrastructure manually.
Terraform tries to ensure that the deployed infrastructure is based on the desired state.
If there is a difference between the two, terraform plan presents a description of the changes necessary to achieve the desired state.
Note
Terraform only tracks what is described in the configuration, i.e. parameters or options that are not provided in the configuration are not considered part of the desired state.
That is why it is generally recommended, when creating a resource, not to specify just the minimum options but to include all the options that are important to you in your Terraform configuration.
Terraform refresh
Terraform creates the infrastructure based on the configuration you specified, but the infrastructure may get modified manually.
The terraform refresh command checks the latest state of your infrastructure and updates the state file accordingly.
You do not explicitly need to use this command, because Terraform automatically refreshes the state as the first step of both terraform plan and terraform apply.
You should avoid running terraform refresh manually, because it writes to the state file without review and, in the worst case (for example with a misconfigured provider), can remove your resources from the TF state file.
The terraform refresh command is deprecated in newer versions of Terraform. The -refresh-only option for terraform plan and terraform apply was introduced in Terraform v0.15.4 as its replacement.
Cross Resource Attribute Reference
A single Terraform file often defines multiple resources.
In that case, resource 2 might depend on some value of resource 1.
Terraform allows us to reference the attribute of one resource to be used in a different resource.
Ex:
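resource "aws_security_group" "allows_tls" {
  # ... other arguments omitted ...
  ingress {
    # ... ports and protocol omitted ...
    cidr_blocks = ["${aws_eip.myeip.public_ip}/32"]
  }
}
This tells Terraform that the aws_security_group resource depends on aws_eip and must be created after aws_eip. Note that cidr_blocks expects CIDR notation, hence the /32 suffix on the IP address.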
Output Values
Output values make information about your infrastructure available on the command line, and can expose information for other Terraform configurations to use.
Output values defined in Project A can be referenced in code from Project B as well (through the terraform_remote_state data source).
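As a minimal sketch of an output value, assuming an Elastic IP resource with the hypothetical local name lb:

```hcl
# Resource whose attribute we want to expose (the name "lb" is illustrative).
resource "aws_eip" "lb" {}

# Printed after terraform apply and stored in the state file, where other
# configurations can read it.
output "lb_public_ip" {
  description = "Public IP address of the Elastic IP"
  value       = aws_eip.lb.public_ip
}
```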
Terraform Variables
- Variables give us a central source from which values can be imported, instead of repeating the same static values throughout the configuration.
Approaches for Variable Assignment
Variables in Terraform can be assigned values in multiple ways.
Environment variables
Terraform searches the environment of its process for environment variables named TF_VAR_ followed by the name of a declared variable.
Ex: setx TF_VAR_<variable_name> <variable_value> (Windows)
export TF_VAR_<variable_name>=<variable_value> (Linux/MacOS)
From a file (.tfvars)
Create a file with the name terraform.tfvars and assign values to all the declared variables in that file itself.
If you create a file with a different name, let's say custom.tfvars, then you will explicitly need to pass that file as an option in terraform commands.
Ex: terraform plan -var-file="custom.tfvars"
Default variables values
Declare default values while creating a variable.
Ex:
variable "xyz" {
  default = "test"
}
Command line flags
- Values can be passed with the -var flag, e.g. terraform plan -var="xyz=test".
- If a declared variable receives no value from any of the above options, Terraform prompts for it interactively on the command line when you run terraform commands.
Data Types of Variables
The type argument in a variable block allows you to restrict the type of value that will be accepted as the value for a variable.
If no type constraint is set then a value of any type is accepted.
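variable "xyz" {
  type = string
}
Accepted data types include:
string
number
bool
list (a sequence of values, similar to an array)
map (a collection of values identified by string keys)
set, object, and tuple are supported as well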
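The type constraints listed above could be used as in the following sketch. All variable names, defaults, and the resource name are illustrative assumptions.

```hcl
variable "instance_type" {
  type    = string
  default = "t2.micro"
}

variable "availability_zones" {
  type    = list(string)
  default = ["us-east-1a", "us-east-1b"]
}

variable "instance_tags" {
  type = map(string)
  default = {
    Team = "platform"
  }
}

resource "aws_instance" "typed_example" {
  ami               = "xxxxxxxxxx"
  instance_type     = var.instance_type
  availability_zone = var.availability_zones[0] # first element of the list
  tags              = var.instance_tags
}
```

If a value of the wrong type is supplied, for example a list for instance_type, Terraform rejects it before planning.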
Count
Count Parameter
The count parameter on resources can simplify configurations and let you scale resources by simply incrementing a number.
Suppose we have a use case where we need to provision n ec2 instances. One way to solve this is to write n ec2 resource blocks one after another, but that is not an optimal solution.
Another way to solve this is by using the count parameter.
With the count parameter, we can simply specify the count value and the resource can be scaled accordingly.
Ex:
resource "aws_instance" "instance" {
  count         = 10
  ami           = "xxxxxxxxxxxxxx"
  instance_type = "t2.micro"
}
Count Index
In the resource block where count is set, an additional count object is available in expressions, so you can modify the configuration of each instance.
This object has one attribute: count.index.
count.index allows us to fetch the index of each iteration in the loop.
Ex:
resource "aws_instance" "instance" {
  count         = 10
  ami           = "xxxxxxxxxxxxxx"
  instance_type = "t2.micro"
  tags = {
    Name = "instancetype-${count.index}" # instancetype-0, instancetype-1, etc.
  }
}
Problems with Count Parameter
Suppose that we have a list of names for ec2 instances stored in a variable: [name1, name2, name3].
If you add a new one at the start of the list, every existing element shifts by one position: [name0, name1, name2, name3].
Because count identifies instances by index, Terraform does not see name0 as a new instance. It sees the instance at index 0 changing its name from name1 to name0, and so on, which results in an update operation on all the existing ec2 instances plus one new instance at the end.
To tackle this, Terraform provides another option, i.e. for_each.
for_each
for_each takes a map or a set of strings and creates one instance per element, identifying each instance by its map key or set member instead of a numeric index.
each object
In blocks where for_each is set, an additional each object is available.
The each object has 2 attributes.
each.key => The map key (or set member) corresponding to the instance.
each.value => The map value corresponding to the instance.
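A minimal sketch of for_each with the each object, assuming a hypothetical map variable that pairs instance names with instance types:

```hcl
# Map of instance name => instance type (both are illustrative values).
variable "instance_types_by_name" {
  type = map(string)
  default = {
    web    = "t2.micro"
    worker = "t2.small"
  }
}

resource "aws_instance" "named" {
  for_each = var.instance_types_by_name

  ami           = "xxxxxxxxxx"
  instance_type = each.value # "t2.micro" or "t2.small"

  tags = {
    Name = each.key # "web" or "worker"
  }
}
```

The instances are addressed as aws_instance.named["web"] and aws_instance.named["worker"], so adding another key to the map does not disturb the existing ones.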
Conditional Expression
A conditional expression uses the value of a bool expression to select one of two values.
Syntax: condition ? true_value : false_value
A common use is with count, to create a resource only when a condition holds:
count = condition ? 1 : 0
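The following sketch applies a conditional expression both to count and to an ordinary argument. The variable names and instance types are assumptions chosen for illustration.

```hcl
variable "create_instance" {
  type    = bool
  default = true
}

variable "environment" {
  type    = string
  default = "dev"
}

resource "aws_instance" "conditional" {
  # One instance when create_instance is true, none otherwise.
  count = var.create_instance ? 1 : 0

  ami = "xxxxxxxxxx"

  # Larger instance type only for the "prod" environment.
  instance_type = var.environment == "prod" ? "t2.large" : "t2.micro"
}
```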
Local Values
A local value assigns a name to an expression, allowing it to be used multiple times within a module without repeating it.
Local values can be used for many different use cases, such as holding the result of a conditional expression or a function call.
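As a small sketch of local values, assuming a hypothetical environment variable and tag set:

```hcl
variable "environment" {
  type    = string
  default = "dev"
}

locals {
  # Defined once, reused by every resource below.
  common_tags = {
    Environment = var.environment
    ManagedBy   = "terraform"
  }

  # A local can also hold the result of a function call.
  name_prefix = lower("App-${var.environment}")
}

resource "aws_eip" "tagged" {
  tags = local.common_tags
}

resource "aws_instance" "tagged" {
  ami           = "xxxxxxxxxx"
  instance_type = "t2.micro"
  tags          = merge(local.common_tags, { Name = "${local.name_prefix}-web" })
}
```

Locals are declared in a locals block but referenced with the singular local. prefix.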
Terraform Functions
Terraform language includes several built-in functions that you can use to transform and combine values.
Syntax: function_name(arg1, arg2, ...)
Ex: max(5,34,7)
The Terraform language does not support user-defined functions; it only supports built-in functions.
There are various categories into which multiple functions are divided:
numeric
string
collection
encoding
file system
date and time
hash and crypto
IP network
type conversion
Data Sources
Data sources allow data to be fetched or computed for use elsewhere in Terraform configuration.
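A minimal sketch of a data source, assuming we want the most recent Amazon Linux 2 AMI instead of a hard-coded AMI id. The name filter pattern is an illustrative assumption and may need adjusting for your region or image family.

```hcl
# Look up an AMI at plan time instead of hard-coding its id.
data "aws_ami" "amazon_linux" {
  most_recent = true
  owners      = ["amazon"]

  filter {
    name   = "name"
    values = ["amzn2-ami-hvm-*-x86_64-gp2"] # assumed naming pattern
  }
}

resource "aws_instance" "from_data_source" {
  ami           = data.aws_ami.amazon_linux.id
  instance_type = "t2.micro"
}
```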
Debugging Terraform
Terraform has detailed logs which can be enabled by setting the TF_LOG environment variable to any value.
You can set TF_LOG to one of the log levels TRACE, DEBUG, INFO, WARN, or ERROR to change the verbosity of the logs.
TRACE is the most verbose and it is the default if TF_LOG is set to something other than a log-level name.
To persist log output, you can set TF_LOG_PATH to force the log to always be appended to a specific file when logging is enabled.
export TF_LOG_PATH=/tmp/terraform.log
Terraform Format
The terraform fmt command is used to rewrite Terraform configuration files to take care of the overall formatting.
Terraform Validate
terraform validate primarily checks whether a configuration is syntactically valid.
It can check various aspects including unsupported arguments, undeclared variables, etc.
terraform plan also validates the configuration behind the scenes.
Load Order and Semantics
Terraform generally loads all the configuration files within the directory in alphabetical order.
The files loaded must end in either .tf or .tf.json to specify the format that is in use.
Dynamic Block
A dynamic block allows us to dynamically construct repeatable nested blocks; it is supported inside resource, data, provider, and provisioner blocks.
Iterators
The iterator argument (optional) sets the name of a temporary variable that represents the current element of the complex value.
If omitted, the name of the variable defaults to the label of the dynamic block.
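A sketch of a dynamic block with a custom iterator, assuming a hypothetical list of ports that should each become one ingress rule:

```hcl
variable "ingress_ports" {
  type    = list(number)
  default = [443, 8080]
}

resource "aws_security_group" "dynamic_sg" {
  name = "dynamic-sg"

  # Generates one ingress block per element of var.ingress_ports.
  dynamic "ingress" {
    for_each = var.ingress_ports
    iterator = port # without this line the variable would be named "ingress"

    content {
      from_port   = port.value
      to_port     = port.value
      protocol    = "tcp"
      cidr_blocks = ["0.0.0.0/0"]
    }
  }
}
```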
Terraform Taint/Replace
The -replace option with terraform apply forces Terraform to replace an object even though there are no configuration changes that would require it.
A similar kind of functionality was achieved using the terraform taint command in older versions of Terraform.
For Terraform v0.15.2 and later, HashiCorp recommends using the -replace option with terraform apply.
Splat Expression
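A splat expression ([*]) returns a list containing the given attribute from every element of a list, for example from every instance of a resource created with count.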
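A minimal sketch of a splat expression, assuming three IAM users created with count (the resource and output names are illustrative):

```hcl
resource "aws_iam_user" "lb" {
  count = 3
  name  = "loadbalancer-${count.index}"
}

# [*] collects the arn attribute of all three users into one list.
output "user_arns" {
  value = aws_iam_user.lb[*].arn
}
```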
Terraform Graph
The terraform graph command allows one to generate a visual representation of either a configuration or execution plan.
The output of this is in the DOT format, which can easily be converted to an image.
Ex:
terraform graph > graph.dot
We need a tool to convert the DOT file into an image. Graphviz is one such tool; it lets us quickly perform the visualization as well as the conversion.
Saving Terraform Plan to a File
The generated terraform plan can be saved to a specific path.
This plan can then be used with terraform apply to be certain that only the changes shown in this plan are applied.
Ex:
terraform plan -out=<path>
The file in which terraform plan stores the plan is a binary file.
Terraform Output Cmd
The terraform output command is used to extract the value of an output variable from the state file.
Terraform Settings (terraform block)
The special terraform configuration block type is used to configure some behaviours of Terraform itself, such as requiring a minimum Terraform version to apply the configuration.
Terraform settings are gathered together into terraform blocks.
Setting 1 - Terraform Version
The required_version setting accepts a version constraint string, which specifies which versions of Terraform can be used with your configuration.
If the running version of Terraform doesn't match the constraints specified, Terraform will produce an error and exit.
Setting 2 - Provider Version
The required_providers block specifies all of the providers required by the current module, mapping each local provider name to a source address and a version constraint.
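Both settings can live in the same terraform block. The version constraints in this sketch are illustrative assumptions, not recommendations.

```hcl
terraform {
  # Setting 1: versions of Terraform itself that may run this configuration.
  required_version = ">= 1.0"

  # Setting 2: local provider name => source address and version constraint.
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}
```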
Handling Resource Refresh in Terraform
Suppose we have a Terraform project with hundreds of resources within a single .tf file.
When we run an operation such as terraform plan, Terraform refreshes every resource to check whether the current state matches the desired state.
On a large configuration this takes considerable time, so to tackle this we have 3 options.
First, separate the project into sub-projects that contain only related resources, e.g. 1 folder for ec2, 1 for rds, etc.
The second is to use the -refresh=false flag, which skips the refresh step.
The third is to specify the target directly with terraform plan -target=<resource>. It is generally used as a means to operate on isolated portions of very large configurations.
Ex:
terraform plan -refresh=false -target=aws_instance.myec2
Zipmap Terraform Function
The zipmap function constructs a map from a list of keys and a corresponding list of values.
Syntax: zipmap(keylist, valuelist)
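A small sketch of zipmap inside a locals block; the keys and values are arbitrary illustrative data.

```hcl
locals {
  user_names = ["alice", "bob"]
  user_ids   = [101, 102]

  # Pairs the lists element by element: { alice = 101, bob = 102 }
  ids_by_name = zipmap(local.user_names, local.user_ids)
}

output "ids_by_name" {
  value = local.ids_by_name
}
```

Both lists must have the same length, otherwise Terraform reports an error.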
Comments in Terraform
Terraform supports 3 different syntaxes for comments.
# => single line
// => single line (same as #)
/* and */ => multiline
Terraform Provisioners
- Provisioners are used to execute scripts on a local or remote machine as a part of resource creation or destruction.
Types of Provisioners
- There are multiple provisioners available in Terraform but some of the most common are:
local-exec
local-exec provisioners allow us to invoke local executables after the resource is created.
One of the most used approaches of local-exec is to run ansible-playbooks on the created server after the resource is created.
remote-exec
remote-exec provisioners allow us to invoke scripts directly on the remote server.
Creation-Time Provisioners
Creation-Time Provisioners are only run during creation, not during updating or any other lifecycle.
If a creation-time provisioner fails, the resource is marked as tainted.
Destroy-Time Provisioners
Destroy-Time Provisioners run before the resource is destroyed.
If when = destroy is specified, the provisioner will run when the resource it is defined within is destroyed.
Provisioner Failure Behaviour
By default, a provisioner that fails also causes the terraform apply itself to fail.
The on_failure setting can be used to change this. The allowed values are:
continue - Ignore the error and continue with creation or destruction.
fail - Raise an error and stop applying (the default behaviour). If this is a creation provisioner, taint the resource.
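The provisioner types and settings described above can be sketched in one resource. The commands, the ec2-user login, and the key file path are hypothetical placeholders, not values from the original notes.

```hcl
resource "aws_instance" "provisioned" {
  ami           = "xxxxxxxxxx"
  instance_type = "t2.micro"

  # Creation-time, runs on the machine that executes Terraform.
  provisioner "local-exec" {
    command = "echo ${self.private_ip} >> private_ips.txt"
  }

  # Creation-time, runs on the new server over SSH.
  provisioner "remote-exec" {
    inline = ["sudo yum -y install nginx"]

    connection {
      type        = "ssh"
      user        = "ec2-user"               # assumed login user
      private_key = file("./example-key.pem") # hypothetical key file
      host        = self.public_ip
    }
  }

  # Destroy-time, and a failure here does not stop the destroy.
  provisioner "local-exec" {
    when       = destroy
    on_failure = continue
    command    = "echo 'Instance ${self.id} is being destroyed'"
  }
}
```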
Null Resource
- The null_resource implements the standard resource lifecycle but takes no further action.
Terraform Module
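- A module is a container for multiple resources that are used together. Every Terraform configuration has at least one module, the root module, which can call child modules through module blocks.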
Module Outputs
In a parent module, outputs of child modules are available in expressions as module.<module_name>.<output_name>.
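A sketch of reading a child module's output from the parent module. The module path ./modules/network and its subnet_id output are hypothetical; the child module would have to declare an output with that name.

```hcl
# Parent (root) module calling a hypothetical local child module.
module "network" {
  source = "./modules/network"
}

resource "aws_instance" "in_module_subnet" {
  ami           = "xxxxxxxxxx"
  instance_type = "t2.micro"

  # module.<module_name>.<output_name>
  subnet_id = module.network.subnet_id
}
```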
Terraform Registry
The Terraform Registry is a repository of modules written by the Terraform community.
Within Terraform Registry, you can find verified modules that are maintained by various third-party vendors.
These modules are available for various resources like AWS VPC, RDS, MSSQL, etc.
To use a Terraform Registry module within code, set the source argument of a module block to the module's registry address, for example terraform-aws-modules/ec2-instance/aws for the ec2 instance module.
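A minimal sketch of calling the ec2 instance module from the Terraform Registry. The version constraint and the input values are assumptions; check the module's registry page for the inputs supported by the version you pin.

```hcl
module "ec2_instance" {
  source  = "terraform-aws-modules/ec2-instance/aws"
  version = "~> 5.0" # assumed constraint, pin to a release you have tested

  name          = "single-instance"
  instance_type = "t2.micro"
}
```

After adding a module block, run terraform init so that the module source is downloaded.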
Publishing Modules
Anyone can publish and share modules on the Terraform Registry.
Published modules support versioning, automatically generate documentation, allow browsing version histories, show samples and readmes, etc.
Terraform Workspace
Terraform allows us to have multiple workspaces, and with each workspace we can associate a different set of variable values.
terraform workspace -h gives all the info about terraform workspace commands.
If we have multiple workspaces, Terraform maintains a separate tfstate file for each workspace.
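One way to vary values per workspace is to key a map on terraform.workspace, which holds the name of the currently selected workspace. The workspace names and instance types below are illustrative assumptions.

```hcl
variable "instance_type_by_workspace" {
  type = map(string)
  default = {
    default = "t2.nano"
    dev     = "t2.micro"
    prod    = "t2.large"
  }
}

resource "aws_instance" "per_workspace" {
  ami = "xxxxxxxxxx"

  # Falls back to "t2.nano" when the workspace has no entry in the map.
  instance_type = lookup(var.instance_type_by_workspace, terraform.workspace, "t2.nano")
}
```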
Terraform Module Sources
The source argument in a module block tells Terraform where to find the source code for the desired child module.
Supported module sources:
Local paths
A local path must begin with ./ or ../ to indicate that a local path is intended.
Terraform registry
GitHub
Arbitrary Git repositories can be used by prefixing the address with the special git:: prefix (GitHub addresses that start with github.com can also be used directly, without the prefix or https://).
After the prefix, any valid Git URL can be specified to select one of the protocols supported by Git.
By default, Terraform will clone the default branch of the selected repository, but this can be overridden with the ref argument (a branch name or a tag).
Bitbucket
Generic Git, Mercurial repositories
HTTP URLs
S3 Buckets
GCS Buckets, etc.
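A few of the source types listed above, sketched side by side. Every path, URL, and repository name here is a hypothetical placeholder.

```hcl
# Local path: must start with ./ or ../
module "from_local_path" {
  source = "./modules/network"
}

# Generic Git repository, pinned to a tag through the ref argument.
module "from_git" {
  source = "git::https://example.com/network.git?ref=v1.2.0"
}

# GitHub shorthand, no git:: prefix or https:// needed.
module "from_github" {
  source = "github.com/example-org/example-module"
}
```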
Terraform Backend
Terraform backends primarily determine where Terraform stores its state.
By default, Terraform implicitly uses a backend called local to store state as a local file on disk.
Terraform supports multiple backends that allow remote service-related operations. Some of them are:
S3
Consul
Azurerm
Kubernetes
HTTP
etcd (removed in Terraform v1.3), etc.
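A minimal sketch of a remote backend, assuming the S3 backend and a hypothetical, already existing bucket:

```hcl
terraform {
  backend "s3" {
    bucket = "example-terraform-state"   # hypothetical bucket name
    key    = "network/terraform.tfstate" # path of the state object in the bucket
    region = "us-east-1"
  }
}
```

After adding or changing a backend block, terraform init has to be run again so that Terraform can initialize the backend.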
State File Locking
Whenever you are performing a write operation, Terraform locks the state file.
This is very important: without the lock, if someone else runs terraform apply while yours is still in progress, the concurrent writes can corrupt the state file.
Not all backends support locking.
Force Unlocking State
- Terraform has a force-unlock command to manually unlock the state if automatic unlocking failed.
Terraform State Management
It is not advised to modify the state file directly. Instead, make use of terraform state command.
Multiple sub-commands can be used with terraform state.
list
The terraform state list command is used to list resources within a Terraform state.
terraform state list
mv
The mv command is used to move items in a Terraform state.
It is mainly used to rename an existing resource without destroying and recreating it.
Because this command is destructive, it writes a backup copy of the state before saving any changes.
terraform state mv [options] SOURCE DESTINATION
pull
The pull command is used to manually download and output the state from the remote state.
This is useful for reading values out of state files.
push
The push command is used to manually upload a local state to a remote state.
It is used very rarely.
rm
The rm command is used to remove items from the terraform state.
Items removed from the Terraform state are not physically destroyed but they are only no longer managed by Terraform.
Ex: If you remove an AWS instance from the state, the AWS instance will continue running, but terraform plan will no longer see that instance.
show
- The show command is used to show the attributes of a single resource in the Terraform state.
Connecting Remote States
The terraform_remote_state data source retrieves the root module output values from some other Terraform configuration, using the latest snapshot from the remote backend.
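A sketch of reading another configuration's outputs, assuming that configuration stores its state in a hypothetical S3 bucket and declares a root module output named subnet_id:

```hcl
data "terraform_remote_state" "network" {
  backend = "s3"

  config = {
    bucket = "example-terraform-state"   # hypothetical bucket name
    key    = "network/terraform.tfstate" # state of the other configuration
    region = "us-east-1"
  }
}

resource "aws_instance" "uses_remote_state" {
  ami           = "xxxxxxxxxx"
  instance_type = "t2.micro"

  # Root module outputs are exposed under the outputs attribute.
  subnet_id = data.terraform_remote_state.network.outputs.subnet_id
}
```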
Terraform Import
Terraform (v1.5 and later) has an import block that brings resources created manually, outside Terraform, under Terraform management; Terraform can also generate the matching configuration in a .tf file.
The import block needs just 2 things:
to - specifies the resource address, i.e. resource_type.local_name
id - specifies the id of the existing resource
To generate the .tf file, pass the -generate-config-out option to the terraform plan command.
terraform plan -generate-config-out=<file_name.tf>
After this, we just need to run terraform apply to import the resource into the state file.
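A minimal sketch of an import block, assuming a manually created security group. The id and the local name imported_sg are hypothetical.

```hcl
import {
  # Address the resource will have in the configuration and state.
  to = aws_security_group.imported_sg

  # Id of the existing, manually created resource (placeholder value).
  id = "sg-0123456789abcdef0"
}
```

When -generate-config-out is used, the matching resource block does not need to be written by hand; Terraform writes it to the given file.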
Sensitive Parameter
When working with a field that contains information likely to be considered sensitive, it is best to set the sensitive property on its schema to true.
Setting sensitive to true prevents the field's value from showing up in CLI output and in Terraform Cloud; however, it does not encrypt or obscure the value in the state file.
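In configuration code, the same idea applies to variables and outputs through the sensitive argument. The names in this sketch are illustrative.

```hcl
variable "db_password" {
  type      = string
  sensitive = true # redacted in plan and apply output
}

# An output that exposes a sensitive value must itself be marked sensitive.
output "db_password" {
  value     = var.db_password
  sensitive = true
}
```

The value is still stored in plain text in the state file, so the state itself has to be protected.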
HashiCorp Vault
- HashiCorp Vault allows organizations to securely store secrets like API tokens, passwords, encryption keys, AWS access/secret keys, certificates, etc. along with access management for protecting secrets.
Vault Provider
The vault provider allows Terraform to read from, write to, and configure HashiCorp Vault.
Interacting with Vault from Terraform causes any secrets that you read and write to be persisted in the state file.
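A sketch of reading a secret through the Vault provider, assuming a local development Vault server and a hypothetical secret at secret/db-creds that contains a username key:

```hcl
provider "vault" {
  address = "http://127.0.0.1:8200" # assumed local dev server
}

# Reads the secret at the given path (hypothetical path).
data "vault_generic_secret" "db_creds" {
  path = "secret/db-creds"
}

output "db_username" {
  value     = data.vault_generic_secret.db_creds.data["username"]
  sensitive = true # secret data is sensitive, so the output must be too
}
```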
Terraform Cloud
- Terraform Cloud manages Terraform runs in a consistent and reliable environment with various features like access controls, a private registry for sharing modules, policy controls and others.
Sentinel
Sentinel is a policy-as-code framework integrated with the HashiCorp enterprise products.
It enables fine-grained, logic-based policy decisions, and can be extended to use information from external sources.
It is a paid feature.
Backend
The remote backend stores Terraform states and can be used to run operations in Terraform Cloud.
Terraform Cloud can also be used with local operations, in which case only the state is stored in the Terraform Cloud backend.
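A minimal sketch of the remote backend configuration for Terraform Cloud. The organization and workspace names are hypothetical placeholders.

```hcl
terraform {
  backend "remote" {
    hostname     = "app.terraform.io"
    organization = "example-org" # hypothetical organization

    workspaces {
      name = "example-workspace" # hypothetical workspace
    }
  }
}
```

Whether runs execute remotely or locally is then decided by the execution mode of the workspace in Terraform Cloud.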
Air Gap
An Air Gap is a network security measure employed to ensure that a secure computer network is physically isolated from unsecured networks, such as the public internet.
Terraform Enterprise can be installed using either an online or an air-gapped method; as the names imply, the former requires internet connectivity and the latter does not.