I’ve been working on a complex Terraform setup where:
- We have multiple environments (e.g., dev, staging, and prod), each using different workspaces.
- Each workspace has its own state file and backend configuration (S3 + DynamoDB lock for prod, and local state for dev).
- We use several shared modules across these environments, such as VPCs, EC2 instances, RDS databases, and security groups, with configurations varying between environments.
- For production (prod), I need to ensure zero downtime when updating critical resources like ALBs, EC2 instances, and RDS.
- We’re running into issues with state consistency when changes are applied across environments. For example, when applying changes in dev, resources tied to shared modules inadvertently affect prod.
- Additionally, we need to ensure that changes in prod are staged in such a way that they are rolled out with blue-green deployment patterns (e.g., for EC2 instances) while still using the same Terraform modules.
Key challenges:
- How to safely and consistently handle Terraform state across environments with multiple workspaces and varying backend configurations?
- What’s the best way to manage shared resources (e.g., VPC, RDS) between environments without the risk of one environment’s state interfering with another’s?
- How to implement a blue-green deployment pattern using Terraform for critical services like ALB and EC2, while ensuring that no downtime occurs during the cutover?
- I’ve explored options like using terraform import for shared resources, but I’m concerned about the maintainability and safety of this approach.
- How can we ensure that each environment is isolated yet still shares common modules, and how should state be handled in this scenario?
To answer these questions in order:
- There are multiple ways to set boundaries, but I would not recommend Terraform workspaces unless you know exactly what you are doing. Instead, set the boundaries in AWS itself: a good practice is to separate the dev, staging, and prod environments into different AWS accounts. With permissions set correctly, this provides a natural, safe boundary for your deployments.
- You could create a separate environment called shared and give the other environments read-only access to its state file, so that they can read its outputs with a terraform_remote_state data source but cannot change the state. A minimal sketch follows.
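For illustration, here is a hedged sketch of the consuming side; the bucket name, key, region, and the vpc_id output are all hypothetical and assume the shared environment exports them:

```hcl
# Read-only view of the shared environment's state.
# Bucket, key, region, and the vpc_id output are assumed names; adjust to yours.
data "terraform_remote_state" "shared" {
  backend = "s3"

  config = {
    bucket = "example-shared-terraform-state"
    key    = "shared/terraform.tfstate"
    region = "eu-west-1"
  }
}

# Consume an output exported by the shared environment, e.g. its VPC ID.
resource "aws_security_group" "app" {
  name   = "app-sg"
  vpc_id = data.terraform_remote_state.shared.outputs.vpc_id
}
```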
- There is an excellent tutorial about this specific topic (blue-green and canary deployments with Terraform) on the HashiCorp website.
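The usual approach is a weighted forward action on the ALB listener: traffic shifts gradually from the blue target group to the green one by changing the weights, with no downtime at cutover. Below is a minimal, hedged sketch; it assumes aws_lb.main and the blue/green target groups are defined elsewhere, and the variable name is hypothetical:

```hcl
# Share of traffic (0-100) routed to the green target group; start at 0,
# raise it gradually during cutover, and set it to 100 once green is healthy.
variable "green_weight" {
  type    = number
  default = 0
}

# Assumes aws_lb.main, aws_lb_target_group.blue, and aws_lb_target_group.green
# are defined elsewhere in the configuration.
resource "aws_lb_listener" "http" {
  load_balancer_arn = aws_lb.main.arn
  port              = 80
  protocol          = "HTTP"

  default_action {
    type = "forward"

    forward {
      target_group {
        arn    = aws_lb_target_group.blue.arn
        weight = 100 - var.green_weight
      }

      target_group {
        arn    = aws_lb_target_group.green.arn
        weight = var.green_weight
      }
    }
  }
}
```

Applying with -var="green_weight=10", then 50, then 100 shifts traffic gradually, and setting it back to 0 is your rollback.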
- Check out bullet number two for my preferred way of sharing resources between environments.
- Each environment can be isolated at the AWS account level, as I recommended under the first bullet. This is one of the easiest methods of isolating resources.
- You’ll run terraform init with a different backend configuration for each environment. Each environment stores its state in a different S3 bucket in a different account, so you get full isolation (see the first sketch after this list).
- For the terraform plan and apply commands you’ll supply a different tfvars file for each environment, meaning the configuration of each environment is kept separate.
- You then have a common set of Terraform code that is the same for all environments; the only things that actually differ are the input variables in the tfvars files.
- You can make minor deviations between environments with count blocks; in a production environment you might want more EC2 instances than in a development environment, for example (see the second sketch after this list). But be careful, because deviations make an environment less predictable.
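To make the init/plan workflow concrete, here is a hedged sketch using partial backend configuration; all file, bucket, and table names are hypothetical:

```hcl
# backend.tf -- the backend block is left empty ("partial configuration"),
# so each environment injects its own settings at init time.
terraform {
  backend "s3" {}
}
```

with a per-environment settings file such as:

```hcl
# envs/prod.backend.hcl (assumed path and values)
bucket         = "example-prod-terraform-state"
key            = "prod/terraform.tfstate"
region         = "eu-west-1"
dynamodb_table = "terraform-locks" # state locking for prod
```

and the corresponding commands:

```sh
terraform init -backend-config=envs/prod.backend.hcl
terraform plan -var-file=envs/prod.tfvars
terraform apply -var-file=envs/prod.tfvars
```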
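And a hedged sketch of a count-based deviation; the variable names are hypothetical and would be set per environment in the tfvars files:

```hcl
variable "instance_count" {
  description = "Number of application instances for this environment"
  type        = number
  default     = 1
}

variable "ami_id" {
  description = "AMI to launch, set per environment in tfvars"
  type        = string
}

# The same resource block serves every environment; only the count differs.
resource "aws_instance" "app" {
  count         = var.instance_count
  ami           = var.ami_id
  instance_type = "t3.micro"

  tags = {
    Name = "app-${count.index}"
  }
}
```

prod.tfvars might then set instance_count = 3 while dev.tfvars keeps the default of 1.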