I’ve been having this issue with Terraform on Azure: when I build infra that is a bit more involved (multiple resource groups, storage accounts, etc.), apply very often fails for any number of reasons (e.g. resources temporarily unavailable in the region).
The problem is that the abort usually happens only after TF has already created the resource group(s). On the next run they already exist, and this time Terraform fails, telling me that it tried to create a resource group that already exists.
I know I could just import the group into TF state, but that is quite cumbersome and failure-prone in its own right. Luckily, this usually happens during the initial run (on subsequent runs TF doesn’t recreate the RG, after all), so my “fix” is just to delete the RG and let TF create it again.
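For context, the import route I mean would look roughly like the sketch below. It assumes Terraform 1.5 or later (which supports `import` blocks) and the azurerm provider. The resource address `azurerm_resource_group.example`, the all-zero subscription ID, the group name and the location are placeholders, not values from my actual setup.

```hcl
# Hypothetical address, subscription ID and names, for illustration only.
# The import block tells Terraform to adopt the existing resource group
# into state on the next plan/apply instead of trying to create it.
import {
  to = azurerm_resource_group.example
  id = "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/example-rg"
}

# The resource definition that the imported group is bound to.
resource "azurerm_resource_group" "example" {
  name     = "example-rg"
  location = "westeurope"
}
```

On older versions the equivalent is running `terraform import azurerm_resource_group.example <resource ID>` once per leftover group, which is the part that gets tedious with several resource groups.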
Still, I think there must be a better solution. Why can’t TF either roll back its changes when a run fails, or at least update the state file with whatever was deployed successfully?