Resolve "Add retry for RKE2 cluster creation/destruction"
Summary
Testing adding retry
logic to our nightly infra CI pipelines. So if terraform or the runners have an issue, the stage can attempt to retry before marking everything as a failure.
Part of #713 (closed)
Terraform is NOT happy being re-ran if it encounters an issue creating or verifying a resource the first time: yikes, and will require some better terraform logic from someone smarter than I.
This MR includes some per-stage retry logic for all stages of a master nightly pipeline run so that if the runner has an issue, times-out, or fails to schedule it will retry. So still an attempt to make things more resilient!
Edited by Ryan Garcia