[P1BIGROCKS-2284] Dogfood Cluster stability improvements
[P1BIGROCKS-2284](https://jira.il2.dso.mil/browse/P1BIGROCKS-2284)
- [ ] The cluster is not using our GitLab runners to run jobs: https://repo1.dso.mil/platform-one/big-bang/customers/bigbang/-/blob/master/apps/gitlab-runners/base/gitlab-helm-repo.yaml
- [ ] CI jobs fail intermittently and succeed when restarted.
- [ ] The BigBang team should be running the cluster as we'd recommend:
  - [ ] Flux alerts (add docs to BB docs)
  - [ ] Prometheus/Alertmanager alerts
  - [ ] Retros for downtime
epic