Validate deployment of operator-style custom resources (e.g. Kiali, Jaeger, Istio)
Currently, a Helm chart that deploys an operator configuration completes quickly because it does not check whether the operator has actually created anything. Flux continues as if the Helm chart deployed successfully, even though the operator is still trying to deploy its resources.
For example, when `IstioOperator` is created by the Istio Control Plane Helm chart, the Helm chart completes before `istiod` and the `ingressgateways` have been deployed. We need a way to validate that these items deployed correctly. In our CI pipeline, we sleep for 10 seconds and then check all statefulsets, deployments, and daemonsets. But how do we know that the operators have finished their work? If, for example, Kiali wasn't deployed in time and our
We need to research how to detect that an operator has finished its work, and possibly make the Helm Release wait or fail on that condition. Possible solutions:
- Use Helm hooks with a short-lived job to check for resource completion
- Use the "status" field in the custom resource
- Scan the operator's logs for completion
- Use a "checker" pod to validate resources using probes
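As a rough illustration of the "status" field option, a wait.sh script could poll the custom resource until its status reports an expected value. This is only a sketch: the function name `wait_for_cr_status`, its arguments, and the default timeout and interval are hypothetical, and the JSONPath and expected value differ per operator and would need to be confirmed against each CRD.

```shell
#!/usr/bin/env bash
# Sketch only: poll a custom resource's status field until it matches.
# The function name, argument order, and defaults are assumptions made
# for illustration; they are not part of any existing wait.sh.

wait_for_cr_status() {
  local kind="$1" name="$2" namespace="$3" jsonpath="$4" expected="$5"
  local timeout="${6:-300}" interval="${7:-5}"
  local elapsed=0 actual=""

  while [ "$elapsed" -lt "$timeout" ]; do
    # An empty result usually means the operator has not written status yet.
    actual="$(kubectl get "$kind" "$name" -n "$namespace" \
      -o "jsonpath=${jsonpath}" 2>/dev/null || true)"
    if [ "$actual" = "$expected" ]; then
      echo "$kind/$name reports $expected"
      return 0
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done

  echo "Timed out after ${timeout}s waiting for $kind/$name (last status: '${actual}')" >&2
  return 1
}
```

A call might look like `wait_for_cr_status istiooperator <name> istio-system '{.status.status}' HEALTHY`, where the resource name, namespace, JSONPath, and expected value are all placeholders. A non-zero return lets the calling pipeline fail instead of continuing after a fixed sleep.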
To do:
Verify wait.sh for:
- istio
- eck-operator
Create wait.sh for:
- jaeger
- kiali
- Mattermost
Then:
- Ensure the wait.sh scripts are merged
The above should address the tests for package CI. Then integrate them into the bigbang deployment:
- Edit bigbang deployment scripts to run the tests/wait.sh for each package (if it exists) when deploying bigbang
- Ensure deployment changes are approved/merged
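The integration step could be sketched roughly as follows. The function name `run_package_waits` and the directory layout (`<packages_dir>/<package>/tests/wait.sh`) are assumptions made for illustration; the actual bigbang deployment scripts may locate package checkouts differently.

```shell
#!/usr/bin/env bash
# Sketch only: run tests/wait.sh for every package that provides one.
# The <packages_dir>/<package>/tests/wait.sh layout is an assumption.

run_package_waits() {
  local packages_dir="$1"
  local failed=0 pkg_dir script

  for pkg_dir in "$packages_dir"/*/; do
    # Skip the literal pattern when the glob matches nothing.
    [ -d "$pkg_dir" ] || continue
    script="${pkg_dir}tests/wait.sh"
    # Packages without a wait.sh are skipped silently.
    if [ -f "$script" ]; then
      echo "Running $script"
      if ! bash "$script"; then
        echo "wait.sh failed for $(basename "$pkg_dir")" >&2
        failed=1
      fi
    fi
  done

  # Non-zero if any package's wait.sh failed.
  return "$failed"
}
```

Running every script before returning, rather than stopping at the first failure, reports all packages that did not converge in a single pipeline run.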