Helm upgrade does not correctly deploy upgrade jobs

Summary

Upgrading the chart, either directly with helm or through bigbang (fluxv2 + HelmReleases), does not correctly upgrade the version: the DB upgrade jobs never run, and the new pods cannot roll out successfully.

Steps to Reproduce

1. helm install tag 1.9.5-bb.2.
2. helm upgrade to tag 1.11.0-bb.0 (either via bigbang values or via helm upgrade to the new chart/values in the new tag).

Issues Noticed

New pods fail to start up successfully, and helm rollback also does not succeed.

What we Expect

The new chart and anchore-engine version roll out successfully.

Logs/Screenshots

POST-UPGRADE:

5s         Warning   Unhealthy               pod/anchore-anchore-engine-simplequeue-5f548dc8ff-4mlb7    Readiness probe failed: Get "http://10.42.2.14:8083/health": dial tcp 10.42.2.14:8083: connect: connection refused
30s         Warning   Unhealthy               pod/anchore-anchore-engine-policy-79667b45b8-h5tw4         Readiness probe failed: Get "http://10.42.2.15:8087/health": dial tcp 10.42.2.15:8087: connect: connection refused
26s         Warning   Unhealthy               pod/anchore-anchore-engine-policy-79667b45b8-h5tw4         Liveness probe failed: Get "http://10.42.2.15:8087/health": dial tcp 10.42.2.15:8087: connect: connection refused
26s         Warning   Unhealthy               pod/anchore-anchore-engine-simplequeue-5f548dc8ff-4mlb7    Liveness probe failed: Get "http://10.42.2.14:8083/health": dial tcp 10.42.2.14:8083: connect: connection refused
25s         Warning   Unhealthy               pod/anchore-anchore-engine-api-55c5cb4676-wdkdg            Readiness probe failed: Get "http://10.42.3.12:8228/health": dial tcp 10.42.3.12:8228: connect: connection refused
24s         Warning   Unhealthy               pod/anchore-anchore-engine-catalog-5577b95d68-ftm56        Liveness probe failed: Get "http://10.42.3.11:8082/health": dial tcp 10.42.3.11:8082: connect: connection refused
21s         Warning   Unhealthy               pod/anchore-anchore-engine-api-55c5cb4676-wdkdg            Liveness probe failed: Get "http://10.42.3.12:8228/health": dial tcp 10.42.3.12:8228: connect: connection refused
21s         Warning   Unhealthy               pod/anchore-anchore-engine-catalog-5577b95d68-ftm56        Readiness probe failed: Get "http://10.42.3.11:8082/health": dial tcp 10.42.3.11:8082: connect: connection refused
17s         Warning   Unhealthy               pod/anchore-anchore-engine-analyzer-796f9c74cd-8nrdf       Liveness probe failed: Get "http://10.42.1.15:8084/health": dial tcp 10.42.1.15:8084: connect: connection refused
14s         Warning   Unhealthy               pod/anchore-anchore-engine-analyzer-796f9c74cd-gjj4n       Readiness probe failed: Get "http://10.42.1.14:8084/health": dial tcp 10.42.1.14:8084: connect: connection refused
9s          Warning   Unhealthy               pod/anchore-anchore-engine-analyzer-796f9c74cd-8nrdf       Readiness probe failed: Get "http://10.42.1.15:8084/health": dial tcp 10.42.1.15:8084: connect: connection refused
8s          Warning   Unhealthy               pod/anchore-anchore-engine-analyzer-796f9c74cd-gjj4n       Liveness probe failed: Get "http://10.42.1.14:8084/health": dial tcp 10.42.1.14:8084: connect: connection refused

kubectl get job -A
No resources found

Possible fixes

The upgrade Job resources currently run as a post-upgrade hook. Moving them to a pre-upgrade hook allows the DB upgrade to succeed and the new HelmRelease to roll out.
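
A minimal sketch of what that change could look like in a Job template, assuming the chart uses the standard Helm hook annotations. The Job name, container name, image, and command below are hypothetical placeholders, not the chart's actual values; only the `helm.sh/hook*` annotation keys and values are standard Helm.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  # Hypothetical name; the real chart templates its own Job name.
  name: example-engine-upgrade
  annotations:
    # Was "post-upgrade"; pre-upgrade runs the Job before the
    # release's other resources are updated.
    "helm.sh/hook": pre-upgrade
    # Lower weights run first when several hooks share a phase.
    "helm.sh/hook-weight": "0"
    # Remove the previous hook Job before creating a new one.
    "helm.sh/hook-delete-policy": before-hook-creation
spec:
  backoffLimit: 0
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: db-upgrade
          # Placeholder image and command; substitute whatever the
          # chart already uses for its DB upgrade step.
          image: example.registry/anchore-engine:placeholder
          command: ["/bin/sh", "-c", "echo 'run the DB upgrade here'"]
```

One trade-off to keep in mind: a pre-upgrade hook runs before the new release's ConfigMaps and Secrets are applied, so the Job should only rely on resources that already exist from the previous release or are themselves created as earlier-weighted hooks.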

/cc @micah.nagel @jasonkrause

Edited Mar 01, 2021 by Ryan Garcia