jaeger CrashLoopBackOff - Bigbang r1.33.0 (Azure)
Bug
Description
Describe the problem, what were you doing when you noticed the bug? Running https://repo1.dso.mil/platform-one/big-bang/customers/template per readme instructions translated into attached script. Platform is Azure public (non-gov) cloud using Cloudshell, 3 nodes. Error from log tail (jaegerLogsTail) is repeatable: "Failed to init storage factory","error":"failed to create primary Elasticsearch client: health check timeout".
Also ran a log tail on jaeger jaeger-jaeger-jaeger-operator-59f68b8cd4-sfr62 (see tab jaegerCertLogsTail). It fails with "failed to call webhook: Post "https://jaeger-jaeger-jaeger-operator-webhook-service.jaeger.svc:443/mutate-jaegertracing-io-v1-jaeger?timeout=10s": x509: certificate signed by unknown authority". Eventually "watch Hr" command reveals: "bigbang helmrelease.helm.toolkit.fluxcd.io/jaeger Helm install failed: failed pre-install: timed out waiting for the condition"
Output Tabs:
BBHrPoWatch from kubectl get hr,po -A
jaegerPodDescribe from kubectl describe -n jaeger pods jaeger-d44684b95-wt8zc
describeNodes from kubectl describe nodes -A
jaegerLogsTail from kubectl logs -n jaeger pod/jaeger-d44684b95-wt8zc --tail=20
jaegerCertLogsTail from kubectl logs -n jaeger jaeger-jaeger-jaeger-operator-59f68b8cd4-sfr62 --tail=20
Input Tab:
bbDeployBashScript - script to create AKS, create repo, and deploy BB to AKS
Provide any steps possible used to reproduce the error (ideally in an isolated fashion).
Pre-requisites: Azure account in non-gov eastus region; github account; Platform One account.
copy bbDeployBashScript script in attachment to ~/my_bbdeploy.sh
vi ~/my_bbdeploy.sh
Go to the top of the file (type gg)
Replace the 10 variables with your values and exit the VI editor (:wq)
Type "~/my_bbdeploy.sh" in the Cloudshell command window. Your AKS DSOP/Bigbang deployment will begin.
**## BigBang Version
What version of BigBang were you running?**
1.33.0
jaegerCrash.xlsx