PeerAuthentication to enable STRICT mTLS
Merge request reports
Activity
changed milestone to %1.32.0
added fluentbit label
assigned to @toladipupo
added status::review label
changed milestone to %1.31.0
requested review from @ryan.j.garcia, @michaelmartin, and @echuang
requested review from @micah.nagel
- Resolved by Micah Nagel
(I haven't actually tested, but) I think we need an exception for the fluentbit metrics port here, since Prometheus scrapes it over HTTP. Should just be `{{ .Values.service.port }}`, and conditional on `{{- if and .Values.istio.enabled (eq .Values.istio.mtls.mode "STRICT") (.Capabilities.APIVersions.Has "monitoring.coreos.com/v1") .Values.serviceMonitor.enabled }}`.

Edited by Micah Nagel
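A rough sketch of what that could look like, assuming the chart's existing `.Values.istio` / `.Values.serviceMonitor` values and the `fluent-bit.name` helper; how the quoted condition is split between the resource guard and the port exception, and the use of PERMISSIVE, are my assumptions rather than the final template:

```yaml
{{- if and .Values.istio.enabled (eq .Values.istio.mtls.mode "STRICT") }}
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: fluentbit
  namespace: {{ .Release.Namespace }}
spec:
  # portLevelMtls only takes effect when a workload selector is present
  selector:
    matchLabels:
      app.kubernetes.io/name: {{ include "fluent-bit.name" . }}
  mtls:
    mode: STRICT
  {{- if and (.Capabilities.APIVersions.Has "monitoring.coreos.com/v1") .Values.serviceMonitor.enabled }}
  portLevelMtls:
    # metrics port that Prometheus scrapes over plain HTTP
    {{ .Values.service.port }}:
      mode: PERMISSIVE
  {{- end }}
{{- end }}
```

`DISABLE` instead of `PERMISSIVE` would also be an option if that port should never speak mTLS at all.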
- Resolved by Michael Martin
- Resolved by Micah Nagel
- Resolved by Micah Nagel
- Resolved by Micah Nagel
With regards to calling the file "default" since it's a default: while it's true that it applies to the whole pod and only adds an exception for the one port, my concern is that it could make it harder to understand when exceptions are in place. If an end user looks at all `peerAuthentications` in their cluster, they would just see `default-fluentbit`, which masks the fact that we have a port-level exception in place. So while it might seem duplicative, having a true "default only" resource and then a secondary exception (roughly as sketched below) would give us better visibility and make it easier in the future to mitigate/remove exceptions.
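Purely for illustration (the resource names, selector labels, and the 2020 metrics port are hypothetical, not what the chart ships), the split being argued for would look roughly like:

```yaml
# Namespace-wide default: everything in the release namespace is STRICT.
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: {{ .Release.Namespace }}
spec:
  mtls:
    mode: STRICT
---
# Separate, clearly named exception, so listing peerAuthentications
# surfaces the port-level carve-out instead of hiding it behind "default".
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: fluentbit-metrics-exception
  namespace: {{ .Release.Namespace }}
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: fluent-bit
  mtls:
    mode: STRICT
  portLevelMtls:
    2020:
      mode: PERMISSIVE
```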
s3
```
[2022/04/08 02:56:46] [ info] [output:s3:s3.2] Successfully uploaded object /fluent-bit-logs/kube.var.log.containers.logging-loki-0_logging_istio-proxy-924f947b7f1d900b6cfdc85a7c710fccef34fd25684471c7b9cd463edefb2ad6.log/2022/04/08/02/55/44-object9A1ahXBO
[2022/04/08 02:57:07] [ info] [output:s3:s3.2] upload_timeout reached for kube.var.log.containers.monitoring-monitoring-prometheus-node-exporter-cbqkt_monitoring_istio-proxy-84fb08b530be7af8fe3611d66048d8c7a888953415337f0907ebad65285b5bb5.log
[2022/04/08 02:57:07] [ info] [output:s3:s3.2] Successfully uploaded object /fluent-bit-logs/kube.var.log.containers.monitoring-monitoring-prometheus-node-exporter-cbqkt_monitoring_istio-proxy-84fb08b530be7af8fe3611d66048d8c7a888953415337f0907ebad65285b5bb5.log/2022/04/08/02/56/06-object8En0a6cZ
^C
➜ bigbang git:(master) ✗ k get peerauthentication fluentbit -owide
NAME        MODE     AGE
fluentbit   STRICT   17m
➜ bigbang git:(master) ✗ aws s3 ls s3://tunde-kops/fluent-bit-logs/kube.var.log.containers.alertmanager-monitoring-monitoring-kube-alertmanager-0_monitoring_istio-proxy-f83e05912b2fc0b4cf054c3685e133e78dcde570ab67bef5d5d95ec8db922beb.log/2022/04/08/
                           PRE 03/
```
loki
```
[2022/04/08 03:57:12] [ info] [input:storage_backlog:storage_backlog.2] queueing tail.0:1-1649389987.674928572.flb
[2022/04/08 03:57:12] [ info] [input:storage_backlog:storage_backlog.2] queueing tail.0:1-1649389987.796737831.flb
[2022/04/08 03:57:13] [ info] [engine] flush backlog chunk '1-1649389987.93016145.flb' succeeded: task_id=0, input=storage_backlog.2 > output=loki.2 (out_id=2)
[2022/04/08 03:57:13] [ info] [engine] flush backlog chunk '1-1649389987.93016145.flb' succeeded: task_id=0, input=storage_backlog.2 > output=loki.0 (out_id=0)
[2022/04/08 03:57:14] [ info] [engine] flush backlog chunk '1-1649389987.674928572.flb' succeeded: task_id=1, input=storage_backlog.2 > output=loki.2 (out_id=2)
[2022/04/08 03:57:14] [ info] [engine] flush backlog chunk '1-1649389987.674928572.flb' succeeded: task_id=1, input=storage_backlog.2 > output=loki.0 (out_id=0)

# after external loki shutdown
[2022/04/08 04:10:41] [error] [output:loki:loki.2] 15.200.206.172:3100, HTTP status=503 upstream connect error or disconnect/reset before headers. reset reason: connection failure, transport failure reason: delayed connect error: 111
[2022/04/08 04:10:41] [ warn] [engine] failed to flush chunk '1-1649391041.128831379.flb', retry in 7 seconds: task_id=0, input=tail.0 > output=loki.2 (out_id=2)
[2022/04/08 04:10:48] [error] [output:loki:loki.2] 15.200.206.172:3100, HTTP status=503 upstream connect error or disconnect/reset before headers. reset reason: connection failure, transport failure reason: delayed connect error: 111
[2022/04/08 04:10:48] [ warn] [engine] failed to flush chunk '1-1649391041.128831379.flb', retry in 20 seconds: task_id=0, input=tail.0 > output=loki.2 (out_id=2)
```
added stale label
mentioned in commit 92e74117