Feature Request: Configure Alertmanager/Prometheus to support alerts/recording rules from Loki
## Why
I would like to set up Loki alerting and recording rules, but Alertmanager and Prometheus need to be configured to allow the required network connections.

A basic use case is using Loki to detect when there are too many failed login attempts to Keycloak: Loki can both detect the failures in the logs and fire an alert on them.
## Proposed Solution
A couple of changes are required. Note that I've only tested this with Loki in monolith mode; I haven't verified whether any of the pod selectors need to change when running in the scalable mode that Big Bang supports.
Network policy to allow Loki to talk to Alertmanager:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-loki-alertmanager-ingress
  namespace: monitoring
spec:
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              app.kubernetes.io/name: logging
          podSelector:
            matchLabels:
              app.kubernetes.io/name: logging-loki
      ports:
        - port: 9093
          protocol: TCP
  podSelector:
    matchLabels:
      app.kubernetes.io/name: alertmanager
  policyTypes:
    - Ingress
```
Network policy to allow Loki to talk to Prometheus:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-loki-prometheus-ingress
  namespace: monitoring
spec:
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              app.kubernetes.io/name: logging
          podSelector:
            matchLabels:
              app.kubernetes.io/name: logging-loki
      ports:
        - port: 9090
          protocol: TCP
  podSelector:
    matchLabels:
      app.kubernetes.io/name: prometheus
  policyTypes:
    - Ingress
```
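If the logging namespace enforces default-deny egress (as Big Bang namespaces generally do), an egress counterpart on the Loki side may also be needed. A sketch, assuming the monitoring namespace carries the label `app.kubernetes.io/name: monitoring`; the policy name is hypothetical and the exact labels should be checked against the deployed charts:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-loki-monitoring-egress # hypothetical name
  namespace: logging
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: logging-loki
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              app.kubernetes.io/name: monitoring # assumed namespace label
      ports:
        - port: 9093 # Alertmanager
          protocol: TCP
        - port: 9090 # Prometheus
          protocol: TCP
  policyTypes:
    - Egress
```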
Authorization policy to allow Loki to talk to Prometheus' write endpoint (this one could be tightened by adding a pod label selector for Loki specifically):

```yaml
apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: loki-allow-metrics-push
  namespace: monitoring
spec:
  action: ALLOW
  rules:
    - from:
        - source:
            namespaces:
              - logging
      to:
        - operation:
            methods:
              - POST
            paths:
              - /api/v1/write
  selector:
    matchLabels:
      app: prometheus
```
Enable the remote-write endpoint on Prometheus via the monitoring values.yaml:

```yaml
prometheus:
  prometheusSpec:
    additionalArgs:
      - name: "web.enable-remote-write-receiver" # enable write receiver for loki
```
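Alternatively (untested here), the Prometheus Operator CRD exposes a dedicated field for this, which avoids passing a raw argument; this assumes the operator version shipped with Big Bang is recent enough to support it:

```yaml
prometheus:
  prometheusSpec:
    # sets --web.enable-remote-write-receiver via the CRD field
    enableRemoteWriteReceiver: true
```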
Configure Loki to push alerts/recording rules to Alertmanager/Prometheus via its values.yaml. I believe the WAL can be stored in S3, so some thought may be needed on how to configure that accordingly:

```yaml
loki:
  rulerConfig:
    # this WAL is used for remote-writing to Prometheus
    wal:
      dir: /var/loki/ruler-wal # located on the PVC
    # configure the ruler to write recording rule samples to Prometheus
    remote_write:
      enabled: true
      client:
        url: http://monitoring-monitoring-kube-prometheus.monitoring.svc.cluster.local:9090/api/v1/write
    alertmanager_url: http://monitoring-monitoring-kube-alertmanager.monitoring.svc.cluster.local:9093
    external_labels:
      origin: loki
    storage:
      type: local
      local:
        directory: /rules
    rule_path: rules/fake # 'fake' is the default tenant in single-tenant mode
# enable the sidecar for pulling in rules from ConfigMaps/Secrets dynamically
sidecar:
  rules:
    enabled: true
    folder: /rules/fake # 'fake' is the default tenant in single-tenant mode
# enable the service account token so the sidecar can read ConfigMaps/Secrets
serviceAccount:
  automountServiceAccountToken: true
```
Assuming all of the above has been applied, deploying a ConfigMap or Secret containing one of the example rules/alerts from the upstream docs should work for debugging.
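For the Keycloak use case above, a rule ConfigMap might look like the following sketch. The watch label (`loki_rule`), the Keycloak log stream selector (`{app="keycloak"}`), the `LOGIN_ERROR` match string, and the threshold are all assumptions and would need to be adjusted to the actual sidecar configuration and Keycloak log format:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: loki-keycloak-alerts # hypothetical name
  namespace: logging
  labels:
    loki_rule: "true" # assumed sidecar watch label; must match sidecar.rules.label
data:
  keycloak-alerts.yaml: |
    groups:
      - name: keycloak
        rules:
          - alert: KeycloakFailedLogins
            # assumed stream selector and match string for Keycloak login failures
            expr: |
              sum(count_over_time({app="keycloak"} |= "LOGIN_ERROR" [5m])) > 10
            for: 1m
            labels:
              severity: warning
            annotations:
              summary: More than 10 failed Keycloak logins in the last 5 minutes
```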