UNCLASSIFIED - NO CUI


Feature Request: Configure Alertmanager/Prometheus to support alerts/recording rules from Loki

Feature Request

Configure Alertmanager/Prometheus to support alerts/recording rules from Loki

Why

I would like to set up Loki alerting/recording rules, but Alertmanager and Prometheus need to be configured to allow the network connections.

A basic use case is using Loki to detect and alert when there are too many failed login attempts to Keycloak.
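For illustration, a Loki alerting rule for this use case might look like the following sketch. The label selector (`app="keycloak"`), the `LOGIN_ERROR` log filter, and the threshold are all assumptions that depend on how Keycloak logs are ingested and labeled in your cluster:

```yaml
# Example ruler rule file (assumed path: the ruler's rules directory for
# the tenant, e.g. /rules/fake/keycloak-alerts.yaml in single-tenant mode).
# The stream selector and line filter below are illustrative; adjust them
# to match your Keycloak log labels and event format.
groups:
  - name: keycloak-auth
    rules:
      - alert: KeycloakFailedLogins
        expr: |
          sum(count_over_time({app="keycloak"} |= "LOGIN_ERROR" [5m])) > 10
        for: 1m
        labels:
          severity: warning
        annotations:
          summary: More than 10 failed Keycloak logins in the last 5 minutes
```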

Proposed Solution

There are a couple of changes required.

I've only tested this with Loki in monolithic mode; I haven't verified whether any of the pod selectors need to change when running in the scalable mode that Big Bang supports.

Network policy to allow Loki to talk to Alertmanager:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-loki-alertmanager-ingress
  namespace: monitoring
spec:
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          app.kubernetes.io/name: logging
      podSelector:
        matchLabels:
          app.kubernetes.io/name: logging-loki
    ports:
    - port: 9093
      protocol: TCP
  podSelector:
    matchLabels:
      app.kubernetes.io/name: alertmanager
  policyTypes:
  - Ingress

Network policy to allow Loki to talk to Prometheus:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-loki-prometheus-ingress
  namespace: monitoring
spec:
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          app.kubernetes.io/name: logging
      podSelector:
        matchLabels:
          app.kubernetes.io/name: logging-loki
    ports:
    - port: 9090
      protocol: TCP
  podSelector:
    matchLabels:
      app.kubernetes.io/name: prometheus
  policyTypes:
  - Ingress

Authorization Policy to allow Loki to talk to Prometheus' remote-write endpoint (this one could be tightened to match Loki specifically rather than the whole logging namespace):

apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: loki-allow-metrics-push
  namespace: monitoring
spec:
  action: ALLOW
  rules:
  - from:
    - source:
        namespaces:
        - logging
    to:
    - operation:
        methods:
        - POST
        paths:
        - /api/v1/write
  selector:
    matchLabels:
      app: prometheus

Enable the remote-write endpoint on Prometheus via monitoring values.yaml:

prometheus:
  prometheusSpec:
    additionalArgs:
      - name: "web.enable-remote-write-receiver" # enable write receiver for loki

Configure Loki via values.yaml to push alerts and recording rules to Alertmanager/Prometheus. I believe the WAL can also be stored in S3, so there may need to be some thought on how to configure that accordingly:

loki:
  rulerConfig:
    # this wal is used for prometheus
    wal:
      dir: /var/loki/ruler-wal # Located on the PVC
    # configure loki for writing records to prometheus
    remote_write:
      enabled: true
      client:
        url: http://monitoring-monitoring-kube-prometheus.monitoring.svc.cluster.local:9090/api/v1/write
    alertmanager_url: http://monitoring-monitoring-kube-alertmanager.monitoring.svc.cluster.local:9093
    external_labels:
      origin: loki
    storage:
      type: local
      local:
        directory: /rules
    rule_path: rules/fake # 'fake' is default tenant for single-tenant mode
# enable sidecar for pulling in rules from configmaps/secrets dynamically
sidecar:
  rules:
    enabled: true
    folder: /rules/fake # 'fake' is default tenant for single-tenant mode
# enable service account for sidecar to get configmaps/secrets
serviceAccount:
  automountServiceAccountToken: true

Assuming all of the above has been applied, deploying a ConfigMap or Secret containing one of the example rules/alerts from the upstream docs should work for debugging.
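A minimal sketch of such a ConfigMap follows. The label key is the Loki chart's default for the rules sidecar (sidecar.rules.label: loki_rule); verify it against your chart version, and treat the rule contents as illustrative only:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: loki-debug-rules
  # Assumed namespace; the sidecar must be configured to watch it
  namespace: logging
  labels:
    # Default label the rules sidecar watches for (sidecar.rules.label)
    loki_rule: ""
data:
  debug-rules.yaml: |
    groups:
      - name: debug
        rules:
          - record: loki:monitoring_log_lines:rate5m
            expr: sum(rate({namespace="monitoring"}[5m]))
```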

Edited by Daniel Palmer