UNCLASSIFIED - NO CUI

Document and Test Production Recommendations for multi-replica Loki

IronBank is amazing and has been testing Loki while it's been in beta. They have been testing it with recommendations from BigBang and the Grafana documentation for running Loki with more than a single replica and with scalable cloud back-ends like S3 and DynamoDB. Here is their current recommendation for a working production setup:

It looks like the Loki package will also require an external-egress NetworkPolicy so that it can reach AWS endpoints and resources.

  • Increase resource values to 1Gi MEM and 300m CPU
  • Add external egress 0.0.0.0/0 NetworkPolicy template
  • Document the value recommendations below in the docs/production.md file and as commented-out values in chart/values.yaml:

replicas: 3
extraArgs:
  # https://github.com/grafana/loki/issues/5021
  target: all,table-manager
resources:
  requests:
    cpu: 125m
    memory: 1Gi
  limits:
    cpu: 400m
    memory: 3Gi
config:
  auth_enabled: false
  ingester:
    chunk_target_size: 196608
    lifecycler:
      ring:
        kvstore:
          store: inmemory
        replication_factor: 1
  schema_config:
    configs:
    - from: 2022-01-01 # Anything in the past
      store: aws
      object_store: s3
      schema: v11
      index:
        prefix: loki_
        period: 168h
  table_manager:
    retention_deletes_enabled: true
    retention_period: 8736h
  common:
    storage:
      s3:
        region: "us-gov-west-1"
        sse_encryption: true
  storage_config:
    aws:
      region: "us-gov-west-1"
      sse_encryption: true
      dynamodb:
        dynamodb_url: "dynamodb://us-gov-west-1"
  chunk_store_config:
    chunk_cache_config:
      redis:
        endpoint: "loki-redis-headless.logging.svc.cluster.local:6379"
        db: 0
  compactor:
    working_directory: /data/loki/boltdb-shipper-compactor
    shared_store: s3
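
The external egress NetworkPolicy mentioned in the task list could look roughly like the sketch below. This is only an illustration: the policy name, the `logging` namespace (inferred from the Redis endpoint in the values above), and the pod selector label are assumptions, not values taken from the chart, and would need to match whatever the Loki package actually deploys.

```yaml
# Hypothetical NetworkPolicy sketch; name, namespace, and labels are assumptions.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: loki-external-egress        # assumed name
  namespace: logging                # assumed; inferred from the Redis endpoint
spec:
  # Select only the Loki pods (label is an assumption; use the chart's real labels)
  podSelector:
    matchLabels:
      app.kubernetes.io/name: loki
  policyTypes:
    - Egress
  egress:
    # Allow egress to any IPv4 address so Loki can reach S3 and DynamoDB endpoints
    - to:
        - ipBlock:
            cidr: 0.0.0.0/0
```

Because 0.0.0.0/0 is very broad, it may be worth putting this template behind a values toggle so that it is only rendered when a cloud back-end is in use. That toggle is a suggestion, not something the recommendation above specifies.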
Edited by kevin.wilder