Document and Test Production Recommendations for multi-replica Loki
IronBank is amazing and has been testing Loki while it's been in beta. They have been testing it with recommendations from BigBang and the Grafana documentation for running Loki with more than a single replica and with scalable cloud back ends like S3 and DynamoDB. Here is their current recommendation for a working production setup:
It looks like the Loki package will also require an external-egress NetworkPolicy so that it can reach AWS endpoints and resources.
- Increase resource values to 1Gi MEM and 300m CPU
- Add external egress 0.0.0.0/0 NetworkPolicy template
- Document the below value recommendations in the docs/production.md file and as commented out values in the chart/values.yaml
replicas: 3
extraArgs:
  # https://github.com/grafana/loki/issues/5021
  target: all,table-manager
resources:
  requests:
    cpu: 125m
    memory: 1Gi
  limits:
    cpu: 400m
    memory: 3Gi
config:
  auth_enabled: false
  ingester:
    chunk_target_size: 196608
    lifecycler:
      ring:
        kvstore:
          store: inmemory
        replication_factor: 1
  schema_config:
    configs:
      - from: 2022-01-01 # Anything in the past
        store: aws
        object_store: s3
        schema: v11
        index:
          prefix: loki_
          period: 168h
  table_manager:
    retention_deletes_enabled: true
    retention_period: 8736h
  common:
    storage:
      s3:
        region: "us-gov-west-1"
        sse_encryption: true
  storage_config:
    aws:
      region: "us-gov-west-1"
      sse_encryption: true
      dynamodb:
        dynamodb_url: "dynamodb://us-gov-west-1"
  chunk_store_config:
    chunk_cache_config:
      redis:
        endpoint: "loki-redis-headless.logging.svc.cluster.local:6379"
        db: 0
  compactor:
    working_directory: /data/loki/boltdb-shipper-compactor
    shared_store: s3
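
For the external-egress item in the task list, the policy might look roughly like the sketch below. It is a plain Kubernetes manifest rather than the final chart template. The policy name and the pod label are hypothetical, and the `logging` namespace is only inferred from the Redis endpoint in the values above, so all three would need to be adjusted to match the actual chart.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: loki-external-egress        # hypothetical name
  namespace: logging                # assumed from the Redis endpoint above
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: loki  # assumed pod label, verify against the chart
  policyTypes:
    - Egress
  egress:
    # Allow outbound traffic to any IPv4 address so Loki can reach
    # the AWS S3 and DynamoDB endpoints.
    - to:
        - ipBlock:
            cidr: 0.0.0.0/0
```

Allowing 0.0.0.0/0 is the broadest option and matches the task as written. Whether it can be narrowed to specific AWS ranges is left open here.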
Edited by kevin.wilder