On some of our long-running clusters we are noticing that the storage path configured under the [SERVICE] section in Fluent Bit has been filling up to in excess of 200G, causing nodes to be removed due to lack of storage. We attempted to override storage.total_limit_size to remedy this, but that does not appear to be a value that is passed through for that section.
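For context, the directory that grows is the one pointed at by storage.path in the [SERVICE] section (on our nodes that is /var/log/flb-storage, per the check further down), disk buffering is enabled per-input via storage.type filesystem, and storage.total_limit_size is only valid as a per-[OUTPUT] property. A rough sketch of the relevant pieces, assuming the chart exposes `service` and `inputs` the same way it exposes `outputs`; the exact contents in the package may differ:

```yaml
fluentbit:
  values:
    config:
      service: |
        [SERVICE]
            # On-disk buffer location that was growing past 200G on our nodes
            storage.path /var/log/flb-storage/
            # Note: storage.total_limit_size is NOT valid here; it is a per-[OUTPUT] property
      inputs: |
        [INPUT]
            Name tail
            Path /var/log/containers/*.log
            # Chunks for this input are buffered on disk under storage.path
            storage.type filesystem
```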
@justinguidry11
storage.total_limit_size looks to be the value needed to alleviate this issue. Could you test passing the following through to Fluent Bit and let us know if it works as intended:
That value needs to go in the output configuration, and as a result the entire outputs block needs to be included in the override or it will be replaced with just the single value.
```yaml
fluentbit:
  values:
    config:
      outputs: |
        [OUTPUT]
            Name es
            Match kube.*
            # -- Pointing to Elasticsearch service installed by ECK, based off EK name "logging-ek", update elasticsearch.name above to update.
            Host {{ .Values.elasticsearch.name }}-es-http
            HTTP_User elastic
            HTTP_Passwd ${FLUENT_ELASTICSEARCH_PASSWORD}
            Logstash_Format On
            Retry_Limit False
            Replace_Dots On
            tls On
            tls.verify On
            tls.ca_file /etc/elasticsearch/certs/ca.crt
            storage.total_limit_size 2G
        [OUTPUT]
            Name es
            Match host.*
            # -- Pointing to Elasticsearch service installed by ECK, based off EK name "logging-ek", update elasticsearch.name above to update.
            Host {{ .Values.elasticsearch.name }}-es-http
            HTTP_User elastic
            HTTP_Passwd ${FLUENT_ELASTICSEARCH_PASSWORD}
            Logstash_Format On
            Logstash_Prefix node
            Retry_Limit False
            tls On
            tls.verify On
            tls.ca_file /etc/elasticsearch/certs/ca.crt
            storage.total_limit_size 2G
```
We will also be testing and working towards implementing something similar for the package and any info you can report back will help a lot! Thanks.
The config @ryan.j.garcia gave worked just fine; the only thing I did on our side was lower the size limit to 50M, as shown here:
```yaml
fluentbit:
  values:
    config:
      outputs: |
        [OUTPUT]
            Name es
            Match kube.*
            # -- Pointing to Elasticsearch service installed by ECK, based off EK name "logging-ek", update elasticsearch.name above to update.
            Host {{ .Values.elasticsearch.name }}-es-http
            HTTP_User elastic
            HTTP_Passwd ${FLUENT_ELASTICSEARCH_PASSWORD}
            Logstash_Format On
            Retry_Limit False
            Replace_Dots On
            tls On
            tls.verify On
            tls.ca_file /etc/elasticsearch/certs/ca.crt
            storage.total_limit_size 50M
        [OUTPUT]
            Name es
            Match host.*
            # -- Pointing to Elasticsearch service installed by ECK, based off EK name "logging-ek", update elasticsearch.name above to update.
            Host {{ .Values.elasticsearch.name }}-es-http
            HTTP_User elastic
            HTTP_Passwd ${FLUENT_ELASTICSEARCH_PASSWORD}
            Logstash_Format On
            Logstash_Prefix node
            Retry_Limit False
            tls On
            tls.verify On
            tls.ca_file /etc/elasticsearch/certs/ca.crt
            storage.total_limit_size 50M
```
It did require running `watch "du -sh /var/log/flb-storage"` on nodes that had this storage location to confirm it was working as intended.
Any thoughts here on a sane default value (I'm thinking something like 10G)? We don't want to go too small, because that means log messages may get discarded/lost once the limit is hit. We don't want to go too big either, so that a smaller host node filesystem isn't filled up.
@michaelmartin I think 8-10G is fine. We should also call out in the main README that this value is present to prevent that directory from getting too big, and explain how to update it if needed (with a YAML example).
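Something along these lines could work for the README snippet; this is only a sketch using the proposed 10G default, and as noted above the override has to carry the entire outputs block, so the host.* [OUTPUT] would need the same treatment:

```yaml
fluentbit:
  values:
    config:
      outputs: |
        [OUTPUT]
            Name es
            Match kube.*
            Host {{ .Values.elasticsearch.name }}-es-http
            HTTP_User elastic
            HTTP_Passwd ${FLUENT_ELASTICSEARCH_PASSWORD}
            Logstash_Format On
            Retry_Limit False
            Replace_Dots On
            tls On
            tls.verify On
            tls.ca_file /etc/elasticsearch/certs/ca.crt
            # Caps the on-disk buffer for this output so storage.path cannot fill the node
            storage.total_limit_size 10G
        # ...repeat the host.* [OUTPUT] block from the examples above with the same storage.total_limit_size
```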