This can be reproduced by deploying with the P1 RKE2 Distro Repo using Terraform.
You can see this on both:
a hardened, STIG'd node with SELinux enabled
and a non-STIG'd node with SELinux disabled
No Kubernetes Network Policy Objects exist in the cluster.
If you manually exec into the elasticsearch pod and run curl (so you're running curl from the inner cluster network), you'll get pretty much the same message that shows up in the GUI.
If you exec into kube-proxy and curl its own localhost, you do see the prometheus metrics.
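For reference, both checks can be run with kubectl exec; the namespaces, pod names, and the specific metrics port targeted here are placeholders/assumptions for illustration:

```shell
# From an application pod on the cluster network, the control-plane
# metrics ports on the node are unreachable:
kubectl -n logging exec <elasticsearch-pod> -- \
  curl -sv http://<control-plane-node-ip>:2381/metrics

# From the kube-proxy pod (host network) on that node, localhost works:
kubectl -n kube-system exec <kube-proxy-pod> -- \
  curl -s http://127.0.0.1:10249/metrics | head
```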
The 4 control plane pods run on the host network. <-- this might be key information.
These services running on a control-plane node will, by default, bind only to localhost/127.0.0.1. This is done for security, so that metrics are not exposed to the internet/LAN or other unwanted networks that may have connectivity to the control-plane nodes.
These settings are dependent on the K8s runtime and configuration being used.
Firewall/policy settings on the control-plane nodes control access to these metrics.
There are several closed/open upstream tickets with people having issues scraping metrics due to the default address bindings.
The long response:
The information below is from a vanilla k8s running on bare metal, which shows the same issues as RKE2.
Note: etcd now exposes metrics on port 2381 by default. My deployment is using this override:
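The override itself isn't reproduced here; as a hedged sketch of a kubeadm-style static pod, the etcd metrics listener is typically controlled by the flag below (the address/port values are illustrative):

```yaml
# /etc/kubernetes/manifests/etcd.yaml (kubeadm static pod) -- illustrative only
spec:
  containers:
  - command:
    - etcd
    # expose the plain-HTTP metrics endpoint on port 2381
    - --listen-metrics-urls=http://127.0.0.1:2381
```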
We can verify the IP addresses these services bind to by running this command on a control-plane node:
ss -alnp | grep -E "10257|2381|10249|10259"
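Representative output (trimmed; PIDs and column widths will differ on your nodes):

```
tcp  LISTEN  0  4096  127.0.0.1:10249  0.0.0.0:*  users:(("kube-proxy",...))
tcp  LISTEN  0  4096  127.0.0.1:10257  0.0.0.0:*  users:(("kube-controller",...))
tcp  LISTEN  0  4096  127.0.0.1:10259  0.0.0.0:*  users:(("kube-scheduler",...))
tcp  LISTEN  0  4096  127.0.0.1:2381   0.0.0.0:*  users:(("etcd",...))
```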
We see that all these services bind to 127.0.0.1 only, and are not accessible to external connections.
There are a few ways we can fix this:
1. Change the services to bind to all/specific IP addresses.
2. Add firewall rules or a proxy server to forward connections to the 127.0.0.1-bound ports.
Under a vanilla k8s installation, changing the bind addresses for etcd/kube-controller-manager/kube-scheduler requires editing the static pod manifests under /etc/kubernetes/manifests and changing the bind-address flags.
NOTE: this opens up control-plane node prometheus metrics to the world, and is only meant to demonstrate a solution when the nodes are otherwise secured.
Before:
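The original manifest snippets aren't reproduced in this issue; as a representative sketch of a kubeadm layout, the relevant flags typically look like this before any edits:

```yaml
# /etc/kubernetes/manifests/kube-controller-manager.yaml
- --bind-address=127.0.0.1
# /etc/kubernetes/manifests/kube-scheduler.yaml
- --bind-address=127.0.0.1
# /etc/kubernetes/manifests/etcd.yaml
- --listen-metrics-urls=http://127.0.0.1:2381
```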
After Edits:
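And the same flags after editing, again as a sketch rather than the exact diff used here:

```yaml
# /etc/kubernetes/manifests/kube-controller-manager.yaml
- --bind-address=0.0.0.0
# /etc/kubernetes/manifests/kube-scheduler.yaml
- --bind-address=0.0.0.0
# /etc/kubernetes/manifests/etcd.yaml
- --listen-metrics-urls=http://0.0.0.0:2381
```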
After saving these changes, k8s will restart these static pods, and prometheus running on a node in the cluster will be able to scrape the control-plane nodes. We see that every service except kube-proxy is now listening on all addresses, so external clients will be able to scrape metrics.
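Re-running the ss command from earlier now shows roughly the following, with only kube-proxy still bound to loopback (output is illustrative):

```
tcp  LISTEN  0  4096  127.0.0.1:10249  0.0.0.0:*  users:(("kube-proxy",...))
tcp  LISTEN  0  4096  0.0.0.0:10257    0.0.0.0:*  users:(("kube-controller",...))
tcp  LISTEN  0  4096  0.0.0.0:10259    0.0.0.0:*  users:(("kube-scheduler",...))
tcp  LISTEN  0  4096  0.0.0.0:2381     0.0.0.0:*  users:(("etcd",...))
```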
Kube-proxy is problematic, and I didn't find a way to get it to bind to the host's public ip address. I was able to create a firewall rule to port-forward to the kube-proxy metrics, so scraping was successful.
To fix kube-proxy, I added a firewall rule on the control-plane node that redirects any incoming connection on port 10249 to 127.0.0.1 port 10249:
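The exact rule isn't reproduced in this issue; a hedged equivalent using iptables DNAT looks like the following (DNAT to 127.0.0.1 also requires the route_localnet sysctl on the incoming interface, and eth0 is a placeholder):

```shell
# allow externally-routed packets to be delivered to a 127.0.0.1-bound socket
sysctl -w net.ipv4.conf.eth0.route_localnet=1

# rewrite incoming connections on port 10249 so they reach kube-proxy on loopback
iptables -t nat -A PREROUTING -p tcp --dport 10249 \
  -j DNAT --to-destination 127.0.0.1:10249
```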
RKE2 Notes. RKE2 works the same way as vanilla k8s, with these services binding to 127.0.0.1 by default. I was unable to find clear documentation or a way to change the default bindings for these services with RKE2. For RKE2 deployments, users may need to set up specific firewall rules, a TCP proxy, or a way to pass command-line flags that bind to all addresses or to a specific address; one possible approach is sketched below.
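One untested avenue is RKE2's server configuration, which can pass extra flags to these components through /etc/rancher/rke2/config.yaml; the keys and values below are assumptions based on the RKE2 *-arg server options and have not been verified in this issue:

```yaml
# /etc/rancher/rke2/config.yaml -- untested sketch, flag names assumed
kube-controller-manager-arg:
  - "bind-address=0.0.0.0"
kube-scheduler-arg:
  - "bind-address=0.0.0.0"
kube-proxy-arg:
  - "metrics-bind-address=0.0.0.0"
etcd-arg:
  - "listen-metrics-urls=http://0.0.0.0:2381"
```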
While binding to all addresses works, a cleaner and more secure solution is to set up firewall port-forwarding rules that only allow incoming connections from specific networks to reach the prometheus-exposed metrics.
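For example, a source-restricted rule along these lines (the CIDR is a placeholder for your monitoring network) forwards the kube-proxy metrics port only for that network; traffic from anywhere else never reaches the loopback-bound listener:

```shell
# forward kube-proxy metrics only for connections from the monitoring subnet
iptables -t nat -A PREROUTING -p tcp -s 10.10.0.0/24 --dport 10249 \
  -j DNAT --to-destination 127.0.0.1:10249
```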