UNCLASSIFIED - NO CUI

Incompatible with legacy iptables hosts

Summary

The current implementation does not include iptables-legacy, which is required on hosts that use the legacy iptables backend instead of nft.
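As a quick way to tell which backend a host is on (a hypothetical helper, not part of this issue), the version string is enough: nft-backed builds of iptables print "(nf_tables)", legacy builds print "(legacy)", and older releases such as RHEL 7's iptables 1.4.x print no backend tag at all.

```shell
# Hypothetical helper: classify an `iptables --version` string by backend.
backend_of() {
  case "$1" in
    *nf_tables*) echo nft ;;
    *legacy*)    echo legacy ;;
    *)           echo legacy ;;   # pre-1.8 builds print no tag and are legacy-only
  esac
}

# On a real host you would run:
#   backend_of "$(iptables --version)"
```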

Steps to reproduce

Create a RHEL 7 Kubernetes node using kubeadm. Apply a known-working CNI (e.g., open source Calico). Note that Calico and CoreDNS never achieve Ready status.

What is the current bug behavior?

Calico and CoreDNS do not achieve a Ready status.

What is the expected correct behavior?

Calico and CoreDNS pods reach Ready status, and pod networking works on hosts using the legacy iptables backend.

Relevant logs and/or screenshots

kube-proxy logs:

W0126 03:20:04.242266       1 server_others.go:559] Unknown proxy mode "", assuming iptables proxy
I0126 03:20:04.256974       1 node.go:136] Successfully retrieved node IP: xx.xx.xx.xx
I0126 03:20:04.257017       1 server_others.go:186] Using iptables Proxier.
I0126 03:20:04.259406       1 server.go:583] Version: v1.18.14-rc.0
I0126 03:20:04.260621       1 conntrack.go:100] Set sysctl 'net/netfilter/nf_conntrack_max' to 131072
I0126 03:20:04.260674       1 conntrack.go:52] Setting nf_conntrack_max to 131072
I0126 03:20:04.260809       1 conntrack.go:100] Set sysctl 'net/netfilter/nf_conntrack_tcp_timeout_established' to 86400
I0126 03:20:04.260889       1 conntrack.go:100] Set sysctl 'net/netfilter/nf_conntrack_tcp_timeout_close_wait' to 3600
I0126 03:20:04.262116       1 config.go:315] Starting service config controller
I0126 03:20:04.262146       1 shared_informer.go:223] Waiting for caches to sync for service config
I0126 03:20:04.262191       1 config.go:133] Starting endpoints config controller
I0126 03:20:04.262202       1 shared_informer.go:223] Waiting for caches to sync for endpoints config
I0126 03:20:04.362341       1 shared_informer.go:230] Caches are synced for endpoints config
I0126 03:20:04.362342       1 shared_informer.go:230] Caches are synced for service config

calico-node logs:

2021-01-26 04:23:01.728 [INFO][10] startup/startup.go 454: Hit error connecting to datastore - retry error=Get "https://10.96.0.1:443/api/v1/nodes/foo": dial tcp 10.96.0.1:443: i/o timeout

The other pods, calico-kube-controller and coredns, never start; their pod events show:

Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "c91f6c4edc326e56c4a2fee20746277277099d0d6cb26da2976316aee5649a0a": stat /var/lib/calico/nodename: no such file or directory: check that the calico/node container is running and has mounted /var/lib/calico/

I went to a node and checked journalctl -u kubelet. In addition to the sandbox errors above, I see:

Jan 26 03:24:30 ip-xx-xx-xx-xx.us-gov-west-1.compute.internal kubelet[2977]: E0126 03:24:30.198382    2977 driver-call.go:266] Failed to unmarshal output for command: init, output: "", error: unexpected end of JSON input
Jan 26 03:24:30 ip-xx-xx-xx-xx.us-gov-west-1.compute.internal kubelet[2977]: W0126 03:24:30.198398    2977 driver-call.go:149] FlexVolume: driver call failed: executable: /usr/libexec/kubernetes/kubelet-plugins/volume/exec/nodeagent~u

Possible fixes

Calico node had a similar issue and also relies on iptables for functionality. It was resolved by including /usr/sbin/xtables-legacy-multi from the open source calico/node image, recreating the symbolic links for the iptables[6]-legacy* applets, and registering iptables[6] in /etc/alternatives.

dsop/opensource/calico/node#6 (closed)
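The fix described above could be sketched as follows. This is a hedged sketch, not the exact change from the linked issue: it assumes /usr/sbin/xtables-legacy-multi has already been copied in from the open source calico/node image, and SBIN defaults to a scratch directory here so the sketch can run anywhere (a real image would set SBIN=/usr/sbin).

```shell
# Sketch of restoring the legacy iptables applets inside the container.
# SBIN defaults to a scratch directory for illustration; use /usr/sbin in the image.
SBIN="${SBIN:-$(mktemp -d)}"

# xtables-legacy-multi is the multi-call binary from the open source calico/node
# image; create an empty placeholder when it is absent so the sketch still runs.
[ -e "$SBIN/xtables-legacy-multi" ] || : > "$SBIN/xtables-legacy-multi"

# The legacy applets are all symlinks to the multi-call binary.
for cmd in iptables iptables-save iptables-restore \
           ip6tables ip6tables-save ip6tables-restore; do
  ln -sf xtables-legacy-multi "$SBIN/${cmd}-legacy"
done

# In the image you would then register the legacy applets with /etc/alternatives,
# e.g. (run as root in the container, not part of this runnable sketch):
#   update-alternatives --install /usr/sbin/iptables  iptables  /usr/sbin/iptables-legacy  10
#   update-alternatives --install /usr/sbin/ip6tables ip6tables /usr/sbin/ip6tables-legacy 10
```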

Definition of Done

  • Bug has been identified and corrected within the container

/cc @ironbank-notifications/bug

Edited by Vickie Shen