`ptp` is used to create veth links, `host-local` to allocate IPs and `portmap` to configure port mappings. The configuration file gets generated by each of the `kindnetd` daemons on startup.
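To see the end result, you can read the generated file back from any of the Nodes. The path and file name below follow kind's defaults and are an assumption, so list the directory first if they differ:

```
# File name is an assumption based on kind's defaults; check the
# directory listing if it doesn't match
$ docker exec -it k8s-guide-worker2 ls /etc/cni/net.d/
$ docker exec -it k8s-guide-worker2 cat /etc/cni/net.d/10-kindnet.conflist
```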
The diagram below shows what a fully converged routing table looks like:
This plugin is built into the Lab cluster by default, so the only thing required is to bring up the Lab environment:

```
make setup && make up
```
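Once the cluster is up, it's worth confirming that a `kindnetd` Pod is running on every Node; the label selector below is an assumption based on kind's standard DaemonSet manifest:

```
# Expect one kindnet Pod per Node (label selector is an assumption)
$ kubectl -n kube-system get pods -l app=kindnet -o wide
```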
Here’s how to validate and verify the above diagram in the Lab environment, using the second Node as an example:
The Pod IP should have a /24 subnet mask (same as the `PodCIDR`) and the default route pointing to the first IP of that subnet.
```
$ NODE=k8s-guide-worker2 make tshoot
bash-5.0# ip -br -4 add show eth0
eth0@if5    UP    10.244.2.8/24
bash-5.1# ip route
default via 10.244.2.1 dev eth0
10.244.2.0/24 via 10.244.2.1 dev eth0 src 10.244.2.8
10.244.2.1 dev eth0 scope link src 10.244.2.8
```
Note how the Pod routing is set up so that all traffic, including intra-subnet Pod-to-Pod communication, is sent to the same next hop. This allows all Pods to be interconnected at L3 without relying on a bridge or ARP for neighbor discovery.
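One quick way to confirm this from inside the troubleshooting Pod is to check its neighbor table; given the routing setup above, the gateway should be the only neighbor that ever appears:

```
# Run inside the troubleshooting Pod; only the 10.244.2.1 gateway
# is expected to show up as a neighbor
bash-5.0# ip neigh show dev eth0
```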
The Node's routing table should contain one /32 host route per local Pod and one /24 route per peer Node.
```
$ docker exec -it k8s-guide-worker2 bash
root@k8s-guide-worker2:/# ip route
default via 172.18.0.1 dev eth0
10.244.0.0/24 via 172.18.0.10 dev eth0
10.244.1.0/24 via 172.18.0.11 dev eth0
10.244.2.2 dev vethf821f7f9 scope host
10.244.2.3 dev veth87514986 scope host
10.244.2.4 dev veth9829983c scope host
10.244.2.5 dev veth010c83ae scope host
10.244.2.8 dev vetha1079faf scope host
```
One notable thing is that the root namespace side of all veth links has the same IP address:
```
root@k8s-guide-worker2:/# ip -br -4 addr show | grep veth
vethf821f7f9@if3    UP    10.244.2.1/32
veth87514986@if3    UP    10.244.2.1/32
veth9829983c@if3    UP    10.244.2.1/32
veth010c83ae@if3    UP    10.244.2.1/32
vetha1079faf@if3    UP    10.244.2.1/32
```
They each act as the default gateway for their peer Pods and don’t have to be attached to a bridge.
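The absence of a bridge can be double-checked from the Node's root namespace; assuming the `ptp`-based setup described above, both commands should come back empty:

```
# No bridge devices and no enslaved ports are expected on the Node
root@k8s-guide-worker2:/# ip link show type bridge
root@k8s-guide-worker2:/# bridge link show
```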
Let’s track what happens when Pod-1 tries to talk to Pod-3.
We’ll assume that the ARP and MAC tables are converged and fully populated.
Pod-1 wants to send a packet to Pod-3's IP address 10.244.0.5. Its network stack looks up the routing table to find the next-hop IP:
```
$ kubectl exec -it net-tshoot-wxgcw -- ip route get 10.244.0.5
10.244.0.5 via 10.244.1.1 dev eth0 src 10.244.1.3 uid 0
```
The packet gets sent over the veth link and, once in the Node's root namespace, gets re-routed towards the control-plane Node:

```
$ docker exec -it k8s-guide-worker ip route get 10.244.0.5
10.244.0.5 via 172.18.0.10 dev eth0 src 172.18.0.11 uid 0
```
From there, the packet traverses the `kind` bridge and enters the control-plane Node's root network namespace:
```
$ docker exec -it k8s-guide-control-plane ip route get 10.244.0.5
10.244.0.5 dev veth9f517bf3 src 10.244.0.1 uid 0
```
The final routing lookup happens inside Pod-3's network namespace, which recognises the destination IP as local:

```
$ kubectl exec -it net-tshoot-x6wv9 -- ip route get 10.244.0.5
local 10.244.0.5 dev lo src 10.244.0.5 uid 0
```
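To see all of these hops in a single view, you can trace the path from Pod-1 (assuming the troubleshooting image ships a traceroute binary); the intermediate hops should line up with the route lookups above:

```
# Hops should correspond to the worker's veth gateway (10.244.1.1)
# and the control-plane Node before reaching Pod-3
$ kubectl exec -it net-tshoot-wxgcw -- traceroute 10.244.0.5
```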
In addition to the main CNI functionality, `kindnet` also sets up a number of IP masquerade (Source NAT) rules. These rules allow Pods to access the same networks as the hosting Node (e.g. the Internet). The new `KIND-MASQ-AGENT` chain is inserted into the NAT's `POSTROUTING` chain and includes a special `RETURN` rule to exclude all traffic in the cluster-cidr range (10.244.0.0/16):
```
root@k8s-guide-worker2:/# iptables -t nat -nvL | grep -B 4 -A 4 KIND-MASQ
Chain POSTROUTING (policy ACCEPT 3073 packets, 233K bytes)
 pkts bytes target     prot opt in     out     source               destination
61703 4686K KUBE-POSTROUTING  all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* kubernetes postrouting rules */
    0     0 DOCKER_POSTROUTING  all  --  *      *       0.0.0.0/0            172.18.0.1
54462 4060K KIND-MASQ-AGENT  all  --  *      *       0.0.0.0/0            0.0.0.0/0            ADDRTYPE match dst-type !LOCAL /* kind-masq-agent: ensure nat POSTROUTING directs all non-LOCAL destination traffic to our custom KIND-MASQ-AGENT chain */

Chain KIND-MASQ-AGENT (1 references)
 pkts bytes target     prot opt in     out     source               destination
46558 3587K RETURN     all  --  *      *       0.0.0.0/0            10.244.0.0/16        /* kind-masq-agent: local traffic is not subject to MASQUERADE */
 7904  473K MASQUERADE  all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* kind-masq-agent: outbound traffic is subject to MASQUERADE (must be last in chain) */
```
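A simple way to see these rules in action is to generate some Internet-bound traffic from a Pod on the same Node and watch the `MASQUERADE` counters move (8.8.8.8 is just an arbitrary external address):

```
# From the troubleshooting Pod on worker2 (started earlier with make tshoot)
bash-5.0# ping -c 3 8.8.8.8

# Back on the Node: the MASQUERADE packet counter should have incremented
root@k8s-guide-worker2:/# iptables -t nat -nvL KIND-MASQ-AGENT
```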