While debugging a really weird issue, I noticed that you really, really should use bridge_hw, or a pre-up/up/post-up rule, to assign a permanent MAC address to your bridge.
Assume two hosts, node1 and node2. From node1 I ping an IPv6 address that is configured on a bridge on node2. node2 has the following configuration:
auto vmbr0
iface vmbr0 inet static
    address x.x.x.4
    netmask 255.255.255.224
    broadcast x.x.x.31
    gateway x.x.x.1
    bridge_ports eth1
    bridge_maxwait 0
    bridge_fd 2
    bridge_ageing 0
    bridge_stp off

iface vmbr0 inet6 static
    address yyyy:yyyy:yy::3
    netmask 64
    gateway yyyy:yyyy:yy::1
There are ~30 containers connected to this bridge. While starting and restarting containers, the bridge at some point took over the MAC of one of them.
If I ping the IPv6 address ::3 from node1, I can see the ICMP echo requests just fine in tcpdump. However, they are never answered. I found a Stack Overflow post in which someone wrote "check the MAC" without any helpful "why" or "what for", so at first glance I couldn't do anything with that information. Later I noticed that the MAC the requests were sent to matched the MAC of a container. Why?
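It seems black magic happens here. You might expect a Linux bridge to simply use the MAC of the first connected port, which is usually eth0/eth1 or whatever physical device you attach to it. But there's more: unless a MAC has been set explicitly, a Linux bridge uses the lowest MAC among its enslaved interfaces, and it re-evaluates that choice whenever a port is added or removed. So a container interface with a lower MAC than eth1 wins.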
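To make the "lowest MAC" rule a bit more concrete, here is a small sketch of the selection. The helper name lowest_mac and the sample addresses are made up for illustration; the function only mimics the comparison and does not ask the kernel anything.

```shell
#!/bin/sh
# lowest_mac: read one MAC per line on stdin and print the lowest one.
# Hypothetical helper for illustration; it only mimics the selection.
lowest_mac() {
    # Lowercase first so the hex digits a-f compare consistently. With
    # fixed-width, colon-separated octets a bytewise sort matches the
    # numeric order of the addresses.
    tr 'A-F' 'a-f' | LC_ALL=C sort | head -n 1
}

# Made-up example: a physical NIC plus two container interfaces.
printf '%s\n' \
    '52:54:00:aa:bb:cc' \
    '02:16:3e:11:22:33' \
    '52:54:00:00:00:01' | lowest_mac

# On a real host the input could come from the bridge ports in sysfs
# (assuming the bridge is called vmbr0):
#   for p in /sys/class/net/vmbr0/brif/*; do
#       cat "/sys/class/net/${p##*/}/address"
#   done | lowest_mac
```

In the made-up example the address starting with 02: is printed, even though it belongs to a container interface and not to the physical NIC. That is the situation described above.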
That also explains another problem I am having. If I stop my container with ID 14, my whole system becomes unreachable for anywhere from a few seconds up to minutes. Sure enough, if the bridge had the same MAC as the container and the container is stopped (so that MAC is gone), the bridge has to switch to a different MAC and everyone else on the network has to relearn it.
The fix is to assign the MAC explicitly. On Debian you can do so with bridge_hw. First find the MAC of the primary device you're enslaving (most likely eth0; here it is eth1):
    ip link show eth1 | awk '/ether/{print $2}'
    xx:xx:xx:xx:xx:xx
then in your /etc/network/interfaces add
bridge_hw xx:xx:xx:xx:xx:xx
alongside the other settings you're using for your bridge, such as bridge_stp, bridge_fd, etc. Reboot (or restart networking) and you're fine. IPv6 pings from node1 now work as expected.
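If you want to double-check that the bridge really carries the MAC of the physical port after the restart, a minimal sketch like the following could do it. The helper names mac_of and same_mac and the SYS_NET variable are my own inventions for this example; the interface names vmbr0 and eth1 are the ones from the configuration above.

```shell
#!/bin/sh
# Where to look up interfaces; overridable so the helpers can be tried
# against a fake directory tree. SYS_NET is a made-up variable name.
SYS_NET="${SYS_NET:-/sys/class/net}"

# mac_of IFACE: print the MAC of an interface, fail if it does not exist.
mac_of() {
    cat "$SYS_NET/$1/address" 2>/dev/null
}

# same_mac A B: succeed only if both interfaces exist and share one MAC.
same_mac() {
    a=$(mac_of "$1") || return 2
    b=$(mac_of "$2") || return 2
    [ "$a" = "$b" ]
}

if same_mac vmbr0 eth1; then
    echo "vmbr0 uses the MAC of eth1"
else
    echo "vmbr0 and eth1 differ (or one of them does not exist)"
fi
```

Running this again after starting and stopping a few containers should keep printing the first message if the MAC is pinned.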