Remix.run Logo
bc569a80a344f9c 2 days ago

It says network routing issue.

Network routes consist of a network (a range of IPs) and a next hop to send traffic for that range to.

These can overlap. Sometimes that’s desirable, sometimes it is not. When routers have two routes that are exactly the same they often load balance (in some fairly dumb, stateless fashion) between possible next hops, when one of the routes is more specific, it wins.

Routes get injected by routers saying “I am responsible for this range” and setting themselves as the next hop, others routers that connect to them receive this advertisement and propagate it to their own router peers further downstream.

An example would be advertising 192.168.0.0/23, which is the range of 192.168.0.0-192.168.1.255.

Let’s say that’s your inference backend in some rows in a data center.

Then, through some misconfiguration, some other router starts announcing 192.168.1.0/24 (192.168.1.0-192.168.1.255). This is more specific, that traffic gets sent there, and half of the original inference pod is now unreachable.

disqard 2 days ago | parent [-]

Thank you for that explanation!