Remix.run Logo
wouldbecouldbe a day ago

Trying to understand what this means.

Did the bad route cause an overload? Was there a code error on that route that wasn’t spotted? Was it a code issue or an instance that broke?

bc569a80a344f9c a day ago | parent | next [-]

It says network routing issue.

Network routes consist of a network (a range of IPs) and a next hop to send traffic for that range to.

These can overlap. Sometimes that’s desirable, sometimes it is not. When routers have two routes that are exactly the same they often load balance (in some fairly dumb, stateless fashion) between possible next hops, when one of the routes is more specific, it wins.

Routes get injected by routers saying “I am responsible for this range” and setting themselves as the next hop, others routers that connect to them receive this advertisement and propagate it to their own router peers further downstream.

An example would be advertising 192.168.0.0/23, which is the range of 192.168.0.0-192.168.1.255.

Let’s say that’s your inference backend in some rows in a data center.

Then, through some misconfiguration, some other router starts announcing 192.168.1.0/24 (192.168.1.0-192.168.1.255). This is more specific, that traffic gets sent there, and half of the original inference pod is now unreachable.

disqard a day ago | parent [-]

Thank you for that explanation!

mattdeboard a day ago | parent | prev [-]

it means their servers were unreachable due to network misconfig.