arter45 8 hours ago

I've had to read the RCA a couple of times to (probably) get what happened, even though I'm reasonably familiar with BGP.

Basically, my understanding (simplified) is:

- they originally had a Miami router advertise Bogota prefixes (=subnets) to Cloudflare's peers. Essentially, Miami was handling Bogota's subnets. This is not an issue.

- because you don't normally advertise arbitrary prefixes via BGP, policies were used. These policies are essentially if/then statements, carrying out certain actions (advertise or not, add some tags or remove them,...) if some conditions are matched. This is completely normal.

- Juniper router configuration for this kind of policy is (simplifying):

set <BGP POLICY NAME> from <CONDITION1>

set <BGP POLICY NAME> from <CONDITION2>

set <BGP POLICY NAME> then <ACTION1>

set <BGP POLICY NAME> then <ACTION2>

...

- prior to the incident, CF changed its network so that Miami didn't have to handle Bogota subnets (maybe Bogota does it on its own, maybe there's another router somewhere else)

- the change aimed at removing the configurations on Miami which were advertising Bogota subnets

- the change implementation essentially removed all lines from all policies containing "from IP in the list of Bogota prefixes". This is somewhat reasonable, because you could have the same policy handling both Bogota and, say, Quito prefixes, so you just want to remove the Bogota part.

HOWEVER, there was at least one policy like this:

(Before)

set <BGP POLICY NAME> from is_internal(prefix) == True

set <BGP POLICY NAME> from prefix in bogota_prefix_list

set <BGP POLICY NAME> then advertise

(After)

set <BGP POLICY NAME> from is_internal(prefix) == True

set <BGP POLICY NAME> then advertise

Which basically means: if you have an internal prefix, advertise it.

- an "internal prefix" is any prefix that was not received from another BGP entity (autonomous system)

- BGP routers in Cloudflare exchange routes with one another. This is again pretty normal.

- As a result of this change, all routes received by Miami through some other Cloudflare router were readvertised by Miami

- the result is CF telling the Internet (more accurately, its peers) "hey, you know that subnet? Go ask my Miami router!"

- obviously, this increases bandwidth utilization and latency for traffic crossing the Miami router.
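
To make the before/after concrete, here is a rough Python sketch of how I read the policy. It assumes, as in the simplified config above, that all "from" conditions of a policy must match together (logical AND). The Route class, the `internal` flag and the prefixes are made up for illustration (the prefixes are documentation ranges, not real Bogota or Cloudflare prefixes), so treat it as a toy model, not as what actually runs on the router.

```python
from dataclasses import dataclass
from ipaddress import ip_network

# Hypothetical prefix list: 192.0.2.0/24 is a documentation range,
# standing in for "the list of Bogota prefixes".
BOGOTA_PREFIXES = [ip_network("192.0.2.0/24")]


@dataclass(frozen=True)
class Route:
    prefix: str
    internal: bool  # assumption: True if learned from a router in the same AS


def is_internal(route: Route) -> bool:
    return route.internal


def in_bogota_list(route: Route) -> bool:
    net = ip_network(route.prefix)
    return any(net.subnet_of(p) for p in BOGOTA_PREFIXES)


def matches(conditions, route: Route) -> bool:
    # Every "from" condition must hold; fewer conditions means a broader match.
    return all(cond(route) for cond in conditions)


def advertised(conditions, routes):
    # Prefixes for which the policy's "then advertise" action would fire.
    return [r.prefix for r in routes if matches(conditions, r)]


BEFORE = [is_internal, in_bogota_list]  # policy before the change
AFTER = [is_internal]                   # Bogota condition removed

ROUTES = [
    Route("192.0.2.0/24", internal=True),     # the Bogota prefix
    Route("198.51.100.0/24", internal=True),  # learned via another router in the AS
    Route("203.0.113.0/24", internal=False),  # learned directly from an external neighbor
]

if __name__ == "__main__":
    print("before:", advertised(BEFORE, ROUTES))
    print("after: ", advertised(AFTER, ROUTES))
```

Removing one condition doesn't disable the policy, it widens it: the second route now matches too, which is the leak in miniature.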

erredois 7 hours ago

I am not very familiar with Juniper config, but this phrase summarizes it well: "This means we (AS13335) took the prefix received from Meta (AS32934), our peer, and then advertised it toward Lumen (AS3356), one of our upstream transit providers." Basically, you should not receive a prefix over an eBGP session (different AS) and advertise it over another eBGP session. As they mention in the next steps, good use of communities could help avoid this in case of other misconfigurations.
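
A minimal sketch of the communities idea, in Python rather than router config. The assumption is that every route gets tagged on import with a community saying where it was learned, and that the export policy toward peers and transit only lets through routes tagged as originated or customer-learned. The community values use a documentation ASN (64500) and are hypothetical, as are all the names here; this is not Cloudflare's actual scheme.

```python
from dataclasses import dataclass, field

# Hypothetical community values (64500 is an ASN reserved for documentation).
ORIGINATED = "64500:100"
LEARNED_FROM_CUSTOMER = "64500:200"
LEARNED_FROM_PEER = "64500:300"
LEARNED_FROM_TRANSIT = "64500:400"

EXPORTABLE_UPSTREAM = frozenset({ORIGINATED, LEARNED_FROM_CUSTOMER})
NEVER_UPSTREAM = frozenset({LEARNED_FROM_PEER, LEARNED_FROM_TRANSIT})


@dataclass
class Route:
    prefix: str
    communities: set = field(default_factory=set)


def tag_on_import(route: Route, source: str) -> Route:
    # Import policy: record where the route came from.
    return Route(route.prefix, route.communities | {source})


def may_export_upstream(route: Route) -> bool:
    # Export policy toward peers and transit providers.
    if route.communities & NEVER_UPSTREAM:
        return False  # peer or transit routes never go back out
    # Default deny: an untagged route is not exported either.
    return bool(route.communities & EXPORTABLE_UPSTREAM)


if __name__ == "__main__":
    own = tag_on_import(Route("192.0.2.0/24"), ORIGINATED)
    from_peer = tag_on_import(Route("198.51.100.0/24"), LEARNED_FROM_PEER)
    for r in (own, from_peer):
        print(r.prefix, may_export_upstream(r))
```

The point of the tag is that it travels with the route inside the AS, so even if another match condition is dropped by mistake, the export check still has something to reject on.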