Aeolun (3 hours ago):
> I think it's quite rare for any company to have exactly the same scale and size of storage in stage as in prod.

We're like a millionth the size of Cloudflare, and we have automated tests for all (sort of) queries to see what would happen with 20x more data. Mostly to catch performance regressions, but it would catch issues like these too. I guess that doesn't say anything about how rare it is, because this is also the first company at which I've had the time to go to such lengths.
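A minimal sketch of what such a 20x-data test might look like, assuming pytest and an in-memory SQLite table; the schema, the seed_rows/timed_query helpers, the SCALE factor, and the tolerance are all stand-ins for whatever the commenter's setup actually uses:

    import sqlite3
    import time

    import pytest

    SCALE = 20  # multiply the baseline data volume by this factor

    def seed_rows(conn, n):
        """Insert n synthetic rows into the table under test."""
        conn.executemany(
            "INSERT INTO features (name, value) VALUES (?, ?)",
            ((f"feature_{i}", float(i)) for i in range(n)),
        )

    def timed_query(conn):
        """Run the query under test and return its wall-clock time."""
        start = time.perf_counter()
        conn.execute("SELECT name, value FROM features ORDER BY value").fetchall()
        return time.perf_counter() - start

    @pytest.mark.parametrize("baseline", [1_000])
    def test_query_scales(baseline):
        conn = sqlite3.connect(":memory:")
        conn.execute("CREATE TABLE features (name TEXT, value REAL)")

        seed_rows(conn, baseline)
        t1 = timed_query(conn)

        seed_rows(conn, baseline * (SCALE - 1))  # total is now SCALE * baseline rows
        t2 = timed_query(conn)

        # Flag superlinear blow-ups: 20x the data should not cost vastly
        # more than 20x the time. The floor absorbs timer noise on tiny runs.
        assert t2 < max(t1 * SCALE * 5, 0.5)

The generous tolerance is deliberate: wall-clock assertions are noisy, and the point of a test like this is to catch superlinear blow-ups, not micro-regressions.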
mewpmewp2 (3 hours ago), in reply:
But now consider how much extra data Cloudflare, at its size, would need just for staging; mirroring production exactly would double their costs or more. They would also have to constantly simulate a similar volume of requests on top of their real traffic, since presumably they run hundreds or thousands of deployments per day. In this case the database table in question (the ML feature table) seems to have been modest in size, so naively they could at least have kept staging features in sync with prod. But perhaps they didn't consider that 55 rows vs. 60 rows, or something similar, could be a breaking point given one specific bug. It is much easier to test with 20x data when you don't handle the amount of data Cloudflare probably does.
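A hedged sketch of the kind of boundary test that could catch that failure mode, assuming a hypothetical load_features consumer with a hard-coded MAX_FEATURES limit; every name and number here is illustrative, not Cloudflare's actual code:

    import pytest

    MAX_FEATURES = 60  # hypothetical hard limit baked into the consumer

    def load_features(rows):
        """Toy stand-in for the config consumer: it rejects oversized
        input explicitly instead of failing in an unrelated code path."""
        if len(rows) > MAX_FEATURES:
            raise ValueError(f"feature count {len(rows)} exceeds {MAX_FEATURES}")
        return {name: value for name, value in rows}

    @pytest.mark.parametrize("count", [55, 59, 60, 61, 120])
    def test_feature_count_boundary(count):
        rows = [(f"f{i}", float(i)) for i in range(count)]
        if count > MAX_FEATURES:
            # Crossing the limit should produce a loud, specific error,
            # not a crash deep inside the serving path.
            with pytest.raises(ValueError):
                load_features(rows)
        else:
            assert len(load_features(rows)) == count

Parametrizing over counts that straddle the limit is the key move: a staging copy stuck at 55 rows would never exercise the 61-row case, while this test does so regardless of how much data staging holds.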