46Bit 3 hours ago
What we're doing at Cloudflare (including some of what the author works on) samples adaptively. Each log batch is bucketed based on a few fields, and if a bucket contains a lot of logs we only keep the sqrt or log of the number of input logs in that bucket. It works really well... but part of why it works well is that we always have blistering rates of logs, so the sampling system can cope with spikes in event rates without itself getting overwhelmed.
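
To make the bucketing idea concrete, here is a rough sketch of per-batch adaptive sampling. It is not Cloudflare's actual code: the function name `adaptive_sample`, the `threshold` cutoff for what counts as "lots of logs", the `sample_weight` field, and the use of uniform random selection within a bucket are all assumptions for illustration.

```python
import math
import random
from collections import defaultdict


def adaptive_sample(batch, key_fields, mode="sqrt", threshold=10, rng=None):
    """Keep roughly sqrt(n) or log(n) logs from each large bucket.

    All names and defaults here are hypothetical, not a real API.
    """
    rng = rng or random.Random()

    # Group the batch by the values of the chosen fields.
    buckets = defaultdict(list)
    for log in batch:
        buckets[tuple(log.get(f) for f in key_fields)].append(log)

    sampled = []
    for logs in buckets.values():
        n = len(logs)
        if n <= threshold:
            keep = n  # small buckets are kept whole
        else:
            target = math.sqrt(n) if mode == "sqrt" else math.log(n)
            # Never keep fewer than the threshold (or fewer than one log).
            keep = max(threshold, 1, math.ceil(target))
        # Each kept log stands in for n / keep original logs.
        weight = n / keep
        for log in rng.sample(logs, keep):
            sampled.append({**log, "sample_weight": weight})
    return sampled


if __name__ == "__main__":
    batch = [{"status": 200}] * 10_000 + [{"status": 500}] * 4
    out = adaptive_sample(batch, ["status"])
    print(len(out), "logs kept out of", len(batch))
```

Rare buckets (the four 500s above) survive untouched, while the noisy bucket shrinks from 10,000 to 100 logs. Carrying a weight on each kept log is one way to still estimate the original counts downstream.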