tuetuopay 2 hours ago
Pretty much, which was incredible for a half-day rewrite, learning eBPF in Rust included. The effort-to-result ratio is simply incredible. A few cleanups and optimizations later, I was pretty much convinced I would not need to touch DPDK again (and so was the company). Following these experiments, I wrote some actual production-grade eBPF routers at this company that are now in production; they are much more complex, but still able to reach 200Mpps on a $500 CPU (EPYC 9015).

As for why Cloudflare uses eBPF where the GFW uses DPDK, I can see a few reasons:

- DPDK was the only game in town when the GFW started, while eBPF was the hot new thing for Cloudflare's recent endeavors. The GFW did not have any choice.
- Cloudflare has a performance focus, but still has a bit of "hardware is cheap, engineers are expensive", making eBPF more than fine.
- The GFW runs on dedicated machines on the traffic path, while I would expect most of Cloudflare's eBPF endeavors to run directly on mixed-workload machines. One of their first blog posts about it (dropping x Mpps) specifically calls out that the goal was to protect an end machine directly on said machine, by preventing bad packets from reaching the kernel stack.
- Most of the operational advantages I already mentioned. The GFW is fine with "drop traffic if DPDK is down", but Cloudflare is absolutely not, making the operational simplicity a big win.

I bet Cloudflare does have quite a hefty DPDK application used for the traffic scrubbing part of their anti-DDoS; but they don't publicize it because it's not as shiny as eBPF.

There are also other advantages to eBPF that make it better suited to a multi-product company like Cloudflare, and that don't weigh as much in a mono-product org like the GFW. Take for example the much easier testing, a dev env on any laptop, ... Or that eBPF probes can be written in Rust, getting the same featureful language to run in the kernel and in userspace (the classic combo is Go in userspace, C in kernelspace).
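To make the "drop bad packets before they reach the kernel stack" idea a bit more concrete, here is a rough sketch of the kind of per-packet decision such a filter makes. It is plain Rust over a byte slice, not a loadable eBPF program and not Cloudflare's or my production code. The `Verdict` enum, the `filter` function and the "blocked UDP port" policy are all made up for illustration; the two verdicts only mirror the XDP actions `XDP_DROP` and `XDP_PASS`.

```rust
/// Hypothetical verdict type, named after the XDP actions XDP_DROP / XDP_PASS.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Verdict {
    Drop,
    Pass,
}

const ETH_HDR_LEN: usize = 14;
const ETHERTYPE_IPV4: u16 = 0x0800;
const IPV4_MIN_HDR_LEN: usize = 20;
const IPPROTO_UDP: u8 = 17;

/// Illustrative policy (an assumption, not from the thread): drop IPv4/UDP
/// packets whose destination port is `blocked_port`, pass everything else.
pub fn filter(frame: &[u8], blocked_port: u16) -> Verdict {
    // Check bounds before every read, as an eBPF program has to.
    if frame.len() < ETH_HDR_LEN + IPV4_MIN_HDR_LEN {
        return Verdict::Pass;
    }

    // EtherType sits in bytes 12..14 of an untagged Ethernet header.
    let ethertype = u16::from_be_bytes([frame[12], frame[13]]);
    if ethertype != ETHERTYPE_IPV4 {
        return Verdict::Pass;
    }

    let ip = &frame[ETH_HDR_LEN..];
    let version = ip[0] >> 4;
    let ihl = usize::from(ip[0] & 0x0f) * 4; // IPv4 header length in bytes
    if version != 4 || ihl < IPV4_MIN_HDR_LEN {
        return Verdict::Pass;
    }

    // Byte 9 of the IPv4 header is the protocol field.
    if ip[9] != IPPROTO_UDP {
        return Verdict::Pass;
    }

    // Non-first fragments carry no UDP header, so leave them alone.
    let frag_offset = u16::from_be_bytes([ip[6], ip[7]]) & 0x1fff;
    if frag_offset != 0 {
        return Verdict::Pass;
    }

    // UDP header: source port (2 bytes), then destination port (2 bytes).
    let udp = match ip.get(ihl..ihl + 4) {
        Some(bytes) => bytes,
        None => return Verdict::Pass,
    };
    let dst_port = u16::from_be_bytes([udp[2], udp[3]]);

    if dst_port == blocked_port {
        Verdict::Drop
    } else {
        Verdict::Pass
    }
}
```

In a real XDP program the same logic would run against the packet buffer handed over by the driver, and a drop verdict means the kernel stack never sees the packet. Keeping the decision as a pure function over bytes also illustrates the testing point: it can be unit tested on any laptop without a NIC.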