| ▲ | toast0 6 hours ago | |||||||
The big things to avoid are crossing the user/kernel divide and communicating across cores. Staying in the kernel is approximately the same as bypassing the kernel (caveats apply); for a packet filtering / smoothing use case, I don't think kernel bypass is needed. You probably want to tune NIC hashing so that inbound traffic for a given shaping queue arrives in the same NIC rx queue, but you'd probably want that in a kernel bypass case as well. Userspace is certainly nicer during development, as it's easier to push changes, but in 2026 it feels like traffic shaping has pretty static requirements, and letting the kernel do all the work feels reasonable to me. OTOH, OpenBSD is pretty far behind the curve on SMP and all that (I think their PF now has support for SMP, but maybe it's still in development? I'd bet there's lots of room to reduce cross-core communication as well, but I haven't examined it). You can't pin userspace threads to CPUs, I doubt their kernel data structures are built to reduce cross-core communication, etc. Kernel bypass, if it's available at all (it might not be), won't help as much as you would hope, because you can't control userspace well enough to limit cross-core communication. | ||||||||
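To make the NIC hashing point concrete, here's a rough sketch of the idea, not how any real NIC does it. Real hardware typically uses a Toeplitz hash plus an indirection table; this just uses CRC32 to show the property you care about: every packet of a flow maps to the same rx queue (and so the same core). The function name, the queue count, and the choice to hash both directions to the same queue are all my own assumptions for illustration.

```python
import zlib

NUM_RX_QUEUES = 4  # hypothetical number of NIC rx queues


def rx_queue_for_flow(src_ip, src_port, dst_ip, dst_port,
                      num_queues=NUM_RX_QUEUES):
    """Map a flow to an rx queue index (illustrative stand-in for RSS)."""
    # Sort the two endpoints so both directions of a flow produce the
    # same key (a symmetric hash; an assumption, not a NIC default).
    a, b = sorted([(src_ip, src_port), (dst_ip, dst_port)])
    key = f"{a[0]}:{a[1]}|{b[0]}:{b[1]}".encode()
    # CRC32 is deterministic, so a given flow always lands on the
    # same queue index.
    return zlib.crc32(key) % num_queues


forward = rx_queue_for_flow("10.0.0.1", 40000, "192.0.2.7", 443)
reverse = rx_queue_for_flow("192.0.2.7", 443, "10.0.0.1", 40000)
print(forward == reverse)
```

If the shaping state for a flow only ever gets touched from the core that services that queue, you avoid the cross-core traffic mentioned above.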
| ▲ | rpcope1 4 hours ago | parent [-] | |||||||
Just a single data point, but as much as people like to hype up the BSDs in general: I've tested both recent FreeBSD (which should be much faster than OpenBSD) and Debian on the now kind of elderly APU2s I have, and netfilter is noticeably faster (and I find nftables to be frankly less challenging than pf). It gets those devices right at gigabit line speed even with complex firewall rules, whereas pf leaves performance on the table. It probably has to do with the fact that it's an older 4-core design that wasn't super high-powered to begin with (it still does its job extremely well), but still. | ||||||||