shivanshvij 16 hours ago

We absolutely ran into these issues.

A couple notes that help quite a bit:

1. Always build the eBPF programs in a container. This is great for reproducibility, of course, but it also makes DevX on macOS better for those who prefer to work there.

2. You actually can do a full checksum! You need to limit the MTU but you can:

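  /* Recompute the full TCP checksum in place. MAX_TCP_PACKET_SIZE caps the
   * loop at a constant bound for the verifier, hence the MTU limit. */
  static __always_inline void tcp_checksum(const struct iphdr *ip_header, struct tcphdr *tcp_header, const __u16 tcp_len, const void *data_end) {
      __u32 sum = 0;
      __u16 *buf = (void *)tcp_header;
      /* Helper not shown here; by its name it adds the IP pseudo-header to sum. */
      ip_header_pseudo_checksum(ip_header, tcp_len, &sum);
      /* The checksum field must be zero while the segment is summed. */
      tcp_header->check = 0;
      __u16 max_packet_size = tcp_len;
      if (max_packet_size > MAX_TCP_PACKET_SIZE) {
          max_packet_size = MAX_TCP_PACKET_SIZE;
      }
      /* Sum 16-bit words, re-checking against data_end before every read. */
      for (int i = 0; i < max_packet_size / 2; i++) {
          if ((void *)(buf + 1) > data_end) {
              break;
          }
          sum += *buf;
          buf++;
      }
      /* Odd trailing byte; adding it as-is assumes a little-endian host. */
      if ((void *)((__u8 *)buf + 1) <= data_end && ((__u8 *)buf - (__u8 *)tcp_header) < max_packet_size) {
          sum += *(__u8 *)buf;
      }
      tcp_header->check = csum_fold_helper(sum);
  }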

With that being said, it's not lost on me that XDP in general is something you should only reach for once you hit some sort of bottleneck. The original version of our network migration was actually implemented in userspace for this exact reason!

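The snippet above calls two helpers it does not define, ip_header_pseudo_checksum and csum_fold_helper. Here is a rough userspace sketch of what such helpers usually compute, so the arithmetic can be checked outside the kernel. Everything in it is an assumption rather than the poster's code: the names sum_be16, csum_fold and tcp_checksum_full are hypothetical, and it works on byte arrays so the result does not depend on host endianness.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: add len bytes to a one's-complement accumulator,
 * reading them as big-endian 16-bit words. An odd trailing byte is padded
 * with a zero byte, as RFC 1071 describes. */
static uint32_t sum_be16(const uint8_t *p, size_t len, uint32_t sum) {
    size_t i;
    for (i = 0; i + 1 < len; i += 2)
        sum += (uint32_t)((p[i] << 8) | p[i + 1]);
    if (i < len)
        sum += (uint32_t)(p[i] << 8);
    return sum;
}

/* Hypothetical stand-in for csum_fold_helper: fold the carries back into
 * the low 16 bits, then invert. */
static uint16_t csum_fold(uint32_t sum) {
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Full TCP checksum over an IPv4 pseudo-header plus the TCP segment.
 * The caller zeroes the checksum field (bytes 16 and 17) beforehand.
 * The return value is in host order; store it big-endian in the segment. */
static uint16_t tcp_checksum_full(const uint8_t src[4], const uint8_t dst[4],
                                  const uint8_t *seg, uint16_t seg_len) {
    uint32_t sum = 0;
    sum = sum_be16(src, 4, sum);       /* pseudo-header: source address */
    sum = sum_be16(dst, 4, sum);       /* pseudo-header: destination address */
    sum += 6;                          /* pseudo-header: protocol (TCP) */
    sum += seg_len;                    /* pseudo-header: TCP length */
    sum = sum_be16(seg, seg_len, sum); /* TCP header and payload */
    return csum_fold(sum);
}
```

A handy sanity check: once the computed value is written into the checksum field, running tcp_checksum_full over the same segment again should return 0.
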
cptnntsoobv 16 hours ago

> You actually can do a full checksum

Indeed! This is what I had in mind when I wrote "cumbersome" :).

It's been long enough that I can't recall whether the problem was the verifier or me, and things may have improved since, but I remember the verifier choking on a static size limit too. Have you been able to use this trick successfully?

> Always build the eBPF programs in a container

That should generally work, but watch out for any weirdness caused by the fact that inside a container you are already behind a couple of layers of networking (bridge, netns, etc.).

tptacek 15 hours ago

Different kernels will be different levels of fussy about the bounded loop you're using there. Bounded loops are themselves a relatively recent feature.

Of course, checksum fixups in eBPF are idiomatically incremental.
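
For readers who have not seen the incremental approach: when only one 16-bit word of a packet changes, the checksum can be patched from the old and new word alone, with no loop over the payload. Below is a minimal userspace sketch of the RFC 1624 update, HC' = ~(~HC + ~m + m'). The function name csum_update16 is made up for illustration. In a real XDP program a kernel helper such as bpf_csum_diff would be the more likely tool.

```c
#include <stdint.h>

/* Hypothetical helper: patch a 16-bit Internet checksum after a single
 * 16-bit word changed from old_word to new_word (RFC 1624, equation 3).
 * All three values must use the same byte order. */
static uint16_t csum_update16(uint16_t check, uint16_t old_word,
                              uint16_t new_word) {
    uint32_t sum = (uint16_t)~check;    /* ~HC */
    sum += (uint16_t)~old_word;         /* + ~m */
    sum += new_word;                    /* + m' */
    sum = (sum & 0xffff) + (sum >> 16); /* fold the carries */
    sum = (sum & 0xffff) + (sum >> 16); /* the first fold can carry again */
    return (uint16_t)~sum;
}
```

As a worked case, take an IPv4 header whose checksum is 0xb861 and whose TTL/protocol word is 0x4011. Dropping the TTL by one turns that word into 0x3f11, and csum_update16(0xb861, 0x4011, 0x3f11) gives 0xb961, the same value a full recomputation produces. Rewriting an IP address touches the TCP checksum as well, because the addresses are part of the pseudo-header, so the same update is applied there once per changed word.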

mgaunard 16 hours ago

How do containers help when bpf is mostly a matter of kernel version?

beanjuiceII 13 hours ago

They don't; it's just the poster wanting people to do what they prefer.

tanelpoder 10 hours ago

I figure it’s one way to keep your compiler version unchanged for eBPF work, while you might update/upgrade your dev OS packages over time for other reasons. The title of the linked issue is this:

“Checksum code does not work for LLVM-14, but it works for LLVM-13”

Newer compilers might use new optimizations that the verifier won’t be happy with. I guess the other option would be to find some config option to disable that specific incompatible optimization.