Your first point appears to be about physical layer concerns. My suggestion was not meant to operate at that level. The proposed model assumes the physical layer guarantees point-to-point delivery of a distinct packet between adjacent nodes in the network, with MTU limits manifesting as either discarding or rejecting the trailing portions of the packet.

I said there were no “problems” if there are no layering violations because you argued that recalculating checksums would be a layering violation. Either we say layering violations are unacceptable at which point my argument stands. Or we say layering violations are par for the course and you can just recalculate the checksums if you need to.

Unidirectional protocols with no back channel must assume the network channel parameters such as MTU. Adding truncation information which can be picked up at a different layer is just strictly more information you can feed into your protocol if it is designed to handle that. You can just not use it and act as if truncation is dropping if you want to. This is just strictly more data you can use for decisions.

You can still get authenticated transport in the presence of truncation if your protocol generates an authentication tag for the “original” length and puts it at the start of the message. Then you can authenticate the length field and verify the truncation; otherwise you can drop the packet.

I did not bother with tunnels because I do not see how they are a distinct problem. Tunnels already need to figure out how to manage their MTUs. Either the tunnel is transparently managing how it fragments data and can be enhanced to support truncation (though it does not need to, it can just drop truncated/malformed packets as it currently does), or it tells the endpoints its tunnel parameters so that the endpoints keep themselves in bounds, at which point the endpoints can detect whatever the MTU of the tunnel is.

And again, you can always just ignore truncated packets and act as if they are malformed, which everybody already does. This is strictly more functionality that does not require changing all existing systems, and it can be used to support more efficient MTU discovery by systems and networks that support it. And if they do not, you just fall back to the current, crusty way.

zamadatix 8 months ago

> Your first point appears to be about physical layer concerns. My suggestion was not meant to operate at that level.

The proposal doesn't operate at that level but it must be compatible with the operations of that level. I.e. that the physical layer can also cause truncation of layers riding on top of it needs to be accounted for in the way those upper layers consider what truncation means. The same is true for possible intermediate layers (which sorta aligns with the later conversations regarding tunnels, which are basically just more complicated forms of intermediate layers).

> The proposed model assumes the physical layer guarantees point-point delivery of a distinct packet between adjacent nodes in the network with MTU limits manifesting as either discarding or rejecting the trailing portions of the packet.

Then the proposed model isn't applicable to IP, since an upper layer protocol cannot make guarantees about the behavior of the lower level protocols it may be transported on.

In addition, discarding trailing portions of the packet still results in the aforementioned problems with consistency checks and forwarding behavior limitations for lower level layers which did abide by this behavior.

> Unidirectional protocols with no back channel

One cannot guarantee bidirectional protocols will be able/allowed to form a back channel either, I just used unidirectional as a more clear-cut example.

> You can just not use it and act as if truncation is dropping if you want to. This is just strictly more data you can use for decisions.

Well sure, the same is true of the ICMP method or an active probing method. The concern is less with sessions you don't care to PMTUD in the first place and more with how the truncation design affects the designs of such other use cases.

> You can get still get authenticated transport in the presence of truncation if your protocol generates a authentication tag for the “original” length and puts it at the start of the message. Then you can authenticate the length field and verify truncation otherwise you can drop it.

I totally agree one can include an HMAC tag in your client<->client protocol to validate unmodified packets are authentic. This is regardless of whether truncation, ICMP packet too big, active PMTUD probing, or any other method is in place as, to this point, this is only about validating delivered packets which did fit in the MTU.

What isn't clicking is how, when a truncated message arrives, a (now invalid) HMAC helps you determine whether the packet was completely spoofed by a malicious actor or really truncated by a middlebox. All you know is it was supposed to be longer and now something claims it needs to be shorter. How do you know that's not because of the same malicious actor who was supposed to be sending the fake ICMP packet too big, rather than a middlebox really trying to signal the packet truly needed to be truncated?

> I did not bother with tunnels because I do not see how it is a distinct problem.

As highlighted earlier, tunnels may either encapsulate other protocols or encapsulate protocols which are expecting truncation. If the only things which existed in the world were client network interfaces it wouldn't be a problem, but once more network devices become involved you have to consider the impact on those too. The main thing to keep in mind is that very few network middleboxes or tunnel protocols have the ability to do fragmentation on behalf of tunneled data, particularly if they are hardware based or based on protocols without such a feature (such as Ethernet), since this eats up TONS of hardware to do so (especially at high speeds). E.g. take an IPv6 VXLAN tunnel of an Ethernet frame on a 400 Gbps interface: how is a pure L3 intermediate carrier router doing truncation supposed to know not to update the UDP (a layer up the stack) checksum so the truncated Ethernet payload actually gets delivered to the client destination from the egress VTEP? It's not even that the egress VTEP needs some way to signal to the ingress VTEP how much the truncation was, it's that the original client which was VXLAN encapsulated by the ingress VTEP needs its packet delivered to the remote client so the remote client can see the truncation and re-negotiate (in band or out of band) with the client to send smaller frames. This signaling will not occur because of the aforementioned UDP checksum being broken by an intermediate router. Just removing all checksums, allowing all modifications to headers, and delivering whatever arrives would create not only a high incidence of deformed traffic being propagated but also security risks.

This brings us back to the example of secure tunnels, like IPsec, which have the same problem but in a much more succinct form. All parts of the payload of an IPsec tunnel are basically random noise after you truncate it, so there is no way to even attempt to consider sending the truncated payload to the intended destination. It's not the responsibility of the IPsec encapsulator to perform the encapsulation and the IPsec receiver usually doesn't have a path to communicate with the original client (not that it even knows who that is).

If you redesign everything about how network tunneling works under some severe limitations and assumptions then it may be possible to solve some (or maybe all, if I can figure out what I'm missing regarding authentication of packets claiming MTU changes) of these problems, but I'm not sure I could ever see the set of requirements needed as easier than the other MTU approaches. That doesn't necessarily mean I think there is an overall perfect answer at all, just that I think PMTUD and its variants are definitely the easier path.

Veserv 8 months ago

I just do not understand the problems you are stating. Let me present a concrete example.

We have A <-> B <-> C. A wishes to transmit a packet of 0x1000 bytes containing Ethernet, IPv4, and then a bespoke protocol, P, which consists of a header containing a length, a MAC (message authentication code) on the length + header, a MAC on the entire packet, and the encrypted payload, in that order. A then prepares transmit descriptors pointing at the packet and with size 0x1000 bytes.

C prepares receive descriptors pointing to buffers with a maximum capacity of size 0x1000 bytes per packet. B prepares receive descriptors pointing to buffers with a maximum capacity of size 0x500 (1280) bytes per packet.

A transmits the packet to B. The physical coding layer transmits the bytes terminating in the FCS. B receives bytes and does a running computation of the FCS. Upon reaching 0x500 bytes, it stops storing data into memory, stores the current FCS into memory, then continues receiving the data and computing the FCS until the data stream ends. Upon determining that the FCS matches, it marks the descriptor as valid for consumption and stores that the descriptor contains 0x500 bytes of data. The transmit engine of B then configures a transmit descriptor pointing at the packet and with size 0x500 bytes.
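
The receive path at B can be sketched roughly as follows. This is only an illustration under stated assumptions: `receive_truncating`, `capacity`, and `chunk` are hypothetical names, `zlib.crc32` stands in for the Ethernet FCS, and descriptors are reduced to a return value.

```python
import zlib


def receive_truncating(frame: bytes, fcs: int, capacity: int, chunk: int = 64):
    """Store at most `capacity` bytes while checking the FCS of the whole frame.

    Returns (stored_bytes, prefix_fcs) when the full-frame FCS matches,
    or None when it does not (the descriptor would not be marked valid).
    """
    stored = bytearray()
    running = 0          # running CRC over everything received so far
    prefix_fcs = None    # CRC snapshot taken when the buffer fills up

    for off in range(0, len(frame), chunk):
        piece = frame[off:off + chunk]
        room = capacity - len(stored)
        if room > 0:
            keep = piece[:room]
            stored += keep
            running = zlib.crc32(keep, running)
            if len(stored) == capacity:
                prefix_fcs = running            # FCS of the stored prefix
            running = zlib.crc32(piece[room:], running)
        else:
            # Buffer is full: keep checking the FCS but store nothing.
            running = zlib.crc32(piece, running)

    if prefix_fcs is None:
        prefix_fcs = running                    # the frame fit entirely
    if running != fcs:
        return None
    return bytes(stored), prefix_fcs
```

The stored prefix FCS is what B would attach when it retransmits the 0x500-byte packet towards C.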

C then receives the 0x500 byte packet from B and observes that the FCS matches the 0x500 byte FCS and marks the descriptor as valid for consumption and stores that the descriptor contains 0x500 bytes of data. C then processes the packet observing that the P header indicates a length of 0x1000 bytes, but only 0x500 bytes are available. It attempts to authenticate the P header MAC using a secret known only to A and C. As the truncation only hit the encrypted payload at the tail, the P header MAC and the header data it is authenticating have not been modified by the truncation process. As such, C is able to use the higher layer secret it shares with A to successfully authenticate the header data and determine that the header containing a length field with the value of 0x1000 bytes could have only been written by A and has not been tampered with. It then rejects the rest of the packet, but stores that the inbound MTU is only 0x500 bytes.
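
A minimal sketch of the check at C, under several assumptions that are not part of the description above: the P layout is taken to be a 4-byte big-endian length, then two HMAC-SHA256 tags, then the payload; the Ethernet and IPv4 headers and the encryption itself are left out; `build` and `classify` are hypothetical helper names.

```python
import hashlib
import hmac
import struct

TAG = 32                 # HMAC-SHA256 tag size (assumed choice of MAC)
HDR = 4 + TAG + TAG      # length field + header tag + whole-packet tag


def build(key: bytes, payload: bytes) -> bytes:
    """Assemble a P packet; `payload` stands in for the encrypted payload."""
    length = struct.pack("!I", HDR + len(payload))
    hdr_tag = hmac.new(key, b"hdr" + length, hashlib.sha256).digest()
    pkt_tag = hmac.new(key, b"pkt" + length + payload, hashlib.sha256).digest()
    return length + hdr_tag + pkt_tag + payload


def classify(key: bytes, data: bytes):
    """Return ("ok", payload), ("truncated", arrived_len) or ("drop", None)."""
    if len(data) < HDR:
        return ("drop", None)            # header itself is incomplete
    length, hdr_tag, pkt_tag = data[:4], data[4:4 + TAG], data[4 + TAG:HDR]
    payload = data[HDR:]

    expected = hmac.new(key, b"hdr" + length, hashlib.sha256).digest()
    if not hmac.compare_digest(hdr_tag, expected):
        return ("drop", None)            # length field not written by the peer

    claimed = struct.unpack("!I", length)[0]
    if len(data) == claimed:
        full = hmac.new(key, b"pkt" + length + payload, hashlib.sha256).digest()
        return ("ok", payload) if hmac.compare_digest(pkt_tag, full) else ("drop", None)
    if len(data) < claimed:
        # The claimed length is authentic; the arrived size is only an observation.
        return ("truncated", len(data))
    return ("drop", None)
```

In this sketch the header tag proves who wrote the length field, while the arrived size is simply whatever reached C.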

zamadatix 8 months ago

In this process one can only show they are unable to authenticate that the packet's length matches the length the header said it should have been, i.e. you can only authenticate that nobody tried to claim the MTU should change. You have not provided any authentication for the parts of the message signaling the MTU is now supposed to be 0x500.

The authentication header can only help you authenticate when the MTU stayed the same as expected during delivery. It cannot help you authenticate the signals claiming the MTU was supposed to be something else, as those modifications, inherently, do not come from nodes partaking in the authentication header. The malicious middleman could falsely truncate a single packet to 0x500 bytes just as easily as they could falsely create an ICMP packet claiming the MTU is 0x500 bytes. In both cases the only thing you know for sure is "someone is trying to claim that last packet was too big".

Veserv 8 months ago

Can you please explain how a malicious node on your path truncating packets to 0x500 bytes is distinct from a 0x500 byte path MTU?

The distinction between this and a false ICMP packet is that a valid ICMP packet from a node in your path comes from an unauthenticated source. You cannot generally distinguish this from a forged ICMP packet from a malicious entity not on your path.

In contrast, the model I propose results in the authenticated endpoint learning of the path MTU in a way that can only be altered by a node in the path refusing to send data beyond a certain size. The authenticated endpoint can then use an authenticated channel to feed back the data, allowing the source to get authenticated path MTU information that could only come from the authenticated endpoint.

zamadatix 8 months ago

> Can you please explain how a malicious node on your path truncating packets to 0x500 bytes is distinct from a 0x500 byte path MTU?

Sure, it doesn't even actually have to be truly in-path in all cases, though that makes things easier. I receive a copy of your message (be it that I'm actually inline, on a tap, spoofed ARP for a second, on shared media, etc. - pick your poison of the day) and I create a truncated version of that packet to send on the line. I don't even need to truncate every single one yet, since you haven't re-added a probing mechanism like the one found in PLPMTUD to allow the MTU to be raised again yet :).

Putting all of the talk about the length authentication aside, I still don't see how the truncated headers were supposed to be an improvement over the ICMP approach of just sending the headers back as part of the payload. The ICMP destination unreachable message signaling the packet was too large already includes the original IP header + at minimum the first 8 bytes (though in practice RFC 1812 increased that so you'll get significantly more) of data precisely so the client is able to map the request to the specific underlying session. If having the 5 tuple + identification number + TCP sequence number match up with what you sent isn't already enough to trust it came from someone who got a copy of your message, then you can add the HMAC anywhere in your headers, without needing to replace the ICMP based signaling approach completely, to get it sent back to you too. I guess it's because you want the signal that the message sent was too large to go to the remote side first instead of the sender?

Veserv 8 months ago

You appear to be approaching my proposal as if it were a complete RFC intended to describe a complete end-to-end MTU discovery protocol. I am describing a primitive that can be used to efficiently implement MTU discovery. I did not describe probing because it is trivial to implement on top of that primitive in whatever way you so desire. A simple mechanism would be to just use your PLPMTUD mechanism, but instead of submitting continuously increasing packets, you can just submit a single large one and get the precise MTU in one step which is delivered back in whatever way (or equivalent) your PLPMTUD prescribes getting feedback.
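
To make the "one step" comparison concrete, here is a toy simulation. It is only a sketch: the path is modelled as a list of per-hop MTUs, every hop is assumed to implement truncation, the feedback channel is assumed to exist, and the probing side is a simplified binary search rather than the exact PLPMTUD search procedure. All function names are hypothetical.

```python
def delivered_with_drop(size, hop_mtus):
    """Classic behaviour: an oversized packet is silently dropped."""
    return size <= min(hop_mtus)


def arrived_with_truncation(size, hop_mtus):
    """Truncation primitive: each hop forwards at most its MTU worth of bytes."""
    for mtu in hop_mtus:
        size = min(size, mtu)
    return size


def discover_by_probing(hop_mtus, known_good, ceiling):
    """Binary search between a size known to pass and a ceiling."""
    lo, hi, probes = known_good, ceiling, 0
    while lo < hi:
        mid = (lo + hi + 1) // 2
        probes += 1
        if delivered_with_drop(mid, hop_mtus):
            lo = mid          # probe got through, search higher
        else:
            hi = mid - 1      # probe was lost, search lower
    return lo, probes


def discover_by_truncation(hop_mtus, ceiling):
    """Send one ceiling-sized packet and report the size that arrived."""
    return arrived_with_truncation(ceiling, hop_mtus), 1
```

For a path of `[0x1000, 0x500, 0x1000]`, both functions settle on 0x500; the probing variant needs several round trips, the truncation variant a single packet plus its feedback.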

You are describing an attack where somebody is either getting the unique copy of your packet by maliciously getting themselves into your path, at which point they can just drop all your packets and DoS you in a much more robust manner, or for some stupid reason you just send copies of packets to random malicious entities when they ask for it. I mean, I can imagine some knuckle-head designed a protocol and network where that is the case, but you should probably first not do something that stupid. Assuming you cannot avoid it, you can still detect the case because if they cannot interject into your actual path, the endpoint will receive both an authenticated packet and an authenticated truncated packet. In such a case, you can first of all just use the larger packet size since the larger size did get through. Second, you now know that somebody malicious or incompetent is getting a copy of your packets. I leave how you want to handle that up to the endpoints.
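
The duplicate-detection idea could look something like the sketch below. It assumes each packet carries an identifier inside the authenticated header (for example a sequence number) and that only header-authenticated arrivals are recorded; `MtuTracker` and its methods are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class Observation:
    claimed_len: int                      # length from the authenticated header
    largest_seen: int = 0                 # biggest copy that actually arrived
    sizes: set = field(default_factory=set)


class MtuTracker:
    """Tracks arrived sizes per authenticated packet identifier."""

    def __init__(self):
        self.by_id = {}

    def record(self, packet_id, claimed_len, arrived_len):
        obs = self.by_id.setdefault(packet_id, Observation(claimed_len))
        obs.sizes.add(arrived_len)
        obs.largest_seen = max(obs.largest_seen, arrived_len)
        return obs

    def usable_size(self, packet_id):
        # The larger copy did get through, so that size is usable.
        return self.by_id[packet_id].largest_seen

    def copies_differ(self, packet_id):
        # Same packet seen at different sizes: someone off-path has a copy.
        return len(self.by_id[packet_id].sizes) > 1
```

What to do once `copies_differ` returns true is left to the endpoints, as stated above.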

And, just as an aside, I did not mention this as I assumed it was obvious, but I believe you should still have a minimum MTU size no smaller than the current safe MTU. I am not arguing that somebody should be able to truncate you down to some egregiously small MTU. So worst case you just have to use the safe minimum MTU, which is no worse than existing non-probing techniques. You could also just treat truncation as dropping if you have tons of malicious actors, which defaults you to current probing techniques. My proposal can trivially fall back to the current models.
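
As a small sketch of that floor and fallback: the value 1280 (the IPv6 minimum link MTU) is only an assumed example of a "safe" minimum, and `mtu_from_hint` and its parameters are hypothetical.

```python
from typing import Optional

SAFE_MIN_MTU = 1280  # assumed floor for illustration (IPv6 minimum link MTU)


def mtu_from_hint(current_mtu: int, arrived_len: int, *,
                  honor_truncation: bool = True,
                  safe_min: int = SAFE_MIN_MTU) -> Optional[int]:
    """Return the MTU to use after a truncation hint.

    None means "treat the truncated packet as a plain loss" and let the
    existing probing logic handle it.
    """
    if not honor_truncation:
        return None
    if arrived_len >= current_mtu:
        return current_mtu                 # the hint does not lower anything
    return max(arrived_len, safe_min)      # never go below the safe minimum
```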

I disagree with ICMP signaling on multiple levels. First of all, you need to accept data from unauthenticated endpoints. Oh, but it is okay because they need to be on your path to get the uniquifiers. Oh, except you just pointed out how malicious actors can get copies so that is not true. You are now required to accept large amounts of data from malicious actors if you want to do ICMP based probing.

Second, I think having source addresses in headers is a design mistake and we should move away from protocols that demand it be present. We can do source authentication cryptographically which is more secure and prevents ossified middleboxes from engaging in layering violations or interfering with flows.

zamadatix 8 months ago

It's less about whether your proposal is ready to be submitted for standards track and more about pointing out that alternative proposals always seem easier when you look at the primitive in isolation compared to a fully fleshed out solution in place today. But yes, I imagine the needed mechanism would be similar (if not identical) to the existing solutions deployed in PMTUD and PLPMTUD.

Yes, such a person in your path is problematic for both solutions. The point is not that such a person is wholly unproblematic with the ICMP solution, it's that truncation provides no additional authentication in such scenarios, as was claimed. If you can already assume a fully trusted and validated path without the possibility of anyone interfering then you don't really need to worry about spoofed ICMP either, not that the involvement of the MTU discovery algorithm had much to do with that result. Pragmatically the answer here is one loses trust of any payload which is not whole and matching contained signatures, but that still remains true when one uses truncation instead.

Agreed on minimally safe MTU, another case where truncation would not actually be providing a change from the current implementation.

In regards to the discussion around the downsides of the current ICMP approach, the claim is it's no less safe than truncation, not that it was more safe. Truncation was said to have brought authentication of packets claiming MTU change; in reality it provides the same level of veracity as ICMP in, yet again, a very similar manner.

In regards to size of data to accept, one needs to accept single ICMP destination unreachable type packets of up to 576 bytes to handle the MTU use case (other ICMP packet types and use cases may allow for more but you don't have to support that for handling MTU notifications). It's not an accident 576 bytes is also the minimum maximum packet size defined for IPv4. This means the largest unauthenticated packet you need to parse is the same size as it would be in the truncation case: 1 of whatever the protocol defines as the minimum maximum packet size.

Protocols without a cleartext source address (or much besides a destination address) in the header are definitely an interesting topic, and there are certainly use cases for this, but these really end up at the same kind of conversation except you get to skip the PMTUD portions and go straight to doing PLPMTUD inside the encrypted portion. With PLPMTUD inside the encrypted portion clients never have to listen to anything which came in a packet with an invalid signature. If one introduces truncation in this scenario you lose that, as clients now need to also implicitly trust that a small packet with a broken signature was genuinely an MTU hint rather than a maliciously spoofed copy or whatnot.

The only thing I'll say against such highly secure protocols is they would not necessarily be something for everyone: they'd mandate a lot of things which may not be pragmatic for many use cases. E.g. when one wants to trade source anonymity for RPF protections. Or when one doesn't get value from encryption but gets value from load balancing/directing on header info in a high performance scenario. Or easy handling of multiple generations of protocols with opposing design goals in transport equipment. This is all to say I think such protocols would be valuable but I'd stop short of saying it's what IP needs to/should have done.