rklaehn 4 days ago

Nice list.

I think iroh checks all the boxes but one.

( ) Doesn't contain window logic to emulate best-effort datagrams over about 1500 bytes

So you want a way to send unreliable datagrams larger than one MTU. We don't have that, since we only support datagrams as specified in RFC 9221 (https://datatracker.ietf.org/doc/html/rfc9221).

You could just use streams; they are extremely lightweight. But those would then be reliable datagrams, which come with some overhead you might not want.

So how hard would it be to implement window logic on top of RFC9221 datagrams?

flub 4 days ago | parent | next [-]

I'm not sure I fully understand this window logic question. QUIC does MTU discovery, so if the link supports bigger datagrams the MTU will go up. Unreliable datagrams using RFC9221 can be sent up to the MTU size minus the QUIC packet overhead. So if your link supports >1500 bytes then you should be able to send datagrams >1500 bytes using iroh.

rklaehn 4 days ago | parent [-]

I think the OP wants a built-in solution to send unreliable datagrams larger than the MTU.

flub 4 days ago | parent [-]

Fragmenting datagrams (or IP packets) is generally not a good idea. Protocol designs have been moving away from it over the past few decades. If you want unreliable messages larger than the MTU, taking some inspiration from Media-over-QUIC might be a good idea. They use one uni-directional QUIC stream per message and include some metadata at the start of each stream to say how old it is. If a stream takes too long to reach end-of-stream and you already have a newer message in a new uni-directional stream, you can cancel the previous streams (using something like SendStream::reset or RecvStream::stop in Quinn API terms, depending on which side first detects that the message is no longer needed). Doing this stops QUIC from retransmitting the lost data of the message that is slow to arrive.
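
To make the stream-per-message idea concrete, here is a minimal sketch of the receiver-side bookkeeping only. It deliberately makes no Quinn or iroh calls: StreamTracker, its methods, and the plain u64 sequence numbers and stream ids are hypothetical names invented for illustration, and the sketch assumes each stream's header carries a monotonically increasing sequence number. The returned ids are the streams the caller would then cancel, for example with RecvStream::stop.

```rust
use std::collections::BTreeMap;

/// Hypothetical tracker for "one uni-directional stream per message".
/// Keys are message sequence numbers read from each stream's header,
/// values are opaque stream ids chosen by the caller.
pub struct StreamTracker {
    in_flight: BTreeMap<u64, u64>,
    newest_complete: Option<u64>,
}

impl StreamTracker {
    pub fn new() -> Self {
        Self { in_flight: BTreeMap::new(), newest_complete: None }
    }

    /// Call after reading the header of a newly accepted stream.
    /// Returns false if a newer message was already received in full,
    /// in which case the caller should stop this stream right away.
    pub fn on_stream_opened(&mut self, seq: u64, stream_id: u64) -> bool {
        match self.newest_complete {
            Some(newest) if seq <= newest => false,
            _ => {
                self.in_flight.insert(seq, stream_id);
                true
            }
        }
    }

    /// Call when a stream has been read to end-of-stream.
    /// Returns the ids of all older streams that are still in flight;
    /// these carry stale messages and can be cancelled.
    pub fn on_message_complete(&mut self, seq: u64) -> Vec<u64> {
        if self.in_flight.remove(&seq).is_none() {
            return Vec::new(); // unknown or already cancelled
        }
        self.newest_complete = Some(seq);
        // Keep only streams newer than `seq`; everything else is stale.
        let newer = self.in_flight.split_off(&seq);
        let stale = std::mem::replace(&mut self.in_flight, newer);
        stale.into_values().collect()
    }
}
```

Cancelling from the receiver is only one option; the sender could keep a similar table and reset old streams as soon as it starts sending a newer message.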

zackmorris 4 days ago | parent [-]

Right, I should have been more clear about that. Window logic was perhaps the wrong term, since I don't care about resends.

The use case I have in mind is for realtime data synchronization. Say we want to share a state larger than 1500 bytes, then we have to come up with a clever scheme to compress the state or do partial state transfer, which could require knowledge of atomic updates or even database concepts like ACID, which feels over-engineered.

I'd prefer it if the protocol batched datagrams for me. For example, if we send a state of 3000 bytes, that's 2 datagrams at an MTU of 1500. Maybe 1 of those 2 fails so the message gets dropped. When we send a state again, for example in a game that sends updates 10 times per second, maybe the next 2 datagrams make it. So we get the most recent state in 3 datagrams instead of 4, and that's fine.

I'm thinking that a large unreliable message protocol should add a monotonically increasing message number and index id to each datagram. So sending 3000 bytes twice might look like [0][0],[0][1] and [1][0],[1][1]. For each complete message, the receiver could inspect the message number metadata and ignore any previous ones, even if they happen to arrive later.
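
A rough sketch of what that could look like on top of plain datagrams, using only the standard library. Everything here is an assumption made for illustration: the 8-byte header layout (message number, fragment index, fragment count), the names fragment and Reassembler, and the choice to ignore message number wraparound. It is not part of iroh, Quinn, or RFC 9221.

```rust
use std::collections::HashMap;

/// Hypothetical header: message number (u32), fragment index (u16),
/// fragment count (u16), all big-endian.
const HEADER_LEN: usize = 8;

/// Split `payload` into datagrams of at most `max_datagram` bytes each,
/// header included.
pub fn fragment(msg_no: u32, payload: &[u8], max_datagram: usize) -> Vec<Vec<u8>> {
    assert!(max_datagram > HEADER_LEN, "datagram too small for the header");
    let chunk_len = max_datagram - HEADER_LEN;
    // An empty payload still produces one header-only datagram.
    let count = ((payload.len() + chunk_len - 1) / chunk_len).max(1);
    assert!(count <= u16::MAX as usize, "too many fragments");
    (0..count)
        .map(|i| {
            let start = i * chunk_len;
            let end = (start + chunk_len).min(payload.len());
            let mut d = Vec::with_capacity(HEADER_LEN + end - start);
            d.extend_from_slice(&msg_no.to_be_bytes());
            d.extend_from_slice(&(i as u16).to_be_bytes());
            d.extend_from_slice(&(count as u16).to_be_bytes());
            d.extend_from_slice(&payload[start..end]);
            d
        })
        .collect()
}

struct Partial {
    parts: Vec<Option<Vec<u8>>>,
    missing: usize,
}

/// Collects fragments and yields a message once all of its fragments
/// have arrived. Anything older than the newest delivered message is dropped.
pub struct Reassembler {
    newest_delivered: Option<u32>,
    pending: HashMap<u32, Partial>,
}

impl Reassembler {
    pub fn new() -> Self {
        Self { newest_delivered: None, pending: HashMap::new() }
    }

    /// Feed one received datagram; returns (message number, payload)
    /// when that datagram completes a message.
    pub fn push(&mut self, d: &[u8]) -> Option<(u32, Vec<u8>)> {
        if d.len() < HEADER_LEN {
            return None;
        }
        let msg_no = u32::from_be_bytes([d[0], d[1], d[2], d[3]]);
        let index = u16::from_be_bytes([d[4], d[5]]) as usize;
        let count = u16::from_be_bytes([d[6], d[7]]) as usize;
        if count == 0 || index >= count {
            return None; // malformed header
        }
        if matches!(self.newest_delivered, Some(n) if msg_no <= n) {
            return None; // stale: a newer message was already delivered
        }
        let partial = self
            .pending
            .entry(msg_no)
            .or_insert_with(|| Partial { parts: vec![None; count], missing: count });
        if partial.parts.len() != count {
            return None; // fragment count disagrees with earlier fragments
        }
        if partial.parts[index].is_none() {
            partial.parts[index] = Some(d[HEADER_LEN..].to_vec());
            partial.missing -= 1;
        }
        if partial.missing > 0 {
            return None;
        }
        let done = self.pending.remove(&msg_no)?;
        // Forget incomplete older messages; they will never be delivered.
        self.pending.retain(|&n, _| n > msg_no);
        self.newest_delivered = Some(msg_no);
        Some((msg_no, done.parts.into_iter().flatten().flatten().collect()))
    }
}
```

Note that the header eats into each datagram, so a 3000-byte state needs three datagrams rather than two if the usable datagram size is 1200 bytes.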

Looks like UDP datagram loss on the internet is generally less than 1%:

https://stackoverflow.com/questions/15060180/what-are-the-ch...

So I think this scheme would generally "just work" and hiccup every 5 seconds or so when sending 10 messages per second at 2 datagrams each and a 99% success rate, and the outage would only last 100 ms.
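
The arithmetic behind that estimate, as a small sketch. It assumes every datagram is lost independently with the same probability, which real networks do not guarantee since loss tends to come in bursts. The function names are made up for illustration.

```rust
/// Probability that a message is lost when it is split into `fragments`
/// datagrams and each datagram arrives with probability `p_datagram_ok`.
/// Assumes independent losses.
pub fn message_loss(p_datagram_ok: f64, fragments: u32) -> f64 {
    1.0 - p_datagram_ok.powi(fragments as i32)
}

/// Average number of seconds between lost messages at `messages_per_sec`.
pub fn seconds_between_losses(p_datagram_ok: f64, fragments: u32, messages_per_sec: f64) -> f64 {
    1.0 / (messages_per_sec * message_loss(p_datagram_ok, fragments))
}
```

With a 99% datagram success rate and 2 datagrams per message, message_loss gives 1 - 0.99^2 = 0.0199, so roughly 1 message in 50 is dropped. At 10 messages per second that is about one dropped message every 5 seconds.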

We might need more checklist items:

  ( ) Doesn't provide a way to get the last known Maximum Transmission Unit (MTU)

And optionally:

  ( ) Doesn't provide a way to get large unreliable message number metadata

GoblinSlayer 4 days ago | parent | prev [-]

Also there's no solution to punch through NAT.

rklaehn 4 days ago | parent | next [-]

Iroh will do hole punching through NATs. It will even work in many cases when there are NATs on both sides.

There are some limitations regarding some double NATs or very strictly configured corporate firewalls. This is why there is always the relay path as a fallback.

If you have a specific situation in mind and want to know if hole punching works, we have a tool, iroh-doctor, that measures connection speed and connection status (relay, direct, mixed):

https://crates.io/crates/iroh-doctor

It can be installed using cargo install iroh-doctor if you have Rust installed.

flub 4 days ago | parent | prev [-]

There might be some confusion here: holepunching is a core feature of iroh. There are still some firewall configurations that iroh cannot yet holepunch, and that can still be improved, but in general the holepunching works rather well.