| ▲ | kragen 17 hours ago | |
There are a lot of possible design alternatives to TCP within the "create a reliable stream of data on top of an unreliable datagram layer" space:
• Full-duplex connections are probably a good idea, but certainly are not the only way, or the most obvious way, to create a reliable stream of data on top of an unreliable datagram layer. TCP's predecessor NCP used simplex connections.
• TCP itself also supports a half-duplex mode: even if one end sends FIN, the other end can keep transmitting as long as it wants. This was probably also a good idea, but it's certainly not the only obvious choice.
• Sequence numbers on messages or on bytes?
• Wouldn't it be useful to expose message boundaries to applications, the way 9P, SCTP, and some SNA protocols do?
• If you expose message boundaries to applications, maybe you'd also want to include a message type field? Protocol-level message-type fields have been found to be very useful in Ethernet and IP, and in a sense the port-number field in UDP is also a message-type field.
• Do you really need urgent data?
• Do servers need different port numbers? TCPMUX is a straightforward way of giving your servers port names, like in CHAOSNET, instead of port numbers. It only creates extra overhead at connection-opening time, assuming you have the moral equivalent of file descriptor passing on your OS. The only limitation is that you have to use different client ports for multiple simultaneous connections to the same server host. But in TCP everyone uses different client ports for different connections anyway. TCPMUX itself incurs an extra round-trip time delay for connection establishment, because the requested server name can't be transmitted until the client's ACK packet, but if you incorporated it into TCP, you'd put the server name in the SYN packet. If you eliminate the server port number in every TCP header, you can expand the client port number to 24 or even 32 bits.
• Alternatively, maybe network addresses should be assigned to server processes, as in AppleTalk (or IP-based virtual hosting before HTTP/1.1's Host: header, or, for TLS, before SNI became widespread), rather than assigning network addresses to hosts and requiring port numbers or TCPMUX to distinguish multiple servers on the same host?
• Probably SACK was actually a good idea and should have always been the default? SACK gets a lot easier if you ack message numbers instead of byte numbers.
• Why is acknowledgement reneging allowed in TCP? That was a terrible idea.
• It turns out that measuring round-trip time is really important for retransmission, and TCP has no way of measuring RTT on retransmitted packets, which can pose real problems for correcting a ridiculously low RTT estimate, which results in excessive retransmission.
• Do you really need a PUSH bit? C'mon.
• A modest amount of overhead in the form of erasure-coding bits would permit recovery from modest amounts of packet loss without incurring retransmission timeouts, which is especially useful if your TCP-layer protocol requires a modest amount of packet loss for congestion control, as TCP does.
• Also you could use a "congestion experienced" bit instead of packet loss to detect congestion in the usual case. (TCP did eventually acquire CWR and ECE, but not for many years.)
• The fact that you can't resume a TCP connection from a different IP address, the way you can with a Mosh connection, is a serious flaw that seriously impedes nodes from moving around the network.
• TCP's hardcoded timeout of 5 minutes is also a major flaw. Wouldn't it be better if the application could set that to 1 hour, 90 minutes, 12 hours, or a week, to handle intermittent connectivity, such as with communication satellites? Similarly for very-long-latency datagrams, such as those relayed by single LEO satellites. Together, this and the previous flaw have resulted in TCP largely being replaced for its original session-management purpose with new ad-hoc protocols such as HTTP magic cookies, protocols which use TCP, if at all, merely as a reliable datagram protocol.
• Initial sequence numbers turn out not to be a very good defense against IP spoofing, because that wasn't their original purpose. Their original purpose was preventing the erroneous reception of leftover TCP segments from a previous incarnation of the connection that have been bouncing around routers ever since; this purpose would be better served by using a different client port number for each new connection. The ISN namespace is far too small for current LFNs anyway, so we had to patch over the hole in TCP with timestamps and PAWS. | ||
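The remark about the sequence space being too small for long fat networks can be checked with simple arithmetic. The sketch below is illustrative only: it assumes a sender that keeps the link fully busy, and the function and constant names are hypothetical. It computes how long it takes to send 2**32 bytes, which is the point at which a 32-bit byte sequence number repeats.

```python
# Illustrative arithmetic only: how quickly a 32-bit byte sequence space
# wraps at a given sustained line rate. All names here are hypothetical.

SEQ_SPACE_BYTES = 2 ** 32  # TCP sequence numbers count bytes, modulo 2**32


def wrap_time_seconds(bits_per_second: float) -> float:
    """Seconds needed to send 2**32 bytes at the given sustained rate."""
    return SEQ_SPACE_BYTES * 8 / bits_per_second


if __name__ == "__main__":
    rates = [("10 Mbit/s", 10e6), ("1 Gbit/s", 1e9), ("100 Gbit/s", 100e9)]
    for label, rate in rates:
        # Compare each result with the 2-minute maximum segment lifetime
        # that RFC 793 assumes for old segments still in the network.
        print(f"{label:>10}: wraps in {wrap_time_seconds(rate):.2f} s")
```

Once the wrap time drops below the time an old segment can survive in the network, sequence numbers alone can no longer tell an old duplicate from new data, which is the gap that timestamps and PAWS were added to cover.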
| ▲ | Animats 11 hours ago | parent | next [-] | |
> Full-duplex connections are probably a good idea, but certainly are not the only way, or the most obvious way, to create a reliable stream of data on top of an unreliable datagram layer. TCP itself also supports a half-duplex mode: even if one end sends FIN, the other end can keep transmitting as long as it wants. This was probably also a good idea, but it's certainly not the only obvious choice.
Much of that comes from the original applications being FTP and TELNET.

> Sequence numbers on messages or on bytes?
Bytes, because the whole TCP message might not fit in an IP packet. This is the MTU problem.

> Wouldn't it be useful to expose message boundaries to applications, the way 9P, SCTP, and some SNA protocols do?
Early on, there were some message-oriented, rather than stream-oriented, protocols on top of IP. Most of them died out. RDP was one such. Another was QNet.[2] Both still have assigned IP protocol numbers, but I doubt that an RDP packet would get very far across today's internet. This was a lack. TCP is not a great message-oriented protocol.

> Do you really need urgent data?
The purpose of urgent data is so that when your slow Teletype is typing away, and the recipient wants it to stop, there's a way to break in. See [1], p. 8.

> It turns out that measuring round-trip time is really important for retransmission, and TCP has no way of measuring RTT on retransmitted packets, which can pose real problems for correcting a ridiculously low RTT estimate, which results in excessive retransmission.
Yes, reliable RTT is a problem.

> Do you really need a PUSH bit? C'mon.
It's another legacy thing to make TELNET work on slow links. Is it even supported any more?

> A modest amount of overhead in the form of erasure-coding bits would permit recovery from modest amounts of packet loss without incurring retransmission timeouts, which is especially useful if your TCP-layer protocol requires a modest amount of packet loss for congestion control, as TCP does.
> Also you could use a "congestion experienced" bit instead of packet loss to detect congestion in the usual case. (TCP did eventually acquire CWR and ECE, but not for many years.)
Originally, there was ICMP Source Quench for that, but Berkeley didn't put it in BSD, so nobody used it. Nobody was sure when to send it or what to do when it was received.

> The fact that you can't resume a TCP connection from a different IP address, the way you can with a Mosh connection, is a serious flaw that seriously impedes nodes from moving around the network.
That would require a security system to prevent hijacking sessions.

[1] https://archive.org/stream/rfc854/rfc854.txt_djvu.txt
[2] https://en.wikipedia.org/wiki/List_of_IP_protocol_numbers | ||
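The RTT problem discussed above can be sketched in a few lines. This is a minimal illustration, assuming the smoothing constants from RFC 6298 and Karn's rule of discarding samples from retransmitted segments; the class and method names are hypothetical and not taken from any real TCP stack, and the clock-granularity term of RFC 6298 is left out for brevity.

```python
# Sketch of an RFC 6298-style retransmission timer with Karn's rule.
# Class and method names are hypothetical, not from any real TCP stack.


class RttEstimator:
    ALPHA = 1 / 8   # smoothing gain for SRTT (RFC 6298)
    BETA = 1 / 4    # smoothing gain for RTTVAR (RFC 6298)
    MIN_RTO = 1.0   # lower bound on the timeout, in seconds
    MAX_RTO = 60.0  # assumed upper bound on the timeout, in seconds

    def __init__(self, initial_rto: float = 1.0) -> None:
        self.srtt = None    # smoothed RTT, unset until the first sample
        self.rttvar = None  # RTT variation estimate
        self.rto = initial_rto

    def on_ack(self, sample: float, was_retransmitted: bool) -> None:
        """Feed one RTT sample (seconds) taken when an ACK arrives."""
        if was_retransmitted:
            # Karn's rule: the ACK could belong to either transmission,
            # so the sample is ambiguous and is ignored.
            return
        if self.srtt is None:
            self.srtt = sample
            self.rttvar = sample / 2
        else:
            # RTTVAR is updated first, using the old SRTT.
            self.rttvar = ((1 - self.BETA) * self.rttvar
                           + self.BETA * abs(self.srtt - sample))
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * sample
        self.rto = min(self.MAX_RTO,
                       max(self.MIN_RTO, self.srtt + 4 * self.rttvar))

    def on_timeout(self) -> None:
        """Exponential backoff: double the timeout after each expiry."""
        self.rto = min(self.MAX_RTO, self.rto * 2)


if __name__ == "__main__":
    est = RttEstimator()
    est.on_ack(0.1, was_retransmitted=False)  # short path: SRTT = 0.1 s
    est.on_ack(5.0, was_retransmitted=True)   # ignored, SRTT unchanged
    est.on_timeout()
    est.on_timeout()                          # timeout backs off to 4 s
    est.on_ack(3.0, was_retransmitted=False)  # first clean sample on slow path
    print(est.srtt, est.rto)
```

While every segment is being retransmitted, no sample is accepted, so only the backoff in `on_timeout` can lift the timer above a stale, too-low estimate. The TCP timestamp option was added later partly to make such samples unambiguous.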
| ▲ | musicale 15 hours ago | parent | prev | next [-] | |
AppleTalk didn't get much love for its broadcast (or possibly multicast?) based service discovery protocol - but of course that is what inspired mDNS. I believe AppleTalk's LAN addresses were always dynamic (like 169.254.x.x link-local IP addresses), simplifying administration and deployment.
I tend to think that one of the reasons Linux containers are needed for network services is that DNS traditionally only returns an IP address (rather than address + port), so each service process needs to have its own IP address, which in Linux requires a container or at least a network namespace.
AppleTalk also supported a reliable transaction (basically request-response RPC) protocol (ATP) and a session protocol, which I believe were used for Mac network services (printing, file servers, etc.). Certainly easier than serializing/deserializing byte streams. | ||
| ▲ | musicale 14 hours ago | parent | prev [-] | |
> The fact that you can't resume a TCP connection from a different IP address, the way you can with a Mosh connection, is a serious flaw that seriously impedes nodes from moving around the network
This 100% !! And basically the reason mosh had to be created in the first place (and it probably wasn't easy). Unfortunately mosh only solves the problem for ssh. Exposing fixed IP addresses to the application layer probably doesn't help either.
So annoying that TCP tends to break whenever you switch wi-fi networks or switch from wi-fi to cellular. (On iPhones at least you have MPTCP, but that requires server-side support.) | ||