| ▲ | Veserv 5 hours ago |
| While I appreciate more efficient and compact representations, I fail to see why this is particularly necessary. This article [1] on the same topic indicates a naive PQ chain is only ~40x the size of a current 4 KB chain. That means it is just ~160 KB. If you have the legal minimum to be considered broadband in the US, you have ~100 Mbps, so that would add ~12 ms. If you can stream one 4K video, you have ~20-40 Mbps, so that would add ~30-60 ms. If you can stream one 1080p video, you have ~3-6 Mbps, so that would add ~200-400 ms. Even on just a 1 Mbps connection, barely enough to stream a single 480p video, that would only add ~1 second. And I doubt the weight of most pages is lower than 160 KB. Many of them are probably dramatically higher, so the total effect of an extra 160 KB is just a few percent. If there is a problem, it seems like it would be with poorly designed protocols and infrastructure, which should be fixed as well instead of papered over. [1] https://arstechnica.com/security/2026/02/google-is-using-cle... |
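The link-rate arithmetic in this comment is easy to check with a few lines. This is a rough sketch: the ~160 KB chain size and the link rates are the figures quoted above, and it ignores round-trip and handshake costs entirely (which the replies below argue are the real issue).

```python
# Serialization delay only: how long does an extra 160 KB take on a given link?
def extra_ms(chain_bytes: int, link_mbps: float) -> float:
    """Milliseconds to push chain_bytes over a link of link_mbps."""
    return chain_bytes * 8 / (link_mbps * 1e6) * 1000

CHAIN = 160 * 1000  # ~160 KB naive PQ chain (40x a ~4 KB chain)

for label, mbps in [("US broadband floor", 100), ("4K stream", 25),
                    ("1080p stream", 5), ("480p stream", 1)]:
    print(f"{label:>18}: {extra_ms(CHAIN, mbps):6.0f} ms")
```

With those assumptions the numbers come out close to the comment's: ~13 ms at 100 Mbps, ~51 ms at 25 Mbps, ~256 ms at 5 Mbps, ~1.3 s at 1 Mbps.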
|
| ▲ | bwesterb 3 hours ago | parent | next [-] |
| The key will be 40x larger. Not that bad for the certs. It'll be about 15kB extra. Whether that's bad depends on your use case. For video it's fine. But not all browsing is video. At Cloudflare, half of the QUIC connections we see transfer less than 8kB from server -> client in total. On average, 3-4kB of that is already certificates today. That'll probably be quite noticeable. https://blog.cloudflare.com/pq-2025/#do-we-really-care-about... |
| |
| ▲ | Veserv 2 hours ago | parent [-] | | But do those connections constitute a material amount of total bandwidth and thus resources? No, as the article points out, the median is 8 KB, but the average is 583 KB. The extra 15 KB for each connection would only bump server-side bandwidth serving by ~2%. But even that is beside my point. The impact of making certificates larger should be, largely, just the cost of making them larger, which, on average, would not actually be that significant an impact. That is not the real problem. The problem is actually that there is so much broken crap everywhere in networks and network stacks that would either break or dramatically balloon what should otherwise be manageable costs. Everybody just wants to paper over that by blaming the larger certificates, when what is actually happening is that the larger certificates are revealing the rot. That is not to say that the proposal which reduces the size of the certificates is bad, I think it is good to do so, but fixing the proximal cause so you can continue to ignore the root cause is a recipe that got us into this ossified, brittle networking mess. | | |
|
|
| ▲ | agwa 5 hours ago | parent | prev | next [-] |
| At the beginning of a TCP connection, which is when the certificate chain is sent, you can't send more data than the initial congestion window without waiting for it to be acknowledged. 160KB is far beyond the initial congestion window, so on a high-latency connection the additional time would be higher than the numbers you calculated. Of course, if the web page is very bloated the user might not notice, but not all pages are bloated. The increased certificate size would also be painful for Certificate Transparency logs, which are required to store certificates and transmit them to anyone who asks. MTC doesn't require logs to store the subject public key. |
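The congestion-window point can be sketched numerically. This is a simplified model, assuming classic slow start (the window doubles each RTT), 1460-byte segments, and TCP's IW10 default of 10 segments; packet loss and ACK pacing are ignored.

```python
# How many round trips does slow start need to deliver a payload of a
# given size? Assumes IW10 (10 segments), 1460-byte MSS, no loss.
def rtts_to_send(total_bytes: int, iw_segments: int = 10,
                 mss: int = 1460) -> int:
    cwnd = iw_segments * mss   # initial congestion window in bytes (~14.6 KB)
    sent, rtts = 0, 0
    while sent < total_bytes:
        sent += cwnd
        cwnd *= 2              # slow start: window doubles per RTT
        rtts += 1
    return rtts

print(rtts_to_send(4_000))     # today's ~4 KB chain fits in the first flight
print(rtts_to_send(160_000))   # naive ~160 KB PQ chain
```

Under those assumptions, a ~4 KB chain fits in the initial window, while a ~160 KB chain takes four round trips before the handshake can even finish, which is the extra high-latency cost described above.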
| |
| ▲ | Veserv 4 hours ago | parent [-] | | That is exactly the type of poor design that I was saying should be rectified. You can already configure your initial congestion window, and if you are connecting to a system expecting the use of PQ encryption, you should set your initial congestion window to be large enough for the certificate; doing otherwise is the height of incompetence and should be fixed. You could also use better protocols like QUIC, which has an independently flow-controlled crypto stream, and you can avoid amplification attacks by pre-sending adequate amounts of data to stop amplification prevention from activating. And I fail to see how going from 4 KB of certificate chain to 160 KB of certificate chain poses a serious storage or transmission problem. You can fit literal millions into RAM on reasonable servers. You can fit literal billions into storage on reasonable servers. Sure, if you exactly right-sized your CT servers you might need to upgrade them, but the absolute amount of resources you need for this is minuscule. | | |
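The storage claim here is simple arithmetic worth checking. A rough sketch, with illustrative round numbers only; a real CT log also stores metadata, signatures, and Merkle tree state on top of the raw chains.

```python
# Rough storage cost of ~160 KB chains at CT-log scale.
CHAIN = 160 * 1000  # bytes per naive PQ certificate chain

millions_in_ram = CHAIN * 1_000_000 / 1e9        # GB for one million chains
billions_on_disk = CHAIN * 1_000_000_000 / 1e12  # TB for one billion chains
print(f"{millions_in_ram:.0f} GB for a million, "
      f"{billions_on_disk:.0f} TB for a billion")
```

That is 160 GB of RAM per million chains and 160 TB of disk per billion: large, but within reach of big servers, which is the point being argued (the replies dispute whether transmission, not storage, is the binding constraint).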
| ▲ | ekr____ an hour ago | parent | next [-] | | A few points of technical clarification might help here. 1. The reason for a relatively small initial congestion window (cwnd) is to avoid situations where a lot of connections start up and collectively exceed the capacity of the network, causing congestion collapse. Instead, you start slow and then gradually ramp up, as you learn the available capacity. Slow start started in TCP but it's in QUIC too. Initial windows actually used to be a lot smaller, and TCP only moved up to its current 10-packet initial window (IW10) after a bunch of experimentation that determined it was safe. 2. The congestion window is actually a property of the sender, not the receiver. The receiver advertises the size of their flow control window, but that's about the buffer, not the sending rate (see section 7 of RFC 9002 for the discussion of slow start in QUIC). So in this case, the server controls cwnd, no matter what the client advertises (though the server isn't allowed to exceed the client's advertised flow control window). 3. QUIC and TCP behave fairly similarly in terms of the broad strokes of rate control. As I noted above, QUIC also uses slow start. The amplification limit you mention is a separate limit from initial cwnd, which is intended to avoid blind amplification attacks, because, unlike TCP, QUIC servers can start sending data immediately upon receiving the first packet, so you don't know that the IP address wasn't forged. However, even if the peer's IP is authenticated, that doesn't mean it's safe to use an arbitrarily large initial cwnd. | |
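The amplification limit mentioned here can be illustrated numerically. The 3x factor and the 1200-byte minimum client Initial datagram are from RFC 9000; the ~160 KB chain size is the figure from upthread, and the arithmetic below is only a sketch of the pre-validation first flight.

```python
# QUIC anti-amplification (RFC 9000, Section 8): before the client's
# address is validated, a server may send at most 3x the bytes received.
CLIENT_INITIAL = 1200   # minimum size of a client Initial datagram
AMP_FACTOR = 3

first_flight_cap = AMP_FACTOR * CLIENT_INITIAL
print(first_flight_cap)                  # bytes the server may send blind
print(160_000 // first_flight_cap)       # how many times over a PQ chain is
```

With one 1200-byte Initial, the server can send only 3600 bytes before validating the address: a ~160 KB chain is roughly 44 times that cap, so a large initial cwnd alone cannot eliminate the extra round trips.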
| ▲ | raggi 2 hours ago | parent | prev | next [-] | | > You can already configure your initial congestion window, and if you are connecting to a system expecting the use of PQ encryption, you should set your initial congestion window to be large enough for the certificate; doing otherwise is height of incompetence and should be fixed. The aggressive tone is no defense against practical problems such as the poor scalability of such a solution. > You could also use better protocols like QUIC which has a independently flow controlled crypto stream and you can avoid amplification attacks by pre-sending adequate amounts of data to stop amplification prevention from activating. Not before key exchange it doesn't. There's no magic bullet here. A refresher on the state of TFO and QUIC PMTU might be worthwhile here before jumping this far ahead. | | |
| ▲ | Veserv an hour ago | parent [-] | | You have asserted without evidence that the increased certificate chain size is the primary scaling bottleneck. I assert that the bottleneck is most likely due to accidental complexity elsewhere, on the argument that the claimed problems look to be far in excess of the essential complexity. > Not before key exchange it doesn't. There's no magic bullet here. I was incorrect. Rereading the QUIC standard I see that they do not flow control the CRYPTO packet number space/stream. I thought they did because it is so easy to do that I did it as an afterthought. Truly another example of fundamental design errors introducing accidental complexity that should be fixed instead of papered over. | | |
| ▲ | ekr____ 41 minutes ago | parent [-] | | Can you elaborate a bit more on what you think the unnecessary complexity is here? A basic source of concern is whether it's safe for the server to use an initial congestion window large enough to handle the entire PQ certificate chain without an unacceptable risk of congestion collapse or other negative consequences. This is a fairly complicated question of network dynamics and the interaction of a bunch of different machines potentially sharing the same network resources, and is largely independent of the network protocol in use (QUIC versus TCP). It's possible that IW20 (or whatever) is fine, but it may well not be. There are two secondary issues: 1. Whether the certificate chain is consuming an unacceptable fraction of total bandwidth. I agree that this is less likely for many network flows, but as noted above, there are some flows where it is a large fraction of the total. 2. Potential additional latency introduced by packet loss and the necessary round trips. Every additional packet increases the chance of one of them being lost, and you need the entire certificate chain. It seems you disagree about the importance of these issues, which is an understandable position, but where you're losing me is that you seem to be attributing this to the design of the protocols we're using. Can you explain further how you think (for instance) QUIC could be different in a way that would ameliorate these issues? |
|
| |
| ▲ | nickf 2 hours ago | parent | prev [-] | | Your failure to see the problem doesn’t mean it doesn’t exist. 40x the size might not really be an issue for the hypothetical server you’ve suggested - but that isn’t the reality for the world. Many devices do HTTPS and TLS.
Not to mention the issue is more with the clients.
CT logs would get a lot harder to run (and they’re already not so easy). |
|
|
|
| ▲ | bastawhiz 4 hours ago | parent | prev [-] |
| Let's say you visit a site that doesn't use H2. That's now nearly a megabyte (up from 24kb) of data across the six connections that HTTP/1.1 establishes. You're on LTE? You have high packet loss over a wireless connection? The initial TCP window size is ~16kb in a lot of cases, now you need multiple round trips over a high latency connection just to make the connection secure. You'll probably need 3-4 round trips on a stable connection just for the certificate. On a bad connection? Good luck. |
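The multi-connection arithmetic in this comment is easy to reproduce. A sketch, assuming the six parallel connections HTTP/1.1 clients commonly open, each performing a full TLS handshake, with ~4 KB chains today versus ~160 KB naive PQ chains.

```python
# Certificate bytes repeated across parallel HTTP/1.1 connections.
CONNS = 6                    # typical per-host connection limit for HTTP/1.1
today = CONNS * 4_000        # ~4 KB chain per handshake today
pq    = CONNS * 160_000      # naive ~160 KB PQ chain per handshake

print(f"today: {today // 1000} KB, naive PQ: {pq // 1000} KB")
```

That is 24 KB of certificates today versus 960 KB, i.e. "nearly a megabyte", since unlike HTTP/2 there is no single shared connection to amortize the handshake over.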
| |
| ▲ | Veserv 4 hours ago | parent [-] | | Exactly, HTTP/1.1 is a poorly designed protocol, and there are good reasons why we have newer versions of HTTP which avoid multiple unnecessary encryption handshakes. Exactly, using a blanket default initial congestion window of 16 KB is stupid. It was chosen when average bandwidth was many times less, so it should be increased anyway to something on the order of the average BDP (or you should use a better congestion control algorithm), and it is especially stupid if you are beginning a connection that has a known minimum requirement before useful data can be sent. These things should be fixed as well instead of papered over. Your system should work well regardless of the size of the certificate chain, except for the fundamental overhead of having a larger chain. | | |
| ▲ | bastawhiz 3 hours ago | parent [-] | | I mean, unless you stop supporting H1, you're stuck with it. "Fixing" it means killing it. Unless you break every site/API that uses it, you can't do that. Increasing the initial congestion window is probably smart, but increasing it to a size large enough to hold a 160kb certificate is almost certainly a terrible idea. Lots of people with "broadband" probably never get close to a 160kb congestion window size. Flaky wifi or a bad mobile signal will probably never get above a 32kb congestion window size—that's today, with modern hardware. That's five round trips assuming you start at 32kb and it never increases. You think airplane wifi is bad? Imagine how bad it'll be when the congestion window starts at an order of magnitude bigger than it would normally ever reach. The "fix" means... Well I don't know actually, because if it could be good, you'd think at least one carrier would have good in-flight wifi. I doubt you could overcome the bureaucratic and technical challenges. This isn't a problem that can be "fixed" in a lot of cases. If you optimize for the happy path, you're not just hurting people who literally don't have another option, you're hurting yourself when under bad connections. | | |
| ▲ | Veserv 3 hours ago | parent [-] | | You are not breaking H1, it just runs poorly in a different environment than the one it was created during. This is frankly already true, which is why we literally have had two entire major versions since. A 160 KB congestion window with 50 ms RTT means you are limited to a maximum bandwidth of 3,200 KB/s (~25 Mbps). At 200 ms RTT you are limited to ~6.5 Mbps. At 32 KB you are getting ~5 Mbps and ~1 Mbps, respectively. If you are literally being limited to 1 Mbps, then you should not use an initial 160 KB congestion window, as that is too much for your connection anyway. You can solve this with proper adaptive channel parameter detection in your network stack. In the presence of arbitrarily poor, degraded, or lossy network conditions, you should already be doing this to achieve good throughput and initial connection throughput. A proper design should only really have the problem of "we are literally sending more data, which fundamentally takes an extra N units of time on our K-rate connection". This is a problem that is still worth solving by reducing the size of the certificate chain, but if you have other problems than that, then you should solve them as well. More pointedly, having problems other than that directly points at serious structural design deficiencies that are ossified and brittle. |
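The window-limited throughput figures here follow from max rate ≈ window/RTT, since at most one congestion window's worth of data can be in flight per round trip. A quick sketch using the window sizes and RTTs from the comment:

```python
# Throughput ceiling for a fixed window: one window per round trip.
def max_mbps(window_bytes: int, rtt_ms: float) -> float:
    return window_bytes * 8 / (rtt_ms / 1000) / 1e6

for window, rtt in [(160_000, 50), (160_000, 200),
                    (32_000, 50), (32_000, 200)]:
    print(f"{window // 1000} KB window, {rtt} ms RTT -> "
          f"{max_mbps(window, rtt):.1f} Mbps")
```

This gives ~25.6 and ~6.4 Mbps for a 160 KB window at 50 and 200 ms, and ~5.1 and ~1.3 Mbps for a 32 KB window, matching the approximate figures quoted.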
|
|
|