Remix.run Logo
Sohcahtoa82 5 days ago

> The article mocks Postel's law

As they should. Postel's Law was a terrible idea and has created minefields all over the place.

Sometimes, those mines aren't just bugs, but create gaping security holes.

If your client is sending data that doesn't conform to spec, you have a bug, and you need to fix it. It should never be up to the server to figure out what you meant and accept it.

ndusart 4 days ago | parent | next [-]

Following Postel's law does not mean to accept anything. The received data should still be unambiguous.

You can see that in the case where ASN.1 data need to be exchanged. You could decide to always send them in the DER form (conservative) but accept BER (liberal). BER is still an unambiguous encoding for ASN.1 data but allow several representations for the same data.

The problem with BER mainly lies with cryptographic signature as the signature will only match a specific encoding so that's why DER is used in certificates. But you can still apply Postel's law, you may still accept BER fields when parsing file. If the field has been incorrectly encoded in a varied form which is incompatible with the signature, you will just reject it as you would reject it because it is not standard with DER. But still, you lessen the burden to make sure all parts follow exactly the standards the same way and things tend to work more reliably across server/clients combinations.

jeffrallen 4 days ago | parent | prev | next [-]

I agree that being liberal in what you accept can leave technical debt. But my comment was about the place in the code where they set a cookie with JSON content instead of keeping to a format that is known to pass easily through HTTP header parsing, like base64. They should have been conservative in what they sent.

SilasX 5 days ago | parent | prev | next [-]

You could split the difference with a 397 TOLERATING response, which lets you say "okay I'll handle that for now, but here's what you were supposed to do, and I'll expect that in the future". (j/k it's an April Fool's parody)

https://pastebin.com/TPj9RwuZ

emn13 5 days ago | parent | prev | next [-]

And yet the html5 syntax variation survived (with all it's weird now-codified quirks), and the simpler, stricter xhtml died out. I'm not disagreeing with out; it's just that being flexible, even if it's bad for the ecosystem is good for surviving in the ecosystem.

plorkyeran 5 days ago | parent [-]

There was a lot of pain and suffering along the way to html5, and html5 is the logical end state of postel's law: every possible sequence of bytes is a valid html5 document with a well-defined parsing, so there is no longer any room to be more liberal in what you accept than what the standard permits (at least so far as parsing the document).

emn13 5 days ago | parent [-]

Getting slightly off topic, but I think it's hard to find the right terminology to talk about html's complexities. As you point out, it isn't really a syntax anymore now that literally every sequence is valid. Yet the parsing rules are obviously not as simple as a .* regex. It's syntactically simple, but structurally complex? What's the right term for the complexity represented by how the stack of open elements interacts with self-closing or otherwise special elements?

Anyhow, I can't say I'm thrilled that some deeply nested subtree of divs for instance might be closed by a open-button tag just because they were themselves part of a button, except when... well, lots of exceptions. It's what we have, I guess.

It's also not a (fully) solved problem; just earlier this year I had to work around an issue in the chromium html parser that caused IIRC quadratic parsing behavior in select items with many options. That's probably the most widely used parser in the world, and a really inanely simple repro. I wonder whether stuff like that would slip through as often were the parsing rules at all sane. And of course encapsulation of a document-fragment is tricky due to the context-sensitivity of the parsing rules; many valid DOM trees don't have an HTML serialization.

rendall 5 days ago | parent | prev [-]

[flagged]