maxbond 5 days ago

I would say avoid trying to understand arcane nuances better than the adversary. Assume they've simultaneously got more time on their hands and sat on the relevant standards committees. Adopt a strategy that's robust to having missed a small nuance in the standard or in the particular implementation by this or that browser. (That doesn't mean there isn't value in a blog post enumerating the edge cases, of course.)

Kaminsky described a very simple and nearly universal technique for dealing with escaping/injection issues: encode the embedded data as base64 and decode it on the client side. This projects arbitrary data into a fixed, known domain (generally `[a-zA-Z0-9+/]*`) which you can ensure is free from control characters. (You may need to use a particular variant to achieve this, e.g., for URLs the last two characters used are generally `-_` because both `+` and `/` are significant in that context.)

After decoding, you can pass it to JSON.parse().
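
A minimal sketch of that round trip, assuming a runtime that provides `btoa`, `atob`, `TextEncoder`, and `TextDecoder` (modern browsers and recent Node.js). The function names, the `payload` object, and the `window.__DATA__` variable are hypothetical, chosen only for illustration:

```javascript
// Encode: value -> JSON text -> UTF-8 bytes -> base64.
// Would run on the server or in a build step.
function encodeForEmbedding(value) {
  const bytes = new TextEncoder().encode(JSON.stringify(value));
  let binary = "";
  for (const b of bytes) binary += String.fromCharCode(b);
  return btoa(binary);
}

// Decode: base64 -> UTF-8 bytes -> JSON text -> value.
// Would run on the client.
function decodeEmbedded(b64) {
  const binary = atob(b64);
  const bytes = Uint8Array.from(binary, (ch) => ch.charCodeAt(0));
  return JSON.parse(new TextDecoder().decode(bytes));
}

// Hypothetical hostile payload: it would break out of a script tag
// if it were embedded without any encoding.
const payload = { name: "</script><script>alert(1)</script>", note: "caf\u00e9" };
const encoded = encodeForEmbedding(payload);

// The embedded string only contains characters from the base64 alphabet.
const html = `<script>window.__DATA__ = "${encoded}";</script>`;
```

Going through `TextEncoder`/`TextDecoder` is there because `btoa` only accepts code points up to U+00FF, so non-Latin-1 text would otherwise throw.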

Dylan16807 5 days ago | parent [-]

To me, escaping < for web stuff is just as non-arcane and non-nuanced as base64.
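
For comparison, a sketch of the escaping approach, assuming the data is JSON placed inside an inline `<script>` element. The function name `serializeForScriptTag` is hypothetical. Since `\u003c` is a valid JSON string escape, the output still parses back to the same value:

```javascript
// Serialize a value as JSON, then replace every "<" with its \u003c escape
// so the text cannot contain "</script>" or "<!--".
// In valid JSON text, "<" can only appear inside string literals,
// so the replacement does not change the parsed value.
function serializeForScriptTag(value) {
  return JSON.stringify(value).replace(/</g, "\\u003c");
}

// Hypothetical hostile value.
const data = { comment: "</script><script>alert(1)</script>" };
const safeJson = serializeForScriptTag(data);

const page = `<script>window.__DATA__ = ${safeJson};</script>`;
```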

And yeah use URL-safe base64 when you do use it. -_ with no padding.
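
A small sketch of that variant, assuming you start from standard base64 as produced by `btoa`. The helper names are hypothetical:

```javascript
// Standard base64 -> URL-safe base64: swap "+" and "/" for "-" and "_",
// then drop the trailing "=" padding.
function toBase64Url(b64) {
  return b64.replace(/\+/g, "-").replace(/\//g, "_").replace(/=+$/, "");
}

// URL-safe base64 -> standard base64: swap the characters back and
// restore padding so the length is a multiple of 4.
function fromBase64Url(b64url) {
  const b64 = b64url.replace(/-/g, "+").replace(/_/g, "/");
  return b64 + "=".repeat((4 - (b64.length % 4)) % 4);
}

const urlSafe = toBase64Url(btoa("\xfb\xff\xfe")); // bytes chosen to produce "+" and "/"
const restored = atob(fromBase64Url(urlSafe));
```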

maxbond 4 days ago | parent [-]

Yeah, that's fair, and I did forget about `=`/padding when I discussed base64. This instance is a solved problem with a simple solution, blessed by the standards body.

The advantage of the base64 technique is that it provides fewer degrees of freedom, and so is more robust to unforeseen vectors of attack. It's defensive programming. But it comes at a cost in memory/bandwidth: base64 output is about a third larger than its input.