chrismorgan 3 days ago

Although appealing, that’s an extremely bad idea, when you’re limited to JavaScript. In a language with a better type system, it can be only a very bad idea.

The problem is that different contexts have different escaping rules. It’s not possible to give a one-size-fits-all answer from the server side. It has to be done in a context-aware way.

Field A is plain text. Someone enters the value “Alpha & Beta”. Now, what does your server do? If it sanitises by stripping HTML characters, you’ve just blocked valid input; not good. If it doesn’t sanitise but instead unconditionally escapes HTML, somewhere, sooner or later, you’re going to end up with an “Alpha &amp;amp; Beta” shown to the user, when the value gets used in a place that isn’t taking serialised HTML. It always happens sooner or later. (If it doesn’t sanitise or escape, and the client doesn’t escape but just drops it directly into the serialised HTML, that’s an injection vulnerability.)
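
To make the Field A failure concrete, here is a minimal sketch. `escapeHtml` is a hypothetical helper written for illustration, not any particular framework's API, and the two "contexts" are plain strings standing in for a serialised-HTML sink and a plain-text sink such as `textContent` or `document.title`.

```typescript
// Hypothetical helper (an assumption for illustration): escapes the characters
// that matter in HTML text and double-quoted attribute values.
function escapeHtml(s: string): string {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

const userInput = "Alpha & Beta";

// The server escapes unconditionally before putting the value into its JSON.
const fromServer = escapeHtml(userInput);

// Context 1: serialised HTML. The HTML parser decodes the entity again,
// so the user sees the original text.
const htmlFragment = `<p>${fromServer}</p>`;

// Context 2: a plain-text sink (textContent, document.title, an alert, ...).
// Nothing decodes the entity here, so the user sees the literal "&amp;".
const plainTextSink = fromServer;

console.log(htmlFragment);
console.log(plainTextSink);
```

The same string is correct in the first context and wrong in the second, which is the mismatch described above.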

Field B is HTML. Someone enters the value “<img src=/ onerror=alert('pwnd')>”. Now, what does your server do? If it sanitises by applying a tag/attribute whitelist so that you end up with perhaps “<img src="/">”, fine.

krapp 3 days ago | parent | next [-]

Server-side templating frameworks had context-aware escaping strategies for years before front end frameworks were even a thing. Injection attacks don't persist because this is a hard problem, they persist because security is not a priority over getting a minimum viable product to market for most webdev projects.

The old tried and true strategy of "never sanitize data, push to the database with prepared statements and escape in the templates" is basically bulletproof.
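
A rough sketch of the "escape in the templates" half of that strategy, using a tagged template literal. The `html` tag and `escapeHtml` helper are hypothetical names made up for this example; it assumes every interpolation lands in HTML text or a quoted attribute value, and it leaves out the prepared-statement side entirely.

```typescript
// Hypothetical helper: escape for HTML text and quoted attribute values.
function escapeHtml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Hypothetical template tag: literal chunks pass through untouched,
// every interpolated value is escaped at render time.
function html(strings: TemplateStringsArray, ...values: unknown[]): string {
  return strings.reduce((out, chunk, i) => {
    const escaped = i < values.length ? escapeHtml(String(values[i])) : "";
    return out + chunk + escaped;
  }, "");
}

// The value is stored exactly as the user typed it (no sanitising on the way in).
const stored = "<img src=/ onerror=alert('pwnd')>";

// Escaping happens only here, where the output context is known.
const page = html`<p>${stored}</p>`;

console.log(page);
```

The stored data stays unmodified, so it can still be encoded differently for any other output context.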

naasking 3 days ago | parent | prev | next [-]

You're unnecessarily complicating this. The server is aware of what fields are HTML so it just encodes the data that it returns like we've been doing for 30 years now. If your point is that this approach is only good with servers that you trust, then that's useful to point out, although we kind of already are vulnerable to server data.

chrismorgan 2 days ago | parent [-]

You’re not getting it: we’re not talking about the server producing templated HTML, which is fine; but rather the server producing JSON, and then the client dropping strings from that object directly into serialised HTML. That’s a problem, because the only way to be safe is to entity-encode everything, but then when you use a string in a context that doesn’t use HTML syntax, you’ll get the wrong result.

It’s not an unnecessary complication. You fundamentally need to know what format you’re embedding something into, in order to encode it, and the server can’t know that.

Depending on what you do, you may want it unencoded, encoded for HTML data or double-quoted attribute value state (& → &amp;, < → &lt;, " → &quot;), encoded for a URL query string parameter value (percent-encoding but with & → %26 as well), and there are several more reasonable possibilities even in the browser frontend context.

These encodings are incompatible, therefore it’s impossible for the server to just choose one and have it work everywhere.
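
A small sketch of that incompatibility: the same value encoded for two of the contexts listed above. `encodeForHtml` is a hypothetical helper covering only the three characters mentioned; `encodeURIComponent` is the standard built-in, used here for the query-string case.

```typescript
// Hypothetical helper: encoding for HTML data or a double-quoted attribute value.
function encodeForHtml(s: string): string {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/"/g, "&quot;");
}

// Encoding for a URL query string parameter value, using the built-in.
function encodeForQueryValue(s: string): string {
  return encodeURIComponent(s);
}

const value = 'Alpha & "Beta" <3';

const asHtml = encodeForHtml(value);
const asQuery = encodeForQueryValue(value);

// Each result is only correct in its own context.
const paragraph = `<p title="${asHtml}">${asHtml}</p>`;
const link = `/search?q=${asQuery}`;

console.log(paragraph);
console.log(link);
```

Neither output can stand in for the other, so whichever one the server picked in advance would be wrong for the other use.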

naasking 21 minutes ago | parent [-]

> It’s not an unnecessary complication. You fundamentally need to know what format you’re embedding something into, in order to encode it, and the server can’t know that.

There are two cases here:

1. Backend endpoints are specifically tied to the view being generated (returns viewmodels), in which case the server knows what the client is rendering and can encode it. This frankly should be the default approach because it minimizes network traffic and roundtrips. The original code displayed is perfectly fine in this case.

2. Endpoints are generic and the client assembles views by making multiple requests to various endpoints and takes on the responsibility that server-side frameworks used to do, including encoding.
