epec254 10 hours ago

A key challenge with HTML is client-side trust. How do I enable an agent platform (say Gemini, Claude, OpenAI) to render UI from an untrusted third-party agent that’s integrated with the platform? This is a common scenario in the enterprise versions of these apps, e.g. I want to use the agent from (insert SaaS vendor) alongside my company’s homegrown agents and data.

Most HTML is actually HTML+CSS+JS. IMO, accepting this from an untrusted agent is a code injection attack waiting to happen. By abstracting to JSON, a client can safely render UI without this concern.
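To make the JSON idea concrete, here is a minimal sketch of a client that only renders components it already knows about. The component names (`text`, `button`, `column`), their props, and the `data-action` attribute are all hypothetical, chosen for illustration; they are not taken from any particular protocol.

```python
import html
import json

# Hypothetical component catalog: names and allowed props are illustrative only.
ALLOWED = {
    "text": {"value"},
    "button": {"label", "action"},
    "column": set(),
}

def render(node: dict) -> str:
    """Render one JSON component to HTML; unknown components or props raise."""
    kind = node.get("type")
    if kind not in ALLOWED:
        raise ValueError(f"unknown component: {kind!r}")
    props = node.get("props", {})
    extra = set(props) - ALLOWED[kind]
    if extra:
        raise ValueError(f"unexpected props on {kind}: {sorted(extra)}")
    if kind == "text":
        # Agent-supplied strings are always escaped, never parsed as markup.
        return f"<p>{html.escape(str(props.get('value', '')))}</p>"
    if kind == "button":
        label = html.escape(str(props.get("label", "")))
        action = html.escape(str(props.get("action", "")), quote=True)
        # The action is an opaque name the host interprets; it is not code.
        return f'<button data-action="{action}">{label}</button>'
    children = "".join(render(child) for child in node.get("children", []))
    return f"<div>{children}</div>"

payload = json.loads('''
{"type": "column", "children": [
  {"type": "text", "props": {"value": "<script>alert(1)</script>"}},
  {"type": "button", "props": {"label": "Approve", "action": "approve"}}
]}
''')
rendered = render(payload)
```

The markup is produced entirely by the host's own templates, so the agent can only choose among components and supply escaped strings.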

lunar_mycroft 10 hours ago

If the JSON protocol in question supports arbitrary behaviors and styles, then you still have an injection problem even over JSON. If it doesn't support them, you don't need to support them in an HTML protocol either, and you can solve the injection problem the way we already do: sanitizing the HTML to remove some or all (depending on your specific requirements) script tags, event listeners, etc.
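As a rough illustration of the sanitizing approach, the sketch below uses Python's standard `html.parser` with an allowlist. The tag and attribute lists are assumptions picked for the example, and a hand-rolled sanitizer like this is easy to get wrong, so a vetted sanitizing library would be the safer choice in practice.

```python
from html import escape
from html.parser import HTMLParser

# Illustrative allowlists; a real deployment would choose its own.
ALLOWED_TAGS = {"p", "b", "i", "ul", "li", "a"}
ALLOWED_ATTRS = {"a": {"href"}}
DROP_WITH_CONTENT = {"script", "style"}

class Sanitizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return  # the tag is dropped; its text content is kept, escaped
        kept = []
        for name, value in attrs:
            if name not in ALLOWED_ATTRS.get(tag, set()):
                continue  # drops onclick, style, and anything else unlisted
            value = value or ""
            if name == "href" and not value.lower().startswith(("http://", "https://")):
                continue  # drops javascript: and other non-http(s) URLs
            kept.append(f' {name}="{escape(value, quote=True)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif not self.skip_depth and tag in ALLOWED_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data))

def sanitize(markup: str) -> str:
    parser = Sanitizer()
    parser.feed(markup)
    parser.close()
    return "".join(parser.out)

dirty = '<p onclick="evil()">Hi <b>there</b></p><script>alert(1)</script><a href="javascript:alert(1)">x</a>'
clean = sanitize(dirty)
```

Here the event handler, the script element with its content, and the `javascript:` link target are all removed, while the allowed tags and their text survive.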

epicurean 9 hours ago

Perhaps the protocol is then HTML/CSS/JS in a strict sandbox: the component has no access to anything outside its own bounds (no network, no DOM/object access, no draw access, etc.).

awei 8 hours ago

I think you can do that with an iframe, but it always makes me nervous.
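For reference, the iframe approach usually combines the `sandbox` attribute with `srcdoc` and a Content Security Policy. The sketch below builds such a frame on the host side. The helper name `sandboxed_iframe` and the specific policy string are assumptions for illustration, and this narrows what the frame can reach rather than guaranteeing that every exfiltration channel is closed.

```python
from html import escape

# Assumed policy: no network loads from inside the frame, inline script/style only.
CSP = "default-src 'none'; script-src 'unsafe-inline'; style-src 'unsafe-inline'"

def sandboxed_iframe(untrusted_html: str) -> str:
    """Wrap untrusted markup in an iframe using sandbox + srcdoc."""
    doc = (
        f'<meta http-equiv="Content-Security-Policy" content="{CSP}">'
        + untrusted_html
    )
    # "allow-scripts" is granted but "allow-same-origin" is deliberately withheld,
    # so the framed document gets an opaque origin and cannot reach the
    # parent page's DOM, cookies, or storage.
    return f'<iframe sandbox="allow-scripts" srcdoc="{escape(doc, quote=True)}"></iframe>'

frame = sandboxed_iframe('<script>document.title = "x"</script>')
```

The untrusted markup is attribute-escaped before it goes into `srcdoc`, so it cannot break out of the attribute and inject into the host page itself.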

awei 10 hours ago

Right, this makes sense. I wonder if it would then be a good idea to abstract HTML to JSON, making it impossible to include CSS and JS in it.

epec254 10 hours ago

Curious to learn more about what you are thinking.

One challenge is that you likely do want JS to process/capture the data, for example taking the data from a form and turning it into JSON to send back to the agent.
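One possible way around this, sketched below, is for the host's own trusted code to do the capture: the agent declares the form as data, and the host builds the JSON reply. The `form_spec` shape, the field types, and `build_submission` are all hypothetical names invented for this example.

```python
import json

# Hypothetical form declaration sent by the agent (data only, no script).
form_spec = {
    "form_id": "expense",
    "fields": [
        {"name": "amount", "type": "number", "required": True},
        {"name": "note", "type": "string", "required": False},
    ],
}

# Host-defined coercions for the field types this sketch supports.
COERCE = {"number": float, "string": str}

def build_submission(spec: dict, raw_values: dict) -> str:
    """Turn raw form values into the JSON payload sent back to the agent."""
    data = {}
    for field in spec["fields"]:
        name = field["name"]
        raw = raw_values.get(name, "")
        if raw == "":
            if field["required"]:
                raise ValueError(f"missing required field: {name}")
            continue  # optional and empty: leave it out of the payload
        data[name] = COERCE[field["type"]](raw)
    # Only declared fields are copied, so stray inputs never reach the agent.
    return json.dumps({"form_id": spec["form_id"], "data": data}, sort_keys=True)

submission = build_submission(form_spec, {"amount": "12.50", "note": "", "extra": "x"})
```

In this arrangement the agent never ships code; it only receives a payload whose shape it declared up front.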

oooyay 10 hours ago

If you play with A2UI's generator, that's effectively what it does, just a layer of abstraction or two above what you're describing.

awei 10 hours ago

That's what I thought too, skimming through the documentation. My thinking is that since it does that, which makes sense to avoid script injection, why not do it with "jsonized" HTML?

oooyay 6 hours ago

I was thinking that raw HTML might be too verbose, whereas canned components have signatures and types.