Remix.run Logo
rao-v 6 hours ago

I don’t really think this reflects the current era of challenges?

The “enforcement layer” is the hardest and most important part, and is barely addressed.

- is the answer structurally / syntactically valid?

- is it appropriately grounded and evidenced?

- is it accurate? In what ways does it fall short?

Each of these should be triggering an agent to rework and resubmit etc. or failing that a disclosure to the user about how the answer falls short and should be reviewed / remediated.

This feels like it’s from the era of trying to oneshot a good enough answer.