rao-v 6 hours ago
I don’t really think this reflects the current era of challenges? The “enforcement layer” is the hardest and most important part, and it is barely addressed:

- Is the answer structurally / syntactically valid?
- Is it appropriately grounded and evidenced?
- Is it accurate? In what ways does it fall short?

Each of these checks should trigger the agent to rework and resubmit, or, failing that, a disclosure to the user about how the answer falls short and what should be reviewed / remediated. This feels like it’s from the era of trying to one-shot a good-enough answer.
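
To make the loop concrete, here is a rough sketch of what I mean, not a real framework. Everything in it is a made-up name for illustration: `enforce`, `Result`, the two toy checks, and the assumption that the answer is JSON with a `sources` field. The accuracy check is left out because it usually needs an external judge or reference data.

```python
import json
from dataclasses import dataclass, field
from typing import Callable, List, Optional

# A check returns None when the answer passes, or a short description of the shortfall.
Check = Callable[[str], Optional[str]]
# The generator receives the shortfalls from the previous attempt (empty on the first call).
Generate = Callable[[List[str]], str]


@dataclass
class Result:
    answer: str
    shortfalls: List[str] = field(default_factory=list)  # unresolved issues to disclose
    attempts: int = 0


def enforce(generate: Generate, checks: List[Check], max_attempts: int = 3) -> Result:
    """Run every check on each draft; rework until clean or the budget runs out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    feedback: List[str] = []
    answer = ""
    for attempt in range(1, max_attempts + 1):
        answer = generate(feedback)  # rework using the last round of shortfalls
        feedback = [msg for check in checks if (msg := check(answer)) is not None]
        if not feedback:
            return Result(answer, [], attempt)
    # Budget exhausted: return the last draft plus what is still wrong with it.
    return Result(answer, feedback, max_attempts)


def check_syntax(answer: str) -> Optional[str]:
    """Toy structural check: assumes the answer is supposed to be JSON."""
    try:
        json.loads(answer)
    except json.JSONDecodeError as exc:
        return f"not valid JSON: {exc.msg}"
    return None


def check_grounding(answer: str) -> Optional[str]:
    """Toy grounding check: assumes evidence lives in a non-empty 'sources' field."""
    try:
        data = json.loads(answer)
    except json.JSONDecodeError:
        return None  # the syntax check already reports this
    if not isinstance(data, dict) or not data.get("sources"):
        return "no sources cited"
    return None
```

The caller then either gets a clean answer or a draft with a non-empty `shortfalls` list, which is the part that should be shown to the user instead of being silently dropped.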