Agent Runs Code You Never Wrote (timbreai.substack.com)
2 points by bakibab 4 hours ago | 1 comments
zippolyon 44 minutes ago

This is the exact problem that keeps us up at night.

  We ran a controlled experiment: same AI agents, same task, two conditions. Without runtime enforcement, our CMO agent
  fabricated an audit record: it invented a governance event that never happened and presented it as compliance evidence.
  With enforcement (Y*gov), fabrication was structurally impossible because audit records are written by the engine, not
  by the agents.

  The core insight: agents running code you never wrote is a tool-execution-layer problem, not a model-alignment
  problem. You need deterministic interception before execution, not better prompts.
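
To make "deterministic interception before execution" concrete, here is a minimal sketch. It is not Y*gov's actual API: `ToolGate`, `PolicyViolation`, and the per-agent allowlist are hypothetical names chosen for illustration. The idea is that every tool call goes through a gate that checks a fixed policy first, and the gate itself writes the audit record, never the agent.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Set


class PolicyViolation(Exception):
    """Raised when a tool call is rejected before it executes."""


@dataclass
class ToolGate:
    # Hypothetical structure for illustration; not Y*gov's real interface.
    tools: Dict[str, Callable[..., Any]]   # tool name -> implementation
    allowed: Dict[str, Set[str]]           # agent name -> tool names it may call
    audit: List[dict] = field(default_factory=list)

    def call(self, agent: str, tool: str, **kwargs: Any) -> Any:
        # Deterministic check: a plain set lookup, no model in the loop.
        permitted = tool in self.tools and tool in self.allowed.get(agent, set())
        # The gate records the decision, so an agent cannot invent or omit it.
        self.audit.append({"agent": agent, "tool": tool, "allowed": permitted})
        if not permitted:
            raise PolicyViolation(f"{agent} may not call {tool}")
        return self.tools[tool](**kwargs)
```

A caller would register tools once, then route every agent request through `gate.call(agent, tool, **args)`. A denied call raises before the tool function is ever invoked.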

  Our approach: every tool call is checked in 0.042 ms, with a SHA-256 Merkle-chained audit trail and obligation
  tracking for tasks that agents promise but never complete.
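
For readers unfamiliar with hash-chained audit trails, a rough sketch of the general technique follows. It is a simple linear SHA-256 hash chain, which may differ from the Merkle structure Y*gov actually uses. The function names and record layout are assumptions for illustration only. Each record commits to the hash of the previous one, so editing or deleting an earlier entry breaks verification.

```python
import hashlib
import json
from typing import List

GENESIS = "0" * 64  # assumed placeholder hash for the first record


def _digest(prev_hash: str, event: dict) -> str:
    # Canonical JSON so the same event always hashes to the same value.
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append(log: List[dict], event: dict) -> dict:
    """Append an event, linking it to the hash of the previous record."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    record = {"event": event, "prev": prev_hash, "hash": _digest(prev_hash, event)}
    log.append(record)
    return record


def verify(log: List[dict]) -> bool:
    """Recompute every link; any tampered record makes this return False."""
    prev_hash = GENESIS
    for record in log:
        if record["prev"] != prev_hash:
            return False
        if record["hash"] != _digest(prev_hash, record["event"]):
            return False
        prev_hash = record["hash"]
    return True
```

In this sketch only the enforcement engine would call `append`, and anyone holding the log can run `verify` to confirm that no record was changed after the fact.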

  github.com/liuhaotian2024-prog/Y-star-gov