CuriouslyC 3 hours ago

I get where you're coming from, having wrestled with Codex/CC to get them to actually emit everything needed to do proper evals.

From a "correct solution" standpoint, having one source of truth for evals, agent memory, prompt history, etc. is the right path. We already have the infra to do it well; we just need to smooth out the path. What bugs me is people inventing half solutions that seem rooted in ignorance or the desire to "capture" users, and then seeing those solutions get traction/mindshare.