alistairSH 6 hours ago

How is success defined in those metrics? Is success "perfect - can deploy to prod immediately" or "saved some arbitrary amount of engineering time"?

Anecdotal experience from my team of 15 engineers is that we rarely get "perfect", but we do get enough to see massive time savings across several common problem domains.

Esophagus4 4 hours ago

I think for me, it’s not so much an objective success metric as it is showing its progression over time.

What amazes me is how fast LLMs are progressing. And it still feels like early days (!).

For methodology, though, I would check out the METR website; they’ve published their results.