amoss 2 hours ago
> rollout will be incremental and it will self monitor by defining success conditions at rollout time.

This sounds a lot like allowing an LLM to define the tests as well as the implementation, and then allowing the LLM to update the tests to make the code pass. Recently people have come to understand (again?) that testing and evaluation work better outside of the sandbox.