HeytalePazguato 5 hours ago
This matches what I've seen working with automated systems. The watching part is genuinely underrated. Evals give you a score. Watching gives you intuition about failure modes you didn't know to test for. Sitting with a running system teaches you things you would never think to measure.