nojs 10 hours ago
My working theory is that all models are approximately the same, and the variance in quality mostly depends on how long they think for. So the trick is to always set thinking to max, and then begin every task with "this is an extremely complex task, do not complete it without extensive deep thinking and research" or whatever. You're basically fighting a battle to make the model think more, against the defaults getting more and more nerfed to save costs.
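The trick described above can be sketched as a small wrapper that prepends the "think harder" preamble and pins the thinking budget to its maximum. This is a hypothetical illustration: the request shape and the parameter name `thinking_budget_tokens` are invented for the sketch and do not correspond to any specific provider's API.

```python
# Hypothetical sketch of the commenter's trick: prepend a preamble
# urging deep reasoning and force a maximal "thinking" budget.
# Parameter names are assumptions, not a real provider's API.

MAX_THINKING_TOKENS = 32_000  # assumed provider maximum

PREAMBLE = (
    "This is an extremely complex task; do not complete it "
    "without extensive deep thinking and research.\n\n"
)

def build_request(task: str) -> dict:
    """Wrap a task prompt with the preamble and max thinking budget."""
    return {
        "prompt": PREAMBLE + task,
        "thinking_budget_tokens": MAX_THINKING_TOKENS,
    }

req = build_request("Refactor the auth module.")
```

The point of centralizing this in one helper is that every task gets the same nudge, rather than relying on remembering to type the preamble each time.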
beering 4 hours ago | parent
My experience has been that this isn't generally true, mainly because worse models pursue red herrings or get confused and stuck. A better model will get to the correct solution in fewer tokens, and my surface-level understanding of how RL works supports this.