| ▲ | jorvi 11 hours ago | |
I don't know if this still holds for current models, but a few generations back Microsoft had some research results showing that asking a model to iterate N times significantly improved performance, with the optimum at 4 iterations.
| ▲ | andai 7 hours ago | |
I think there's a sweet spot for it. If a model can't do a task, iterating won't help. If a model can do it reliably, there's no need to iterate. If it can do it, but unreliably, that's where you would get major gains from iterating. I think the Chinese models are in that sweet spot for many tasks.

I would love to test that. I started working on my own fusion system yesterday. I'm not sure how to benchmark it, though. The thing I'm most interested in is reliability. Going from 90% to 95% on a benchmark doesn't seem like much, but you've cut the error rate in half.
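To put rough numbers on the sweet-spot idea and the error-rate point, here is a minimal sketch. It assumes each attempt succeeds independently with probability `p` and that something (a test suite, a verifier) can reliably pick out a correct attempt, so it is an optimistic upper bound and not a model of what the Microsoft research did. The function names and the choice of `n = 4` (borrowed from the parent comment) are purely illustrative.

```python
def error_reduction(acc_before: float, acc_after: float) -> float:
    """Fraction of the original errors that disappear when accuracy improves."""
    err_before = 1.0 - acc_before
    err_after = 1.0 - acc_after
    return 1.0 - err_after / err_before


def success_within(p: float, n: int) -> float:
    """Chance that at least one of n attempts succeeds.

    Assumption: attempts are independent and a correct one can be recognized.
    """
    return 1.0 - (1.0 - p) ** n


if __name__ == "__main__":
    # 90% -> 95% accuracy: the error rate goes from 10% to 5%.
    print(f"errors removed: {error_reduction(0.90, 0.95):.0%}")

    # Gain from 4 attempts over a single attempt, for different base rates.
    for p in (0.0, 0.1, 0.5, 0.9, 1.0):
        gain = success_within(p, 4) - p
        print(f"p={p:.1f}  4 tries={success_within(p, 4):.3f}  gain={gain:+.3f}")
```

Under these assumptions the gain is zero at both ends (`p = 0` and `p = 1`) and largest for models that succeed only some of the time, which is the sweet spot described above.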
| ▲ | Garlef 11 hours ago | |
> but a few generations back

Out of interest: was this still before CoT/thinking mode became the norm?