patall 4 hours ago
Maybe a naive question: given that they see better performance with more passes, but the effect plateaus after a few, would performance improve if they used a different model per pass, i.e. leanstral, kimi, qwen and then leanstral again, instead of 4x leanstral?
andai 4 hours ago
This is called an "LLM alloy". You can even do it in an agentic setup, where you simply swap the model on each LLM invocation. It does actually significantly boost performance. There was an article on here about it recently, I'll see if I can find it. Edit: https://news.ycombinator.com/item?id=44630724 They found that the more different the models were (the less overlap in correctly solved problems), the more it boosted the score.
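To make the "swap the model on each invocation" idea concrete, here is a rough sketch, not the linked article's actual implementation. `run_alloy` and `make_stub` are hypothetical names, and the stub functions stand in for real API calls to each model; only the rotation logic is the point.

```python
from itertools import cycle
from typing import Callable, Dict, List

# Assumption: a "model" is any function from the conversation so far to a reply.
Model = Callable[[List[str]], str]


def run_alloy(models: Dict[str, Model], task: str, max_steps: int) -> List[str]:
    """Run a simple loop that rotates to the next model on every invocation."""
    history = [task]
    rotation = cycle(models.items())  # dicts keep insertion order in Python 3.7+
    for _ in range(max_steps):
        name, model = next(rotation)
        reply = model(history)  # every model sees the same shared history
        history.append(f"{name}: {reply}")
    return history


def make_stub(tag: str) -> Model:
    """Hypothetical stand-in for a real model client."""
    def stub(history: List[str]) -> str:
        return f"{tag} saw {len(history)} message(s)"
    return stub


# Model names are taken from the question above; the callables are stubs.
models = {
    "leanstral": make_stub("A"),
    "kimi": make_stub("B"),
    "qwen": make_stub("C"),
}

history = run_alloy(models, "prove the lemma", max_steps=4)
for line in history:
    print(line)
```

With four steps and three models, the rotation comes out as leanstral, kimi, qwen, leanstral, which is the ordering asked about upthread. The key design choice is that all models share one history, so each one continues from whatever the previous one produced.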