Maybe a naive question: given that they see better performance with more passes, but the effect plateaus after a few, would performance increase if they used a different model per pass, i.e. leanstral, kimi, qwen, and leanstral again instead of 4x leanstral?

andai 4 hours ago | parent [-]

This is called an "LLM alloy". You can even do it in agentic workflows, where you simply swap the model on each LLM invocation.
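A minimal sketch of what swapping the model on each invocation could look like. The `make_stub` helper and the model names are hypothetical stand-ins, not real API clients; in practice each callable would wrap a request to a different provider.

```python
from itertools import cycle
from typing import Callable, List

# Hypothetical model interface: takes the shared history, returns the next message.
ModelFn = Callable[[List[str]], str]


def make_stub(name: str) -> ModelFn:
    """Build a fake model that just reports its name and the step number."""
    def call(history: List[str]) -> str:
        return f"{name}: step {len(history)}"
    return call


def run_alloy(models: List[ModelFn], task: str, max_steps: int) -> List[str]:
    """Run an agent loop, rotating to the next model on every invocation."""
    history = [task]
    rotation = cycle(models)
    for _ in range(max_steps):
        model = next(rotation)
        # Every model sees the same shared history, including other models' turns.
        history.append(model(history))
    return history


models = [make_stub("model_a"), make_stub("model_b")]
transcript = run_alloy(models, "prove lemma", max_steps=4)
print(transcript)
```

Round-robin is only one possible schedule; picking a model at random per step would fit the same loop.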

It does actually significantly boost performance. There was an article on here about it recently, I'll see if I can find it.

Edit: https://news.ycombinator.com/item?id=44630724

They found the more different the models were (the less overlap in correctly solved problems), the more it boosted the score.
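To make "overlap in correctly solved problems" concrete, here is a rough sketch that compares two models' solved sets. The problem IDs are made up for illustration and are not results from the article, and `overlap_and_union` is a hypothetical helper. The union count is only a loose proxy for headroom, since an alloy can land above or below it.

```python
from typing import Set, Tuple


def overlap_and_union(solved_a: Set[int], solved_b: Set[int]) -> Tuple[float, int]:
    """Return (Jaccard overlap, count of problems solved by at least one model)."""
    union = solved_a | solved_b
    jaccard = len(solved_a & solved_b) / len(union) if union else 0.0
    return jaccard, len(union)


# Made-up problem IDs, for illustration only.
model_a = {1, 2, 3, 4, 5, 6}
similar_model = {1, 2, 3, 4, 5, 7}      # solves almost the same problems
different_model = {1, 2, 7, 8, 9, 10}   # solves mostly different problems

print(overlap_and_union(model_a, similar_model))
print(overlap_and_union(model_a, different_model))
```

All three hypothetical models solve six problems on their own, but the low-overlap pair covers ten between them while the high-overlap pair covers only seven.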

patall 4 hours ago | parent [-]

That sounds quite interesting. Makes me wonder if sooner or later they will have to train multiple independent models that cover those different niches. Maybe we will see that before long. Thanks for the link.

cyanydeez 4 hours ago | parent [-]

One would think that, with LoRAs being so successful in Stable Diffusion, more people would be focused on constructing framework-based LoRAs. But the economics of all this probably preclude going niche in any direction, so everyone just keeps building the do-all models.