| ▲ | JumpCrisscross 7 hours ago |
| I’m so confused. The top fusion is Fable 5 and GPT 5.5. That is not an “ensemble of weaker AI models.” |
|
| ▲ | maniacwhat 7 hours ago | parent | next [-] |
| What it's saying is that if you look at any single model, it can be beaten by an ensemble of weaker models.
E.g., Fable 5 is beaten by an ensemble of previous-gen models. |
| |
| ▲ | JumpCrisscross 7 hours ago | parent [-] | | I guess so. 4.8 + 4.8 > Fable 5 is interesting, though not particularly game-changing. (The others all fuse frontier models, which is an argument for using those frontier models more, not less.) | | |
| ▲ | pants2 6 hours ago | parent [-] | | Yeah, all that's really saying is that a weaker model with a better harness can beat a stronger model with a worse harness, specifically on the DRACO benchmark. This isn't really a surprising result. It needs more evidence to make a broader claim. |
|
|
|
| ▲ | tim-star 7 hours ago | parent | prev [-] |
| I guess the point is that any fusion is better than any single model, and a fusion of the top two models is obviously the best?
For cost, though, I guess you could just duct-tape together 10 open-source models and then that's comparable? |
| |
| ▲ | JumpCrisscross 7 hours ago | parent [-] | | > though i guess you could just duct tape together 10 open source models and then thats comparable? This is what I was hoping to see data for. | | |
|