The comparison results seem very plausible.

From the conclusion, I agree with:

> I wouldn't make either one the top-level coordinator by default.

But I do not agree with the follow-up sentence:

> The best shape is still a frontier coordinator or judge above them: GPT-5.5 or Claude Opus deciding what to delegate, checking the finished work, and rerunning narrow pieces when the answer looks wrong. These models make the worker layer much more serious, not the coordinator layer unnecessary.

As the coordinator or judge above them I would put myself, not an overly expensive LLM controlled by an external entity. That gives higher quality, lower cost and greater security all at once.

throwa356262 a day ago

A lot of LLM discussion is driven by people who cannot code themselves.

There are multiple AI influencers on YouTube who can't code 5 lines of Python to save their lives. But they do own 3 DGX Sparks and a stack of maxed-out Mac minis...

(Not complaining, AI is supposed to be democratic)

incrudible a day ago

> For the coordinator or judge above them I would put myself

You will not be able to keep up with the sheer volume, or alternatively you're never gonna ingest as much information as the LLM, so you're gonna miss out. Input tokens are relatively cheap.

Think of yourself as the CTO: a CTO can't possibly make a judgment call on every detail, but an LLM can. If you're gonna let an LLM do that, you might as well go with frontier, and if you're not gonna let an LLM do that, you're stuck with whatever the lower-tier LLMs provided you with.

That doesn't mean you shouldn't read or judge the code at all, but you're still gonna want to use the LLM as the lever.

halJordan 10 hours ago

Yeah, the comment you're responding to doesn't understand the workflow being discussed. And of course that makes the person believe they're genius-level on the topic.