I wonder to what extent models should figure out which model to forward a query to. Or perhaps the big models could learn the difference between an easy and a hard question and charge accordingly? Perhaps, if it can measure complexity, even generate a quote?

Small models are fine for small coding tasks but I don't see why big ones can't be broken down most of the time.

▲

AgentMasterRace 3 hours ago | parent | next [-]

Many harnesses do this, I've recently dropped all my big subscriptions for using deepseek. Codewhale (formerly deepseek-tui) will use pro for large tasks and route smaller ones to flash. It's pretty good, but I just use pro and everything as the cost is quite low.

This one does not have routing, but reasonix is insane, absolutely insane for saving money. I've used 1.3billion tokens at the cost of 4$. (99-100% cache hit)

▲

ValentineC 3 hours ago | parent | prev [-]

> I wonder to what extent models should figure out which model to forward a query to. Or perhaps the big models could learn the difference between an easy and a hard question and charge accordingly?

This sounds like something a harness could do (and might already be doing), with work delegated to subagents running on lower-cost models.

	▲	jorl17 an hour ago \| parent [-]
		Yes, they are all already doing this