I think the next major innovation is going to be intelligent model routing. I've been exploring OpenClaw and OpenRouter, and there is a real lack of options to select the best model for the job and execute. The providers are trying to do that with their own models, but none of them offer everything to everyone at all times. I see a future with increasingly niche models being offered for all kinds of novel use cases. We need a way to fluidly apply the right model for the job.

▲

nylonstrung 5 hours ago | parent | next [-]

Agree that routing is becoming the critical layer here. Vllm iris is really promising for this https://blog.vllm.ai/2026/01/05/vllm-sr-iris.html

There's already some good work on router benchmarking which is pretty interesting

▲

condiment 3 hours ago | parent | prev | next [-]

At 16k tokens/s why bother routing? We're talking about multiple orders of magnitude faster and cheaper execution.

Abundance supports different strategies. One approach: Set a deadline for a response, send the turn to every AI that could possibly answer, and when the deadline arrives, cancel any request that hasn't yet completed. You know a priori which models have the highest quality in aggregate. Pick that one.

	▲	IanCal an hour ago \| parent [-]
		The best coding model won’t be the best roleplay one which won’t be the best at tool use. It depends what you want to do in order to pick the best model.

▲

monooso 4 hours ago | parent | prev | next [-]

I came across this yesterday. Haven't tried it, but it looks interesting:

https://agent-relay.com/

▲

eshaham78 5 hours ago | parent | prev [-]

[dead]