amelius 4 hours ago

I'm betting on a company like Taalas making a model that is perhaps less capable but 100x as fast, where you could have dozens of agents looking at your problem from all different angles simultaneously, and so still have better results and faster.

100ms 2 hours ago | parent | next [-]

I'm excited for Taalas, but the worry with that suggestion is that it would blow up energy per net unit of work, which kills a lot of Taalas' buzz. Still, it's inevitable: if you make something an order of magnitude faster, folks will just come along and feed it an order of magnitude more work. I hope the middle ground with Taalas is a cottage industry of LLM hosts with small-to-mid-sized budgets serving last-gen models quite cheaply. Although if they're packed to max utilisation with all the new workloads they enable, latency might not be much better than what we already have today.

andai 4 hours ago | parent | prev [-]

Yeah, it's a search problem. When verification is cheap, trading per-attempt success rate for a massive reduction in cost and runtime is the right approach.
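A back-of-the-envelope sketch of that tradeoff (all numbers are illustrative assumptions, not measurements from any real model): if each attempt succeeds independently with probability p and verification tells you when one has succeeded, then N cheap attempts succeed with probability 1 - (1 - p)^N, so a much weaker model can still win on overall success rate if it is cheap enough to run many times.

```python
# Illustrative only: hypothetical per-attempt success rates and costs,
# assuming attempts are independent and verification is free.

def success_prob(p: float, n: int) -> float:
    """Probability that at least one of n independent attempts succeeds."""
    return 1.0 - (1.0 - p) ** n

# Hypothetical strong model: 60% per-attempt success, 1 unit of cost.
strong = success_prob(0.60, 1)

# Hypothetical weak model: 15% per attempt, but ~100x cheaper per attempt,
# so 30 attempts cost well under one strong attempt.
weak = success_prob(0.15, 30)

print(f"strong model, 1 attempt:  {strong:.3f}")
print(f"weak model, 30 attempts: {weak:.3f}")
```

The caveat is the independence assumption: in practice a weak model's failures are correlated, so the real curve flattens well before 1 - (1 - p)^N suggests.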

never_inline 4 hours ago | parent [-]

You're underestimating the algorithmic complexity of such brute forcing, and the indirect cost of the brittle code produced by inferior models.