Remix clone Hacker News


	▲	condiment 3 hours ago
		At 16k tokens/s, why bother routing? We're talking about execution that is multiple orders of magnitude faster and cheaper, and abundance supports different strategies. One approach: set a deadline for the response, send the turn to every AI that could possibly answer, and when the deadline arrives, cancel any request that hasn't completed. You know a priori which models have the highest quality in aggregate, so pick the highest-ranked one among those that finished.
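		A minimal sketch of that deadline fan-out, assuming Python's asyncio. The model names, the `RANKING` list, and `fake_model` are hypothetical stand-ins for real inference calls, not any provider's API; the simulated latencies exist only to make the example runnable.

```python
import asyncio

# Hypothetical aggregate quality ranking, best first (placeholder names).
RANKING = ["model_a", "model_b", "model_c"]


async def fake_model(name, delay, prompt):
    # Stand-in for a real inference call; sleeps to simulate latency.
    await asyncio.sleep(delay)
    return f"{name}: reply to {prompt!r}"


async def fan_out(prompt, latencies, deadline):
    # Send the same turn to every candidate model at once.
    tasks = {
        name: asyncio.create_task(fake_model(name, latencies[name], prompt))
        for name in RANKING
    }
    # Wait until the deadline; whatever has not finished is still pending.
    done, pending = await asyncio.wait(tasks.values(), timeout=deadline)
    for task in pending:
        task.cancel()
    # Let the cancellations settle before returning.
    await asyncio.gather(*pending, return_exceptions=True)
    # Pick the highest-ranked model among those that finished without error.
    for name in RANKING:
        task = tasks[name]
        if task in done and task.exception() is None:
            return name, task.result()
    return None, None


def pick(prompt, latencies, deadline):
    # Synchronous convenience wrapper around fan_out.
    return asyncio.run(fan_out(prompt, latencies, deadline))
```

		For example, `pick("hi", {"model_a": 1.0, "model_b": 0.01, "model_c": 0.02}, 0.2)` selects `model_b`, because the top-ranked `model_a` misses the deadline and is cancelled. If nothing finishes in time, the sketch returns `(None, None)` and the caller has to decide on a fallback.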
	▲	IanCal an hour ago
		The best coding model won’t be the best roleplay one, which won’t be the best at tool use. Which model is best depends on what you want to do.