Remix clone Hacker News

	▲	brikym 2 days ago
		It's all trade-offs between price, speed and accuracy. A free model is no good when the latency is 10s+ and the throughput is below 100 tokens/s, which is often the case on OpenRouter. So I have to use a speedy provider like Groq and a small model. Dumber models need a lot more context to correct their inaccuracies. I'm mostly using mid-tier models like Gemini 3 Flash to generate the boards, and then I use the fastest models to answer questions (currently gpt-oss-120b on Groq).
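
		A rough sketch of that kind of per-task selection, in case it helps. Everything here is hypothetical: the `ModelOption` fields, the `pick_model` helper, the placeholder model names and all the numbers are made up for illustration and are not measurements of any real provider.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class ModelOption:
    name: str
    latency_s: float       # time to first token, in seconds
    tokens_per_s: float    # generation throughput
    price_per_mtok: float  # price per million output tokens
    quality: int           # rough 1-5 ranking, higher is better


def pick_model(
    options: Sequence[ModelOption],
    max_latency_s: float,
    min_tokens_per_s: float,
    min_quality: int,
) -> Optional[ModelOption]:
    """Return the cheapest option that meets all three limits, or None."""
    usable = [
        o for o in options
        if o.latency_s <= max_latency_s
        and o.tokens_per_s >= min_tokens_per_s
        and o.quality >= min_quality
    ]
    return min(usable, key=lambda o: o.price_per_mtok, default=None)


# Placeholder entries with invented numbers (assumptions, not benchmarks).
OPTIONS = [
    ModelOption("free-model", 12.0, 60.0, 0.0, 3),
    ModelOption("mid-tier-model", 1.5, 150.0, 3.0, 4),
    ModelOption("fast-small-model", 0.3, 500.0, 0.6, 2),
]

# Board generation tolerates more latency but needs higher quality.
board_model = pick_model(OPTIONS, max_latency_s=5.0, min_tokens_per_s=100.0, min_quality=4)

# Answering questions is interactive, so speed dominates.
answer_model = pick_model(OPTIONS, max_latency_s=1.0, min_tokens_per_s=300.0, min_quality=2)
```

		With these invented numbers the free model is rejected for both tasks on latency and throughput alone, which is the point: price only matters among the options that are fast enough.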