behnamoh 5 days ago

> OpenAI made a huge mistake neglecting fast inferencing models.

It's a lost battle. It'll always be cheaper to use an open source model hosted by others like together/fireworks/deepinfra/etc.

I've been maining Mistral lately for low-latency stuff, and the price-to-quality ratio is hard to beat.

mips_avatar 5 days ago

I'll try benchmarking Mistral against my eval. I've been impressed by Kimi's performance, but it's too slow to do anything useful in realtime.
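For a latency-focused eval like the one described above, the core measurement is simple to sketch. Here is a minimal Python timing harness; the model call is a stand-in stub, not any provider's real API (in practice you would swap in a call to an OpenAI-compatible client pointed at a hosted endpoint):

```python
import statistics
import time

def time_call(fn, *args, n=5):
    """Call fn(*args) n times; return (median, worst) latency in seconds."""
    samples = []
    for _ in range(n):
        t0 = time.perf_counter()
        fn(*args)
        samples.append(time.perf_counter() - t0)
    return statistics.median(samples), max(samples)

# Stand-in for a real model call. In a real eval this would be something
# like client.chat.completions.create(...) against a hosted Mistral or
# Kimi endpoint (hypothetical names, for illustration only).
def fake_model_call(prompt):
    time.sleep(0.01)  # simulates network + inference latency
    return "response"

median_s, worst_s = time_call(fake_model_call, "hello", n=5)
print(f"median={median_s * 1000:.1f}ms worst={worst_s * 1000:.1f}ms")
```

Comparing median and worst-case latency across providers on the same prompts is usually enough to decide whether a model is fast enough for realtime use.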