drclegg 4 days ago

Distributed compute is cool, but $320 for 13 tokens/s on a tiny input prompt, with 4-bit quantization and a model with only 3B active parameters, is very underwhelming.