ACCount37 3 hours ago

This "~512 batching" makes me think of things like diffusion or prefill.

If they managed to put together some dirty hack that lets them generate about 512 tokens' worth of reasoning in parallel instead of in sequence, that would explain it.