girvo 7 hours ago

It's being explored right now for speculative decoding in the local-LLM space, which I think is quite an interesting use case.

https://www.emergentmind.com/topics/dflash-block-diffusion-f...

roger_ 5 hours ago

DFlash immediately came to my mind.

There are several Mac implementations of it that already show Qwen3.5 running more than 2x faster.