girvo 7 hours ago
It's being explored right now for speculative decoding in the local-LLM space, which I think is quite an interesting use case: https://www.emergentmind.com/topics/dflash-block-diffusion-f...
roger_ 5 hours ago
DFlash immediately came to mind. There are already several Mac implementations of it that show a more than 2x speedup on Qwen3.5.