ilaksh 5 hours ago

Any chance they will have this for Qwen 27B also?

kamranjon 5 hours ago

The paper actually references testing their DSpark speculative decoding strategy with Qwen 3 4B, 8B, and 14B models. So while I doubt they will release builds themselves, they’ve open-sourced their training pipeline (DeepSpec) for this, so we will likely see folks adopting it for other models.