syntaxing 4 hours ago

Is it worth running speculative decoding on models with this few active parameters? Or does MTP make speculative decoding unnecessary?