somewhatrandom9 3 hours ago

Could these quantized models make MTP (Multi-Token Prediction) significantly faster when used as drafters for larger regular Gemma 4 models?

dist-epoch 2 hours ago

Google already released specialized drafters for Gemma 4.