these 6 hours ago

Has anyone managed to get this to work in LM Studio? They've got an option in the UI, but it never lets me enable it.

dvt 6 hours ago | parent | next [-]

It's not implemented in mlx[1] yet (or llama.cpp[2]), so it may take a while.

[1] https://github.com/ml-explore/mlx-lm/pull/990

[2] https://github.com/ggml-org/llama.cpp/pull/22673

AlphaSite 6 hours ago | parent | prev | next [-]

Yes. Make sure you're not using the Gemma sparse models, since they don't have a small model to use. Also, I removed all the image models from the workspace.

adrian_b 3 hours ago | parent [-]

I do not know what you mean by sparse models.

All 4 gemma-4-*-it models, regardless of whether they are dense models or MoE models, have associated small models for MTP, whose names are obtained by adding the "-assistant" suffix.

https://huggingface.co/google/gemma-4-E2B-it-assistant

https://huggingface.co/google/gemma-4-E4B-it-assistant

https://huggingface.co/google/gemma-4-26B-A4B-it-assistant

https://huggingface.co/google/gemma-4-31B-it-assistant

Havoc 6 hours ago | parent | prev | next [-]

Normally, when LM Studio doesn't like it, it's because of the presence of mmproj files in the model folder. Sometimes removing them helps the option show up.

They're somehow connected to vision support and block speculative decoding... don't ask me how or why, though.
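If you want to check whether that's what's blocking you, a quick sketch for listing any mmproj files next to your models (the `~/.lmstudio/models` path is an assumption — adjust it to wherever your LM Studio install keeps models):

```shell
# mmproj files are vision projector weights that sit alongside the main GGUF;
# per the comment above, their presence can block speculative decoding.
# List them so you can move them out of the folder.
# NOTE: ~/.lmstudio/models is an assumed default location -- adjust as needed.
MODELS_DIR="${MODELS_DIR:-$HOME/.lmstudio/models}"
find "$MODELS_DIR" -name '*mmproj*' 2>/dev/null
```

Move (rather than delete) anything it finds, so you can restore vision support later.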

For Gemma specifically, I've had more luck with speculative decoding via the llama-server route than with LM Studio.
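For reference, the llama-server route looks roughly like this: you pass a draft model alongside the main one. The GGUF filenames below are placeholders, and flag availability varies by llama.cpp version, so check `llama-server --help` on your build:

```shell
# Serve a large model with a small draft model for speculative decoding.
# Filenames are placeholders -- substitute your own GGUF files.
llama-server \
  --model main-model-Q4_K_M.gguf \
  --model-draft draft-model-Q4_K_M.gguf \
  --draft-max 16 \
  --port 8080
```

The server then exposes an OpenAI-compatible API on the given port, with drafting handled transparently.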

svachalek 6 hours ago | parent | prev [-]

I've gotten it to work with other models. The main and draft models usually have to be perfectly aligned in terms of provider, quantization, etc. It might be a while before you can get a matched set.