dofm 10 days ago
Could you outline how you are running the MTP drafters? I've tried LM Studio but no dice there. I'm probably missing something but I think llama.cpp and Ollama can't do it yet either?
thot_experiment 9 days ago
I just build llama.cpp from source on the PR that adds MTP drafters: https://github.com/ggml-org/llama.cpp/pull/23398. Please don't use Ollama, it's a bad actor in the OSS community.
Patrick_Devine 10 days ago
I haven't yet pushed the MTP-enabled gemma4 12b model for Ollama because in my testing I wasn't getting a performance bump. The other gemma4 MTP models should work OK right now, but there are some fixes we're just about to push. This is specifically for the MLX backend.
ch_sm 10 days ago
Can't speak to compatibility with this new model, but oMLX supports MTP drafters very well.