Remix clone Hacker News


	dot_treo an hour ago
		Just getting it into a GGUF file would be fairly trivial. But actually using that GGUF file would need a bunch of additional things: someone would need to create a new architecture derived from Qwen3, and then probably adapt the speculative decoding functionality. At the moment not even MTP (multi-token prediction) is merged into llama.cpp, so I wouldn't hold my breath for it.
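
To illustrate why writing the file is the easy part, here is a rough, stdlib-only sketch of the start of a GGUF file: the magic bytes, a version, the tensor and metadata counts, and a single `general.architecture` string entry. The architecture name `qwen3-derived-hypothetical`, the `KNOWN_ARCHS` set, and the `runtime_supports` helper are made up for illustration and are not llama.cpp's actual loader logic; the sketch also assumes the architecture key is the first metadata entry and writes no tensors.

```python
import struct

GGUF_MAGIC = b"GGUF"
GGUF_VERSION = 3          # current GGUF container version
GGUF_TYPE_STRING = 8      # metadata value type id for strings

# Illustrative only: stands in for "architectures the runtime has code for".
KNOWN_ARCHS = {"qwen3"}


def gguf_string(s: str) -> bytes:
    """Encode a GGUF string: uint64 length followed by UTF-8 bytes."""
    data = s.encode("utf-8")
    return struct.pack("<Q", len(data)) + data


def build_header(arch: str) -> bytes:
    """Build a header with zero tensors and one metadata key/value pair."""
    header = GGUF_MAGIC + struct.pack("<IQQ", GGUF_VERSION, 0, 1)
    kv = (
        gguf_string("general.architecture")
        + struct.pack("<I", GGUF_TYPE_STRING)
        + gguf_string(arch)
    )
    return header + kv


def read_string(blob: bytes, offset: int):
    """Read a GGUF string at offset; return (text, next_offset)."""
    (length,) = struct.unpack_from("<Q", blob, offset)
    start = offset + 8
    return blob[start:start + length].decode("utf-8"), start + length


def read_architecture(blob: bytes) -> str:
    """Return general.architecture, assuming it is the first metadata entry."""
    if blob[:4] != GGUF_MAGIC:
        raise ValueError("not a GGUF file")
    _version, _n_tensors, n_kv = struct.unpack_from("<IQQ", blob, 4)
    if n_kv < 1:
        raise ValueError("no metadata entries")
    key, offset = read_string(blob, 24)   # 4 magic + 4 version + 8 + 8 counts
    (value_type,) = struct.unpack_from("<I", blob, offset)
    if key != "general.architecture" or value_type != GGUF_TYPE_STRING:
        raise ValueError("unexpected first metadata entry")
    arch, _ = read_string(blob, offset + 4)
    return arch


def runtime_supports(blob: bytes) -> bool:
    """Hypothetical check: can the runtime build a graph for this model?"""
    return read_architecture(blob) in KNOWN_ARCHS


blob = build_header("qwen3-derived-hypothetical")
print(read_architecture(blob), runtime_supports(blob))
```

The file itself round-trips fine; the missing piece is that the runtime has no compute graph registered under the new architecture name, which is the part that needs real implementation work.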