	▲	Ladioss 7 hours ago
		More like `ollama launch claude --model qwen3.6:latest`. Also, you need to check your context size: Ollama defaults to 4K if you have <24 GB of VRAM, and you need 64K minimum if you want Claude to be able to at least lift a finger.
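For anyone who wants to see what raising the context size looks like in practice, here is a minimal sketch, assuming a local Ollama server on its default port and using the `num_ctx` option of the `/api/generate` endpoint. The helper names `build_request` and `generate` are made up for illustration, and 65536 is just the 64K figure from the comment above. Whether `ollama launch claude` picks up a per-request setting like this is not something this sketch covers.

```python
import json
import urllib.request

# Default local Ollama endpoint (assumption: server runs on this machine, default port).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model: str, prompt: str, num_ctx: int = 65536) -> dict:
    """Build a generate payload that asks for a 64K context window by default."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,                  # return one JSON object instead of a stream
        "options": {"num_ctx": num_ctx},  # per-request context window override
    }


def generate(model: str, prompt: str, num_ctx: int = 65536) -> str:
    """Send the payload to the local server and return the generated text."""
    body = json.dumps(build_request(model, prompt, num_ctx)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL,
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Calling `generate("qwen3.6:latest", "hello")` would then run the prompt with a 64K window, provided the model and your VRAM can actually hold it. If you would rather set it once for the whole server, Ollama's FAQ describes an `OLLAMA_CONTEXT_LENGTH` environment variable for that.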
	▲	Patrick_Devine 5 hours ago
		If you're on a Mac, use the MLX backend versions, which are considerably faster than the GGML-based versions (including llama.cpp), and you don't need to fiddle with the context size. The models are `qwen3.6:35b-a3b-nvfp4`, `qwen3.6:35b-a3b-mxfp8`, and `qwen3.6:35b-a3b-mlx-bf16`.
	▲	txtsd 4 hours ago
		I only have 16 GB of VRAM, and my system uses ~4 GB of that. What are my options? I got this one: `Qwen3.6-35B-A3B-UD-IQ2_XXS.gguf`