PunchyHamster 4 hours ago

Good to know, thanks. I just ran Ollama with qwen3.5:27b. Currently it's stuck on picking a format:

    Let's write.
    Wait, I'll write the response.
    Wait, I'll check if I should use a table.
    No, text is fine.
    Okay.
    Let's write.
    Wait, I'll write the response.
    Wait, I'll check if I should use a bullet list.
    No, just lines.
    Okay.
    Let's write.
    Wait, I'll write the response.
    Wait, I'll check if I should use a numbered list.
    No, lines are fine.
    Okay.
    Let's write.
    Wait, I'll write the response.
    Wait, I'll check if I should use a code block.
    Yes.
    Okay.
    Let's write.
    Wait, I'll write the response.
    Wait, I'll check if I should use a pre block.
    Code block is better.
... (and so on for the next 100 lines)
lachiflippi 4 hours ago | parent | next [-]

Yeah, it tends to get stuck in loops like that a lot with everything set to default. I wonder if they distilled Gemini at some point; I've seen Gemini get stuck in a similar "I will now do [thing]. I am preparing to do [thing]. I will do it." failure mode a couple of times as well.

xmddmx 3 hours ago | parent | prev | next [-]

See my other note [1] about bugs in Ollama with Qwen3.5.

I just tried this (Ollama macOS 0.17.4, qwen3.5:35b-a3b-q4_K_M) on an M4 Pro, and it did fine:

[Thought for 50.0 seconds]

1. potato 2. potato [...] 100. potato

In other words, it did great.

Though I do think 50 seconds of thinking beforehand was perhaps excessive.

[1] https://news.ycombinator.com/item?id=47202082

CamperBob2 an hour ago | parent | prev [-]

What quant? I just ran the prompt "Repeat the word 'potato' 100 times, numbered" and it worked fine, taking 44 seconds at 24 tokens/second. Command line:

    llama-server ^
      --model Qwen3.5-27B-BF16-00001-of-00002.gguf ^
      --mmproj mmproj-BF16.gguf ^
      --fit on ^
      --host 127.0.0.1 ^
      --port 2080 ^
      --temp 0.8 ^
      --top-p 0.95 ^
      --top-k 20 ^
      --min-p 0.00 ^
      --presence_penalty 1.5 ^
      --repeat_penalty 1.1 ^
      --no-mmap ^
      --no-warmup
The repeat and/or presence penalties seem to be somewhat sensitive with this model, so that might have caused the looping you saw.
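For anyone curious why those two flags interact with looping: here's a minimal Python sketch of how llama.cpp-style repeat and presence penalties adjust logits before sampling. All names and numbers are illustrative, not taken from the actual implementation, but the mechanism is the same: every token already in the recent context gets its logit scaled down by the repeat penalty and then reduced by a flat presence penalty, so after enough repetitions another token can overtake it.

```python
import math

def apply_penalties(logits, recent_tokens, repeat_penalty=1.1, presence_penalty=1.5):
    """Penalize tokens that appear in the recent context (illustrative sketch).

    repeat_penalty > 1.0 divides positive logits (and multiplies negative
    ones), pushing repeated tokens down; presence_penalty is a flat
    subtraction applied once per distinct token already present.
    """
    out = dict(logits)
    for tok in set(recent_tokens):
        if tok not in out:
            continue
        if out[tok] > 0:
            out[tok] /= repeat_penalty
        else:
            out[tok] *= repeat_penalty
        out[tok] -= presence_penalty
    return out

def softmax(logits):
    """Convert logits to a probability distribution."""
    m = max(logits.values())
    exps = {t: math.exp(v - m) for t, v in logits.items()}
    z = sum(exps.values())
    return {t: e / z for t, e in exps.items()}

# Toy vocabulary: "potato" has the highest raw logit, so an unpenalized
# model keeps emitting it. After 100 repeats, penalties of 1.1 / 1.5
# drop its adjusted logit (4.0 / 1.1 - 1.5 ≈ 2.14) below "." (3.2),
# so the model stops repeating -- which is why a "repeat X 100 times"
# prompt is a stress test for these settings.
logits = {"potato": 4.0, ".": 3.2, "\n": 2.5}
context = ["potato"] * 100
probs = softmax(apply_penalties(logits, context, 1.1, 1.5))
```

With both penalties at their neutral values (1.0 and 0.0) the repeated token stays on top; crank either one up and the model gets actively pushed off the word it was asked to repeat, which can show up as exactly this kind of stalling or looping.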