Remix clone Hacker News


	▲	wg0 6 hours ago
		It hallucinates a lot more than Sonnet or even MiniMax M2.5. Especially in tool calls, it would end up duplicating the content in code files, then realise it later and get stuck in a loop.
	▲	justinclift 34 minutes ago
		> It hallucinates a lot more than Sonnet or even MiniMax M2.5.
		Ugh, that's not good. I evaluated Kimi K2 a while back for some text understanding -> summarisation tasks, and of the 100 tasks it hallucinated about 30% of the output. :( :( :(
	▲	noelsusman 2 hours ago
		My initial experiments are not encouraging. I have a basic planning prompt that includes instructions not to edit any files or implement anything. Qwen-3.6-Plus will consistently ignore that completely and proceed with implementation. I expect that kind of behavior from small models I run locally, not a hosted closed model claiming to compete with the frontier models.