For pure, weird, late-night LLM chats, I've recently started using Qwen3.6-35B-A3B-Uncensored, run directly with llama-cli, and it is a very refreshing chat experience.

An uncensored model means it will not refuse any request (at least I have yet to come across a refusal). If you grew up in the 90s, it sort of feels like coming across The Anarchist Cookbook for the first time (though with more accurate content). Using llama-cli means the session is entirely local and entirely ephemeral. As a bonus, all the reasoning steps are fully visible to the user.

The base Qwen3.6-35B-A3B is more than adequate for "weird late night brainstorming chats". I've really come to dislike the natural tendency to self-censor when the model is willing to refuse (and potentially report) any request it feels is "inappropriate" and all these private thoughts are stored on someone else's server.

Xeoncross an hour ago

Even for work questions about sensitive IP or code, Qwen3.6-35B-A3B is a great option on macOS (35 t/s) when you don't want info leaving your laptop. I'm using it with oMLX.

	vorticalbox 6 minutes ago
		I switched from LM Studio to oMLX today. It's really nice, but I have found that Qwen3.6 sometimes fails to call tools correctly.

bastawhiz an hour ago

I run an abliterated Gemma 4 32B (int8) and it's remarkably good. It's been a real step up from Qwen in my experience.