sigmoid10 3 hours ago

Thanks a lot, I was about to clone their llama.cpp branch and do the same.

Some more interesting tidbits from my go-to tests:

* Fails the car wash test (basic logic seems to be weak in general)

* Fails simple watch face generation in HTML/CSS.

* Fails the "how many Rs in raspberry" test (likely not enough cross-token training signal), but will amusingly assume you may be talking about Indian Rupees and tell you a lot about raspberry prices in India without being asked. Possibly an imbalance toward Indian training data?

* Flat out refuses to talk about Tiananmen Square when pushed directly, despite being from a US company. Again, perhaps they were exposed to some censored training data? Anyway, when you build up slowly over the conversation by asking about locations and histories, it will eventually tell you about the massacre, so the censorship bias seems weak overall. It also has no problem immediately talking about Gaza/Israel/US or other sensitive topics.

* Happily tells you how to synthesize RDX, with a list of ingredients and the chemical process step by step. At least it warns you that it is highly dangerous and legally controlled in the US.
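The letter-counting failure is a tokenizer artifact rather than a knowledge gap: models see subword tokens (something like "rasp" + "berry"), not individual characters, so the spelling information has to be inferred from training data. Outside the tokenizer the check is trivial, as a minimal sketch shows:

```python
# Character-level counting, which the model cannot do directly
# because it operates on subword tokens, not letters.
word = "raspberry"
count = word.count("r")
print(f'"{word}" contains {count} occurrences of "r"')  # 3
```

This is also why the same models often pass when asked to spell the word out letter by letter first: that forces the characters into the token stream.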

yorwba 3 hours ago | parent | next [-]

The 1-bit Bonsai and Ternary Bonsai models are all based on the corresponding Qwen3 model: https://raw.githubusercontent.com/PrismML-Eng/Bonsai-demo/re... (page 4)

sigmoid10 2 hours ago | parent [-]

Thanks, I already suspected as much. That also gives context to the other comment here saying it is basically equivalent in accuracy to Qwen3.5-4B. It essentially seems to be a very good quantization of an existing model, not a new BitNet.

yorwba 2 hours ago | parent [-]

It's a good-per-byte-but-not-in-absolute-terms quantization of Qwen3-8B that's comparable in accuracy to Qwen3.5-4B at 4-bit quantization. (That makes the 4B model the larger one in terms of storage, though its lower parameter count and hybrid attention give it a speed advantage if you're not bottlenecked on memory bandwidth for the model weights.)
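A back-of-envelope calculation makes the storage comparison concrete. Assuming ideal packing at log2(3) ≈ 1.58 bits per ternary weight and ignoring embeddings, metadata, and real-world packing overhead (so these are lower bounds, not actual file sizes):

```python
import math

def weight_storage_gb(params_billions: float, bits_per_weight: float) -> float:
    """Idealized weight storage in GB, ignoring embeddings and metadata."""
    return params_billions * 1e9 * bits_per_weight / 8 / 1e9

# Ternary 8B model: information-theoretic minimum of log2(3) bits/weight.
ternary_8b = weight_storage_gb(8, math.log2(3))
# 4-bit quantized 4B model.
int4_4b = weight_storage_gb(4, 4.0)

print(f"ternary 8B: {ternary_8b:.2f} GB")  # ~1.58 GB
print(f"4-bit   4B: {int4_4b:.2f} GB")     # 2.00 GB
```

So the 8B ternary model can indeed come in under the 4-bit 4B model on disk, even with twice the parameters, while the 4B model does half the compute per token.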
