| ▲ | nickcw 3 hours ago |
| > bitnet.cpp is the official inference framework for 1-bit LLMs (e.g., BitNet b1.58). It offers a suite of optimized kernels that support fast and lossless inference of 1.58-bit models on CPU and GPU (NPU support will be coming next). One bit or one trit? I am confused! |
|
| ▲ | drsopp 3 hours ago | parent | next [-] |
"1-bit LLMs" is just marketing. The Shannon entropy of one symbol drawn uniformly from a 3-symbol alphabet (-1, 0, 1) is log2(3) ≈ 1.58 bits.
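(A quick check of the figure above, assuming all three symbols are equiprobable, which is when entropy is maximized:)

```python
import math

# Entropy of a uniform 3-symbol alphabet (-1, 0, 1): H = log2(3).
# This is the information content per parameter of a ternary model.
bits_per_trit = math.log2(3)
print(bits_per_trit)  # ~1.5849625
```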
| |
| ▲ | Dwedit 3 hours ago | parent [-] |
Log base 2 of 3 ≈ 1.5849625, so that's the limit on how densely you can pack three-state values into bits. For something more practical, you can pack five three-state values into a byte, because 3^5 = 243, which is less than 256. To unpack, you divide and modulo by 3 five separate times. This encodes the data at 1.6 bits per symbol. But the packing of 5 symbols into a byte was not done here. Instead, they packed 4 symbols into a byte (2 bits each) to reduce computational complexity: no unpacking needed.
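(A minimal sketch of both schemes described above — this is illustrative Python, not the actual bitnet.cpp kernel code. `pack5`/`unpack5` are the base-3 divide-and-modulo packing at 1.6 bits per trit; `pack4`/`unpack4` are the simpler 2-bits-per-trit packing the project actually uses:)

```python
def pack5(trits):
    """Pack five trits (-1/0/1) into one byte, base-3: 3**5 = 243 <= 256."""
    assert len(trits) == 5
    b = 0
    for t in reversed(trits):
        b = b * 3 + (t + 1)  # map -1/0/1 -> base-3 digit 0/1/2
    return b

def unpack5(b):
    """Recover five trits by dividing and modulo-ing by 3 five times."""
    out = []
    for _ in range(5):
        b, r = divmod(b, 3)
        out.append(r - 1)
    return out

def pack4(trits):
    """Pack four trits into one byte at 2 bits each: no division to unpack."""
    assert len(trits) == 4
    b = 0
    for i, t in enumerate(trits):
        b |= (t + 1) << (2 * i)
    return b

def unpack4(b):
    """Recover four trits with shifts and masks only."""
    return [((b >> (2 * i)) & 3) - 1 for i in range(4)]
```

Both round-trip losslessly; the 2-bit scheme trades 0.4 bits per symbol of density for unpacking that is just a shift and a mask.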
| ▲ | rasz 2 hours ago | parent [-] |
> 1-bit model
> packed 4 symbols into a byte
microslop, typical bunch of two-bit frauds!
|
| ▲ | cubefox 3 hours ago | parent | prev [-] |
Yeah, "1.58 bit" means one trit with three states, since log2(3) ≈ 1.58. So it's not an inference framework for 1-bit models (two states per parameter) but for 1.58-bit models (three states per parameter). Annoying that they try to conflate the two.
| |
| ▲ | silon42 2 hours ago | parent [-] |
I always hope for "just a bunch of if statements" ... this is not it.
|