johndough 7 hours ago

The bulk of Kimi-K2.6's parameters are stored with 4 bits per weight, not 16 or 32. There are a few parameters that are stored with higher precision, but they make up only a fraction of the total parameters.
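A quick back-of-the-envelope sketch of why the bits-per-weight figure matters for storage; the 1-trillion-parameter count here is a hypothetical round number for illustration, not a confirmed figure for Kimi-K2.6:

```python
def model_size_gb(num_params, bits_per_weight):
    """Total weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9

params = 1e12  # hypothetical parameter count for illustration

print(model_size_gb(params, 16))  # bf16: 2000.0 GB
print(model_size_gb(params, 4))   # int4:  500.0 GB
```

At 4 bits per weight, storage is a quarter of the bf16 footprint, which is why a small fraction of higher-precision parameters barely moves the total.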

gpm 6 hours ago | parent [-]

Huh, cool. I guess that makes a lot of sense with all the success the quantization people have been having.

So am I misunderstanding "Tensor type F32 · I32 · BF16" or is it just tagged wrong?

rockinghigh 4 hours ago | parent | next [-]

The MoE expert weights are quantized to int4; other weights, such as the shared expert weights, are excluded from quantization and stored in bf16.

liuliu 4 hours ago | parent | prev [-]

The I32 tensors are eight 4-bit values packed into one int32.
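A minimal sketch of that packing scheme: eight 4-bit values stored in one 32-bit integer. The bit ordering here (lowest nibble first) is an assumption for illustration; the actual on-disk layout may differ.

```python
def pack_int4(values):
    """Pack eight 4-bit values (0..15) into a single 32-bit integer.
    Nibble order (lowest bits first) is assumed, not taken from the format spec."""
    assert len(values) == 8
    packed = 0
    for i, v in enumerate(values):
        assert 0 <= v <= 0xF, "each value must fit in 4 bits"
        packed |= (v & 0xF) << (4 * i)
    return packed

def unpack_int4(packed):
    """Recover the eight 4-bit values from a packed 32-bit integer."""
    return [(packed >> (4 * i)) & 0xF for i in range(8)]

vals = [3, 7, 0, 15, 8, 1, 12, 5]
packed = pack_int4(vals)
assert unpack_int4(packed) == vals  # round-trip is lossless
```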