vessenes 17 hours ago
Thanks Jeff -- can you point me to something written up about rANS? All I find online is turbulence modeling solutions; I presume this is not what you're referring to. As we know, quantization is a critical tool for local LLM runners; RAM is typically the gating factor. Are you aware of other, better lossless compression schemes for BF16 weights out there? The reason I ask is that Dfloat11 seems relatively easy to plug into existing quantization workflows, but you seem dismissive of the paper -- I presume it's a gap in my understanding, and I'd like to understand.
zorgmonkey 17 hours ago
I don't know of any great write-ups, unfortunately, but the rANS you're looking for is range asymmetric numeral systems: the range variant of ANS, which is an entropy coder and has nothing to do with turbulence modeling.
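
In lieu of a write-up, here is a rough sketch of the core rANS state update. It is a toy: it leans on Python's arbitrary-precision integers and skips the renormalization step that real coders use to keep the state in a machine word. The function names (`build_cumulative`, `rans_encode`, `rans_decode`) and the sample frequency table are made up for illustration and are not taken from any particular library or from the Dfloat11 paper.

```python
def build_cumulative(freqs):
    """Map each symbol to the start of its slot range; return (starts, total)."""
    cum, total = {}, 0
    for sym in sorted(freqs):
        cum[sym] = total
        total += freqs[sym]
    return cum, total


def rans_encode(message, freqs):
    """Fold a whole message into one integer state (toy version, no renormalization)."""
    cum, total = build_cumulative(freqs)
    state = 1
    # Encode in reverse so the decoder emits symbols in forward order.
    for sym in reversed(message):
        f = freqs[sym]
        state = (state // f) * total + cum[sym] + (state % f)
    return state


def rans_decode(state, freqs, count):
    """Recover `count` symbols from the integer state."""
    cum, total = build_cumulative(freqs)
    out = []
    for _ in range(count):
        slot = state % total
        # Find the symbol whose slot range contains `slot`.
        sym = next(s for s in cum if cum[s] <= slot < cum[s] + freqs[s])
        state = freqs[sym] * (state // total) + slot - cum[sym]
        out.append(sym)
    return out


# Hypothetical skewed distribution: frequent symbols cost fewer bits.
message = list("aaaaaaabbc")
freqs = {"a": 7, "b": 2, "c": 1}
state = rans_encode(message, freqs)
decoded = rans_decode(state, freqs, len(message))
```

The point is that each symbol grows the state by roughly `total / freqs[sym]`, so common symbols add few bits and rare ones add more. That is why it suits data with a skewed distribution.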