chrischavez a day ago
Went through the official blog and the developers post, no mention of TurboQuant anywhere. Google's own research team tested it on Gemma models for KV-cache compression to 3 bits, so it's surprising it's not mentioned in this release. Anyone know if it's baked in already or if we'd need to apply it ourselves? Would love to run the 26B MoE locally as a daily driver.