cubefox 2 hours ago

I assume that, in theory, 1-bit models could be the most efficient, since modern models have already gone from 32 bits to 16 bits to 8 bits per parameter (without quantization).
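
For a rough sense of scale, here is a minimal sketch of the arithmetic. It assumes a hypothetical 7-billion-parameter model and counts only weight storage, ignoring activations, caches, and any effect on speed or accuracy. The function name and the parameter count are made up for illustration.

```python
def weight_memory_gib(num_params: int, bits_per_param: int) -> float:
    """Memory needed to store the weights alone, in GiB."""
    total_bytes = num_params * bits_per_param / 8
    return total_bytes / 2**30


# Hypothetical model size, chosen only for illustration.
NUM_PARAMS = 7_000_000_000

for bits in (32, 16, 8, 1):
    size = weight_memory_gib(NUM_PARAMS, bits)
    print(f"{bits:>2} bits per parameter: {size:.2f} GiB")
```

Storage scales linearly with bits per parameter, so going from 8 bits to 1 bit would cut the weight footprint by another factor of 8, if quality holds up.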