That's true: it is a 1.6T-parameter model, so it requires a great deal of memory. I've also heard there's a 2-bit quantization that works well on Apple Metal.
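
For a rough sense of scale, here's a back-of-the-envelope sketch of the memory needed for the weights alone at a few bit widths. It assumes every parameter is stored at the same width and ignores the KV cache, activations, and quantization overhead (scales, zero points), so real usage will be higher. The function name is just made up for illustration.

```python
def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes).

    Assumption: all parameters use the same bit width, with no
    per-block scale/zero-point overhead.
    """
    return num_params * bits_per_weight / 8 / 1e9


PARAMS = 1.6e12  # 1.6T parameters, as mentioned above

for bits in (16, 8, 4, 2):
    print(f"{bits:>2}-bit: ~{weight_memory_gb(PARAMS, bits):,.0f} GB")
```

So even at 2 bits per weight, the weights alone come to roughly 400 GB under these assumptions, versus about 3,200 GB at 16 bits.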