Given the RAM limitations of the first gen Ryzen AI MAX, you have no choice but to go heavy on the quantization of the larger LLMs on that hardware.