A 32 GB card does run it nicely. I use unsloth's UD-Q5_K_XL at 256k context (K/V cache at q8_0) and get ~67 t/s on a 5090. I still need to look into MTP.
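For anyone wondering why the K/V cache type matters at that context length, here is a rough back-of-envelope sketch. The model shape below (layer count, KV heads, head dimension) is a made-up placeholder, not the actual model's config, and it assumes every layer is a standard attention layer with its own cache. The only hard numbers are the element sizes: f16 is 2 bytes, and q8_0 stores 32 int8 values plus one fp16 scale per block, so 34 bytes per 32 elements.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, n_ctx, bytes_per_elem):
    """Approximate K/V cache size: K and V each hold
    n_kv_heads * head_dim values per token, per layer."""
    return 2 * n_layers * n_kv_heads * head_dim * n_ctx * bytes_per_elem

F16 = 2.0        # bytes per element
Q8_0 = 34 / 32   # 32 int8 values + one fp16 scale per block

# Hypothetical model shape, for illustration only (not from the comment above).
N_LAYERS, N_KV_HEADS, HEAD_DIM = 48, 4, 128
N_CTX = 256 * 1024  # "256k" context

for name, size in (("f16", F16), ("q8_0", Q8_0)):
    total = kv_cache_bytes(N_LAYERS, N_KV_HEADS, HEAD_DIM, N_CTX, size)
    print(f"{name}: {total / 2**30:.2f} GiB")
```

Whatever the real shape is, q8_0 takes a bit over half the memory of f16 for the cache, which is what leaves room for the weights plus a long context on a 32 GB card.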