Memory is expensive? If reads are as rare as they claim, you can just stash the KV-cache on spinning disk.
Aren’t those reads latency-sensitive, though?