Isn't what you described DirectStorage?
You're still running through the PCIe slot and its bandwidth limit. I'm suggesting you bypass even that and put more memory directly on the card.
So an additional layer slower and larger than global GPU memory?
I believe that's kind of what Bolt Graphics is doing with the DIMM slots next to the soldered-on LPDDR5. https://bolt.graphics/how-it-works/