I've read the SemiAnalysis article, and while it's excellent as usual, I don't see anything in there which says nobody RMAs defective GPUs. Perhaps there's something behind the paywall that I missed?
I'm not at a hyperscaler, but I've been involved with deployments of A100 and H100 GPUs, and we RMA GPUs which don't work. I don't think it impacted our allocations, which have always seemed fine to me, but obviously it's hard to know for sure, and perhaps GB200 is different.
You are right that in theory NVDA can sell everything they produce without the hyperscalers, but strategically there are many risks in acting that way towards deep-pocketed clients. They'd have to go to another customer who is likely to be less reliable. They'd put themselves on very shaky ground legally. They'd create a much stronger incentive for a deep-pocketed client to become a competitor (cf. Trainium, TPUs). I'd be surprised if they'd take such risks to avoid what's ultimately a small cost.