Isn't Blackwell optimized for FP4? This blog post runs Deepseek at fp8, which is probably the sweet spot but new models with fp4 native training and inference would be drastically faster than fp8 on blackwell.