So from CDNA3 to CDNA4 they doubled fp16 and fp8 performance but cut fp32 and fp64 in half?
Wonder why the regression on non-AI workloads?
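cuz area and power. The die space and power budget that went into the bigger fp16/fp8 units had to come from somewhere.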