▲ AugSun | 12 hours ago
"We can run your dumbed down models faster": The use of NVFP4 yields a 3.5x reduction in model memory footprint relative to FP16 and a 1.8x reduction relative to FP8, while maintaining accuracy with less than 1% degradation on key language-modeling tasks for some models.
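Those ratios follow from the format's bit budget. A rough sketch of the arithmetic, assuming the NVFP4 layout NVIDIA describes (micro-blocks of 16 FP4 values sharing one FP8 scale; the small per-tensor FP32 scale is ignored here):

```python
# Back-of-envelope check on the quoted compression ratios.
# Assumptions (not from the comment itself): 16-value micro-blocks,
# 4-bit values, one 8-bit (E4M3) scale per block.
BLOCK_SIZE = 16
FP4_BITS = 4
SCALE_BITS = 8

# Effective storage cost per weight, amortizing the shared scale.
effective_bits = FP4_BITS + SCALE_BITS / BLOCK_SIZE  # 4.5 bits/weight

fp16_ratio = 16 / effective_bits  # vs 16-bit weights
fp8_ratio = 8 / effective_bits    # vs 8-bit weights

print(f"vs FP16: {fp16_ratio:.2f}x, vs FP8: {fp8_ratio:.2f}x")
# → vs FP16: 3.56x, vs FP8: 1.78x  (matching the quoted ~3.5x and ~1.8x)
```

So the "3.5x" is slightly less than a naive 16/4 = 4x because each block carries scale metadata on top of the 4-bit values.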