buildbot 4 hours ago

You can even train in 4 and 8 bits with the newer microscaling (MX) formats! Examples range from https://arxiv.org/pdf/2310.10537 to gpt-oss being trained (partially) natively in MXFP4: https://huggingface.co/blog/RakshitAralimatti/learn-ai-with-...

And on to Nemotron 3 Super, which had 25T tokens of native NVFP4 pretraining! https://docs.nvidia.com/nemotron/0.1.0/nemotron/super3/pretr...
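
For anyone wondering what "microscaling" means in practice, here is a rough sketch of MXFP4-style block quantization as I understand the OCP MX format: each block of 32 values shares one power-of-two scale, and each element is stored as FP4 (E2M1). The function names are my own and purely illustrative. Ties are broken toward the smaller magnitude for simplicity, which is an assumption of this sketch and not necessarily what real hardware or training kernels do.

```python
import math

# Magnitudes representable by FP4 E2M1 (the sign is a separate bit).
FP4_E2M1 = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
BLOCK = 32      # elements sharing one scale in the MX formats
EMAX_ELEM = 2   # exponent of the largest E2M1 value (6 = 1.5 * 2**2)


def quantize_block(values):
    """Return (shared_exponent, quantized elements) for one block."""
    amax = max(abs(v) for v in values)
    if amax == 0.0:
        return 0, [0.0] * len(values)
    # frexp gives amax = m * 2**e with 0.5 <= m < 1, so floor(log2(amax)) = e - 1.
    _, e = math.frexp(amax)
    shared_exp = (e - 1) - EMAX_ELEM
    scale = 2.0 ** shared_exp
    out = []
    for v in values:
        mag = min(abs(v) / scale, FP4_E2M1[-1])  # saturate at the largest FP4 value
        q = min(FP4_E2M1, key=lambda g: abs(g - mag))  # nearest grid point
        out.append(math.copysign(q, v))
    return shared_exp, out


def dequantize_block(shared_exp, elems):
    """Map quantized elements back to real values using the shared scale."""
    return [x * 2.0 ** shared_exp for x in elems]


def quantize(values):
    """Split a flat list into blocks of BLOCK elements and quantize each one."""
    return [quantize_block(values[i:i + BLOCK])
            for i in range(0, len(values), BLOCK)]
```

As far as I know, NVFP4 follows the same idea but with smaller blocks (16 elements) and an FP8 scale per block instead of a pure power of two, so treat the above as the MXFP4 flavor only.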