| ▲ | mchiang 7 days ago |
| OpenAI has only provided MXFP4 weights. These are the same weights used by other cloud providers. |
|
| ▲ | irthomasthomas 7 days ago | parent [-] |
| Oh, I didn't know that. Weird! |
| ▲ | reissbaker 7 days ago | parent [-] |
It was natively trained in FP4, probably both to reduce VRAM usage at inference time (it fits on a single H100) and to allow better utilization of B200s (which are especially fast at FP4).
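As a back-of-envelope check on the "fits on a single H100" claim (the thread doesn't name the model, so the ~120B parameter count below is an assumption for illustration):

```python
# Rough VRAM footprint of weights alone, ignoring KV cache and activations.
# params = 120e9 is an assumed model size, not stated in the thread.
params = 120e9
fp4_gb = params * 4 / 8 / 1e9    # 4 bits -> 0.5 bytes per weight
fp16_gb = params * 16 / 8 / 1e9  # same weights at 16 bits
h100_gb = 80                     # H100 SXM memory capacity

print(fp4_gb, fp16_gb, fp4_gb <= h100_gb)  # 60.0 240.0 True
```

At FP4 the weights alone take ~60 GB, under one 80 GB H100; at FP16 they would need multiple GPUs.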
| ▲ | irthomasthomas 7 days ago | parent [-] |
Interesting, thanks. I didn't know you could even train at FP4 on H100s.
| ▲ | reissbaker 5 days ago | parent [-] |
It's impressive they got it to work; the lowest I'd heard of thus far was native FP8 training.
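For reference, MXFP4 (from the OCP Microscaling spec) stores weights in 32-element blocks: each block shares one power-of-two scale, and each element is a 4-bit E2M1 value. A minimal pure-Python sketch of that round-trip (the function names are illustrative, and round-to-nearest here is a simplification of real quantizer policies):

```python
import math

# The eight non-negative magnitudes representable in FP4 E2M1
# (sign is a separate bit).
FP4_E2M1 = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]

def quantize_mx_block(block):
    """Quantize one 32-element block to MXFP4: a shared power-of-two
    scale exponent plus one (sign, E2M1 index) code per element."""
    assert len(block) == 32  # MX spec uses 32-element blocks
    amax = max(abs(x) for x in block)
    if amax == 0.0:
        return 0, [(1, 0)] * 32
    # Smallest power-of-two scale so amax/scale <= 6 (the E2M1 max).
    exp = math.ceil(math.log2(amax / 6.0))
    scale = 2.0 ** exp
    codes = []
    for x in block:
        sign = 1 if x >= 0 else -1
        mag = abs(x) / scale
        # Round to the nearest representable E2M1 magnitude.
        idx = min(range(len(FP4_E2M1)), key=lambda i: abs(FP4_E2M1[i] - mag))
        codes.append((sign, idx))
    return exp, codes

def dequantize_mx_block(exp, codes):
    """Reconstruct the block: element value times the shared scale."""
    scale = 2.0 ** exp
    return [s * FP4_E2M1[i] * scale for s, i in codes]
```

Values that land exactly on a scaled E2M1 grid point survive the round-trip unchanged; everything else snaps to the nearest grid point, which is the precision loss FP4 training has to tolerate.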
|
|
|