adgjlsfhk1 4 days ago

The AMD has more FP16 and FP64 flops (but ~1/2 the FP32 flops). Also, the AMD card runs at half the TDP (300 W vs 600 W).
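
As a rough sanity check of the flops-per-watt angle, here is a minimal sketch; the spec numbers are assumptions taken from public datasheets (W7900 vs RTX 5090, vector throughput and board TDP), not figures from the thread:

    # Assumed datasheet specs, not measurements:
    # W7900 ~61.3 FP32 / ~122.6 FP16 TFLOPS at 295 W,
    # RTX 5090 ~104.8 FP32 / ~104.8 FP16 (non-tensor) TFLOPS at 575 W.
    specs = {
        "W7900":    {"fp32": 61.3,  "fp16": 122.6, "tdp_w": 295},
        "RTX 5090": {"fp32": 104.8, "fp16": 104.8, "tdp_w": 575},
    }
    for name, s in specs.items():
        print(f"{name}: {s['fp16'] / s['tdp_w']:.3f} FP16 TFLOPS/W, "
              f"{s['fp32'] / s['tdp_w']:.3f} FP32 TFLOPS/W")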

grim_io 4 days ago | parent | next

FP16+ doesn't really matter for local LLM inference; no one can run reasonably big models at FP16. The models are usually quantized to 8 or 4 bits, where the 5090 again demolishes the W7900 by having a multiple of its peak TOPS.
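
A back-of-the-envelope sketch of why quantization is the deciding factor for fitting models in VRAM; the 70B model size is an illustrative assumption, and the figures ignore KV cache and activations, so real usage runs higher:

    # Rough VRAM needed just for the weights at different quantizations.
    # Ignores KV cache and activation memory (assumption: weights dominate).
    def weight_gb(params_billion, bits):
        return params_billion * 1e9 * bits / 8 / 1e9  # params * bytes/param

    for bits in (16, 8, 4):
        print(f"70B model at {bits}-bit: ~{weight_gb(70, bits):.0f} GB")
    # 16-bit: ~140 GB (no single consumer card), 8-bit: ~70 GB, 4-bit: ~35 GB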

adgjlsfhk1 4 days ago | parent

With 48 GB of VRAM you could run a 20B model at FP16. It won't be a better GPU for everything, but it definitely beats a 5090 for some use cases. It's also a generation old, and the newer RX 9070 seems like it should be pretty competitive with a 5090 from a flops perspective, so a workstation model with 32 GB of VRAM and a less cut-back core would be interesting.
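
Checking that arithmetic with a minimal sketch; the KV-cache headroom is a hypothetical ballpark, not a spec:

    # 20B params * 2 bytes (FP16) = 40 GB of weights, which fits in the
    # W7900's 48 GB but not in the 5090's 32 GB. The KV-cache budget
    # below is an illustrative assumption, not a measured figure.
    weights_gb = 20e9 * 2 / 1e9            # 40 GB of FP16 weights
    kv_cache_gb = 4                         # assumed headroom for context
    print(weights_gb + kv_cache_gb <= 48)   # True  -> fits on a 48 GB card
    print(weights_gb + kv_cache_gb <= 32)   # False -> doesn't fit on a 5090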

lostmsu 4 days ago | parent | prev

The FP16 bit is very wrong re: LLMs. The 5090 has ~3.5x the FP16 throughput for LLMs: 400+ vs ~120 TFLOPS.
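
The ratio works out if you compare the 5090's dense FP16 tensor throughput against the W7900's FP16 rate; both figures below are assumptions from published specs rather than numbers confirmed in the thread:

    # Assumed specs: RTX 5090 dense FP16 tensor ~419 TFLOPS,
    # W7900 FP16 ~122.6 TFLOPS.
    print(f"{419 / 122.6:.1f}x")  # ~3.4x, roughly the "3.5x" claimed above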