While the GP might be technically wrong in a narrow sense, GPUs are built for FP, and that's what you want to be doing if you're using them as accelerators.

mathisfun123 2 hours ago | parent [-]

You don't know what you're talking about: an enormous amount of TOPS now runs through quantized (read: integer) kernels. Many GPUs don't even have FP64, or even FP32, support.

jmalicki an hour ago | parent [-]

EDIT: I was completely wrong. I have mostly worked with GGUF and related quantizations that are LUTs. Thank you for correcting me.

mathisfun123 an hour ago | parent [-]

> The quantized integer kernels aren't running true integer multiplication, the quantization is its own thing, they're basically enums not integers

ELI-a-GPU-compiler-engineer-working-at-a-major-vendor (because I am). I.e., I can pull up the design docs for our ALUs and literally see that you're wrong.
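
For anyone skimming the subthread, here is a rough sketch of the distinction being argued. It is not either commenter's code and not any vendor's kernel. The function names, the 1/64 scale, and the five-entry codebook are all made up for illustration. In the integer scheme the quantized values are real integers that get multiplied and accumulated as integers, with one rescale at the end. In the lookup-table scheme the stored codes are only indices, and the arithmetic happens on the floats they point to.

```python
# Illustrative sketch only: names, scales, and the codebook are hypothetical.

def quantize_int8(values, scale):
    """Symmetric quantization: real value ~= scale * q, q clamped to [-127, 127]."""
    return [max(-127, min(127, round(v / scale))) for v in values]


def int_dot(qa, qb, scale_a, scale_b):
    """Integer scheme: multiply-accumulate on integers, rescale once at the end."""
    acc = 0  # stands in for a wide integer accumulator
    for x, y in zip(qa, qb):
        acc += x * y  # a genuine integer multiply-add
    return acc * scale_a * scale_b


def lut_dot(codes_a, codes_b, codebook):
    """LUT scheme: codes are just indices; the math runs on looked-up floats."""
    return sum(codebook[i] * codebook[j] for i, j in zip(codes_a, codes_b))


a = [0.5, -1.0, 0.25]
b = [1.0, 0.5, -0.5]

# Integer path: quantize both vectors with the same (assumed) scale.
scale = 1 / 64
qa = quantize_int8(a, scale)
qb = quantize_int8(b, scale)
int_result = int_dot(qa, qb, scale, scale)

# LUT path: each element is stored as an index into a small float codebook.
codebook = [-1.0, -0.5, 0.25, 0.5, 1.0]
codes_a = [3, 0, 2]  # indices of 0.5, -1.0, 0.25
codes_b = [4, 3, 1]  # indices of 1.0, 0.5, -0.5
lut_result = lut_dot(codes_a, codes_b, codebook)

print(qa, qb, int_result, lut_result)
```

Both paths recover the same dot product here because the toy values are exactly representable. The point is only where the multiply happens: on the integers themselves in `int_dot`, on dequantized floats in `lut_dot`.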