Remix.run Logo
kevmo314 an hour ago

> This is not the first time we can see Nvidia taking shortcuts to achieve maximum performance of their GPUs

Why is implementing it correctly not performant? For context I have no idea how rounding is typically implemented anyways.