DougBTX 3 hours ago
> GPUs put the associativity of the sums in matrix multiplications in arbitrary order

That's user-controlled too, not an inherent property of GPUs: https://docs.pytorch.org/docs/2.12/generated/torch.use_deter...
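For anyone wondering why the order of the sums matters at all: floating-point addition is not associative, so adding the same numbers in a different order can round differently. Here is a minimal pure-Python sketch of that effect. It runs on the CPU and says nothing about how any particular GPU kernel schedules its reductions; `naive_sum` is a hypothetical helper written for this example, not a PyTorch function.

```python
import math


def naive_sum(values):
    """Add the values left to right, rounding after every addition."""
    total = 0.0
    for v in values:
        total += v
    return total


# The same three numbers, summed in two different orders.
print(naive_sum([0.1, 0.2, 0.3]))  # 0.6000000000000001
print(naive_sum([0.3, 0.2, 0.1]))  # 0.6

# A more extreme case: the 1.0 is lost when it is added to 1e16 first.
print(naive_sum([1e16, 1.0, -1e16]))  # 0.0
print(naive_sum([1e16, -1e16, 1.0]))  # 1.0

# math.fsum tracks the rounding error, so its result does not depend on order.
print(math.fsum([1e16, 1.0, -1e16]))  # 1.0
```

A parallel reduction that does not fix the order in which partial sums are combined can hit the same kind of difference from one run to the next, which is the behaviour a determinism setting is meant to pin down.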
vbarrielle 2 hours ago
The matrix multiplication is only deterministic for sparse-dense products under these settings:

> torch.bmm() when called on sparse-dense CUDA tensors

And it's not listed under the operations that raise an exception otherwise, so I'm not sure the docs promise that dense-dense matrix-matrix products are deterministic.