kingstnap, 3 days ago
By default, CUDA isn't deterministic because of thread scheduling. The main difference comes from the order in which reductions get rounded: floating-point addition isn't associative, so summing the same values in a different order can round differently. It only makes a small difference, unless you have an unstable floating-point algorithm. But if you have an unstable floating-point algorithm on a GPU at low precision, you were doomed from the start.
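To make the reduction-order point concrete, here is a minimal sketch in plain Python rather than CUDA, using float64 instead of a low-precision type. The helper names `sequential_sum` and `pairwise_sum` are hypothetical, and the two fixed orders are only an assumption standing in for two different thread schedules; a real GPU reduction will pick its own order.

```python
import math


def sequential_sum(values):
    """Left-to-right accumulation, like a single thread walking the array."""
    total = 0.0
    for v in values:
        total += v
    return total


def pairwise_sum(values):
    """Tree-style reduction: add neighbours level by level.

    This is only an assumed stand-in for how a parallel reduction
    might group its partial sums.
    """
    vals = list(values)
    if not vals:
        return 0.0
    while len(vals) > 1:
        nxt = [vals[i] + vals[i + 1] for i in range(0, len(vals) - 1, 2)]
        if len(vals) % 2:  # carry an unpaired last element to the next level
            nxt.append(vals[-1])
        vals = nxt
    return vals[0]


# Well-conditioned input: all terms positive and of similar magnitude.
well_conditioned = [0.1 * (i + 1) for i in range(1000)]
a = sequential_sum(well_conditioned)
b = pairwise_sum(well_conditioned)
print("relative gap:", abs(a - b) / abs(a))  # tiny, if nonzero at all

# Ill-conditioned input: huge terms cancel, so the order decides the answer.
ill_conditioned = [1e16, 1.0, -1e16, 1.0]
print(sequential_sum(ill_conditioned))  # 1.0
print(pairwise_sum(ill_conditioned))    # 0.0
print(math.fsum(ill_conditioned))       # 2.0 (correctly rounded sum)
```

On the well-conditioned input the two orders agree to within rounding noise, which is the "small difference" case. On the cancelling input the result swings between 0.0, 1.0, and 2.0 depending purely on grouping, which is the "doomed from the start" case, and it would only get worse at lower precision.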