nzach 7 hours ago
I went in expecting to find 'branch prediction'[0] as the answer, but apparently things are even more complex nowadays.
[0] - https://stackoverflow.com/questions/11227809/why-is-conditio...
gruez 5 hours ago
> I went in expecting to find 'branch prediction'[0]
GPUs do branch prediction? I thought they didn't bother, and instead tried to minimize wasted effort by running large numbers of concurrent threads?
kangalioo 6 hours ago
To be fair, the culprit in the article is _less complex_ than branch prediction: "with random data, bits are flipped often, and bit flips in transistors inherently draw power" takes less mental gymnastics than "with random data, the CPU fails to predict the future, causing redundant speculative execution".
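The bit-flip explanation can be illustrated with a rough sketch. This is a toy model, not a power measurement: it assumes switching activity is proportional to the number of bits that differ between consecutive 32-bit words, and the helper name `count_bit_flips` is made up for illustration.

```python
import random

def count_bit_flips(words):
    """Total number of bit positions that change between consecutive words."""
    return sum(bin(a ^ b).count("1") for a, b in zip(words, words[1:]))

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 10_000

# Assumption: 32-bit words stand in for whatever the hardware actually moves.
random_words = [rng.getrandbits(32) for _ in range(n)]
zero_words = [0] * n

random_flips = count_bit_flips(random_words)
zero_flips = count_bit_flips(zero_words)

# Uniform random words toggle about half of their 32 bits per transition;
# an all-zero stream toggles none.
print("random data, average flips per transition:", random_flips / (n - 1))
print("all-zero data, total flips:", zero_flips)
```

Under this model the constant stream produces no toggles at all, while the random stream flips roughly half its bits on every step, which is the direction of the effect described above.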
bee_rider 5 hours ago
I expected a “torch is smart enough to keep track of cases where it just initialized the C in C <= A*B+C to zero, avoiding the add” type situation, but I was wrong.
ryanisnan 5 hours ago
That's exactly what I thought.