jasonwatkinspdx 3 days ago
This is still using a Tomasulo-like algorithm; it's just been shifted from the backend to the front end. And instructions don't lock up on an L1 miss. Instead, the results of that instruction are marked as poisoned, and the front end replays their micro-ops forward in the execution stream once the L1 miss is resolved. As the article points out, this replay is likely to fill otherwise unused execution slots on general-purpose code, as OoO CPUs rarely sustain their full execution width. It's a smart idea, and has some parallels to the Mill CPU design. The backend is conceptually similar to a statically scheduled VLIW core, and the front end races ahead using its matrix scorecard, trying to queue up as much work for the backend as it can despite unpredictable latencies.
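To make the poison-and-replay idea concrete, here is a toy sketch in Python. It is not the actual hardware mechanism: `MicroOp`, `run_with_replay`, and the `l1_miss` flag are hypothetical names invented for illustration, and the model assumes that both the missing load and everything depending on it get replayed, in program order, once the miss resolves. It ignores register renaming, timing, and execution width entirely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MicroOp:
    name: str
    dest: str
    srcs: tuple = ()
    l1_miss: bool = False  # hypothetical flag: this load misses in L1


def run_with_replay(program):
    """Return (first_pass, replayed) lists of micro-op names."""
    poisoned = set()    # registers whose values are not valid yet
    first_pass = []     # micro-ops that complete on the first pass
    replay_queue = []   # micro-ops to re-issue once the miss resolves

    for op in program:
        if op.l1_miss or any(src in poisoned for src in op.srcs):
            # Result is unusable for now: poison the destination and queue
            # the micro-op for replay instead of stalling everything behind it.
            poisoned.add(op.dest)
            replay_queue.append(op)
        else:
            # A valid write clears any earlier poison on that register.
            poisoned.discard(op.dest)
            first_pass.append(op.name)

    # Miss resolved (assumed): replay queued micro-ops in program order.
    replayed = [op.name for op in replay_queue]
    return first_pass, replayed


program = [
    MicroOp("ld", "r1", ("r0",), l1_miss=True),  # load that misses in L1
    MicroOp("add", "r2", ("r1", "r0")),          # depends on the load
    MicroOp("mul", "r3", ("r0", "r0")),          # independent work
    MicroOp("sub", "r4", ("r2", "r3")),          # depends transitively
]
print(run_with_replay(program))
```

In this sketch the independent `mul` completes on the first pass, while `ld`, `add`, and `sub` are deferred and replayed, which is the sense in which the miss does not lock up the pipeline.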
quantummagic 3 days ago
> Mill CPU design

There were some fascinating concepts being explored in that project. It's a shame nothing came of it.