Remix clone Hacker News


	▲	imtringued 2 days ago
		Assuming a parallel programming language and an SMT-aware compiler, the CPU could just switch to another block of static instructions while it is waiting.
	▲	namibj 2 days ago
		You mean like, e.g., Nvidia Maxwell? (There's decent third-party documentation from Nervana Systems, written when they squeezed all they could out of f32 dense matrix multiply and were, at the time, substantially faster than Nvidia's cuBLAS library. This is by no means exclusive to that architecture, though.)
	▲	tliltocatl 2 days ago
		> Assuming a parallel programming language

		Assuming a parallelizable workload, which is often not the case.