Remix clone Hacker News

	▲	fc417fc802 6 days ago
		> a common optimizing operation for most architectures is to trade calculation for memory (unroll loops, lookup tables...)

		That really depends. A cache miss adds eons of latency and is therefore far worse than a few extra cycles of work, although depending on the workload the reorder buffer might hide the penalty entirely. Memory bandwidth as a whole is also incredibly scarce relative to CPU clock cycles. The only time it's a sure win is when you trade instruction count for data that stays in registers or hits in L1 cache, but those are themselves very scarce resources.
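
		To put rough numbers on that trade-off, here is a back-of-the-envelope sketch in Python. The latencies are assumed, illustrative values (about 4 cycles for an L1 hit, about 200 for a miss to main memory, 20 to recompute the value), not measurements of any particular CPU, and the function names are made up for this example. The model also ignores out-of-order execution, so it overstates the cost of a miss whenever the reorder buffer can hide it.

```python
def expected_lookup_cycles(hit_rate, hit_cycles, miss_cycles):
    """Average cycles per table lookup under a simple hit-or-miss model."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be in [0, 1]")
    return hit_rate * hit_cycles + (1.0 - hit_rate) * miss_cycles


def break_even_hit_rate(hit_cycles, miss_cycles, compute_cycles):
    """Hit rate at which a lookup costs the same as recomputing.

    A result above 1.0 means the table never wins under this model;
    a result at or below 0.0 means it always wins.
    """
    if miss_cycles <= hit_cycles:
        raise ValueError("miss_cycles must exceed hit_cycles")
    return (miss_cycles - compute_cycles) / (miss_cycles - hit_cycles)


# Assumed, illustrative latencies in cycles (not measured on real hardware).
HIT, MISS, COMPUTE = 4, 200, 20

if __name__ == "__main__":
    for h in (0.50, 0.90, 0.99):
        cost = expected_lookup_cycles(h, HIT, MISS)
        print(f"hit rate {h:.2f}: lookup ~{cost:.1f} cycles, compute {COMPUTE}")
    print("break-even hit rate:", break_even_hit_rate(HIT, MISS, COMPUTE))
```

		Under these assumptions the table has to hit in cache the large majority of the time before it beats recomputing, which is why a lookup table that looks like an obvious win on paper can lose once its working set no longer fits in L1.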