Those 10 instructions are for one access, not for a tight loop. A tight loop could be done with a much more complex macro that iterates separately in each segment, amortizing the overhead.