In the compute-bound regime (high batch size), yes, but at low batch sizes it could improve throughput.